Whisper Large V3 (best performance)

Access our most performant Whisper implementations for high-throughput production workloads.

Deploy Whisper Large V3 (best performance) behind an API endpoint in seconds.

Example usage

Transcribe audio files at up to a 1000x real-time factor: 1 hour (3,600 seconds) of audio in under 4 seconds. This setup requires meaningful production traffic to be cost-effective, but at scale it's at least 80% cheaper than OpenAI.

Get in touch with us and we'll work with you to deploy a transcription pipeline customized to your needs. For quick deployments of Whisper suited to shorter audio files and lower traffic volumes, you can deploy Whisper V3 and Whisper V3 Turbo directly from the model library.
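
For long recordings, holding a synchronous HTTP connection open for the whole transcription can be awkward; one option is to queue the request through Baseten's async inference instead. The sketch below is illustrative only: it assumes your deployment exposes the async_predict route with a model_input/webhook_endpoint payload, and the receiver URL is hypothetical.

import os
import requests

model_id = ""  # fill in with your deployed model's ID
baseten_api_key = os.environ["BASETEN_API_KEY"]

# Queue the transcription instead of waiting on the HTTP response.
resp = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/async_predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json={
        "model_input": {
            "whisper_input": {
                "audio": {"url": "https://cdn.baseten.co/docs/production/Gettysburg.mp3"}
            }
        },
        # Hypothetical receiver: results are POSTed here when the job finishes.
        "webhook_endpoint": "https://example.com/webhooks/whisper",
    },
)

# The immediate response holds a request ID, not the transcript itself.
print(resp.json())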

For more details about the inference API, please refer to our documentation.

Input
import requests
import os

# Model ID for production deployment
model_id = ""
# Read secrets from environment variables
baseten_api_key = os.environ["BASETEN_API_KEY"]

# Call model endpoint
resp = requests.post(
    f"https://model-{model_id}.api.baseten.co/production/predict",
    headers={"Authorization": f"Api-Key {baseten_api_key}"},
    json={
        "whisper_input": {
            "audio": {
                "url": "https://cdn.baseten.co/docs/production/Gettysburg.mp3"
            }
        }
    },
)

print(resp.json())
JSON output
{
    "segments": [
        {
            "start_time": 0.768,
            "end_time": 11.520000000000001,
            "text": "four score and seven years ago our fathers brought forth upon this continent a new nation conceived in liberty and dedicated to the proposition that all men are created equal",
            "log_prob": -1.7316513061523438,
            "word_timestamps": []
        }
    ],
    "language_code": "en",
    "language_prob": null
}
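
The response arrives as timestamped segments, and longer files return many of them. As a small follow-on sketch (it reuses resp from the example above and relies only on the schema shown here), you can stitch the segments into a single transcript:

# Continuing from the example above: `resp` is the API response.
data = resp.json()

# Order segments by start time and join their text into one transcript.
segments = sorted(data["segments"], key=lambda s: s["start_time"])
transcript = " ".join(seg["text"].strip() for seg in segments)

# The last segment's end time approximates the transcribed duration.
duration_s = segments[-1]["end_time"] if segments else 0.0

print(f"{duration_s:.1f}s of audio, language: {data['language_code']}")
print(transcript)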

Deploy any model in just a few commands

Avoid getting tangled in complex deployment processes. Deploy best-in-class open-source models and take advantage of optimized serving for your own models.

$ truss init -- example stable-diffusion-2-1-base ./my-sd-truss
$ cd ./my-sd-truss
$ export BASETEN_API_KEY=MdNmOCXc.YBtEZD0WFOYKso2A6NEQkRqTe
$ truss push
INFO Serializing Stable Diffusion 2.1 truss.
INFO Making contact with Baseten 👋 👽
INFO 🚀 Uploading model to Baseten 🚀
Upload progress: 0% | | 0.00G/2.39G
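
What truss push uploads is the directory scaffolded by truss init: packaging config plus a model/model.py exposing a Model class with load and predict hooks. Here is a minimal sketch of that interface, with stand-in logic rather than the real Stable Diffusion pipeline:

# model/model.py -- minimal Truss model skeleton (stand-in logic, not the real pipeline)
class Model:
    def __init__(self, **kwargs):
        # Truss passes configuration and secrets in via kwargs.
        self._pipeline = None

    def load(self):
        # Called once at startup: load weights here so requests stay fast.
        self._pipeline = lambda prompt: f"<image for: {prompt}>"  # placeholder

    def predict(self, model_input):
        # Called per request; model_input is the request's JSON body.
        return {"output": self._pipeline(model_input["prompt"])}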