Speech-to-Text Transcription Batch English VFast

POST /api/velma-2-stt-batch-english-vfast

Transcribe an English audio file

curl --request POST \
  --url https://platform.modulate.ai/api/velma-2-stt-batch-english-vfast \
  --header 'Content-Type: multipart/form-data' \
  --header 'X-API-Key: <api-key>' \
  --form upload_file='@example-file'

{
  "text": "Hello, how are you doing today?",
  "duration_ms": 2000
}
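
The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library, not an official client: the helper names `build_multipart` and `transcribe`, the 60-second timeout, and the `application/octet-stream` part type are assumptions made for illustration. The URL, the `X-API-Key` header, and the `upload_file` field name come from the example above.

```python
import json
import os
import urllib.request
import uuid
from typing import Tuple

API_URL = "https://platform.modulate.ai/api/velma-2-stt-batch-english-vfast"


def build_multipart(field_name: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content_type)."""
    boundary = uuid.uuid4().hex  # random token that delimits the form parts
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n"  # assumed generic part type
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, api_key: str, timeout: float = 60.0) -> dict:
    """POST the file at `path` as `upload_file` and return the decoded JSON body."""
    with open(path, "rb") as audio_file:
        audio = audio_file.read()
    body, content_type = build_multipart("upload_file", os.path.basename(path), audio)
    request = urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Content-Type": content_type, "X-API-Key": api_key},
    )
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as `transcribe("example.wav", os.environ["MODULATE_API_KEY"])` would then return a dictionary with the `text` and `duration_ms` keys; the environment variable name is hypothetical.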

Authorizations

X-API-Key (string, header, required)

API key used for authentication and usage tracking.

Body

multipart/form-data

upload_file (file, required)

Audio file to transcribe. Must be non-empty. Supported formats: .aac, .aiff, .flac, .mov, .mp3, .mp4, .ogg, .opus, .wav, .webm. Maximum file size: 100 MB.
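
The constraints above can be checked locally before uploading, which avoids sending a request that is bound to be rejected. This is a sketch under stated assumptions: `check_upload` is a hypothetical helper, the format check looks only at the file extension rather than the actual encoding, and "100 MB" is read conservatively as 100,000,000 bytes because the exact byte limit is not stated here.

```python
import os

# Extensions listed in the upload_file description.
SUPPORTED_EXTENSIONS = {
    ".aac", ".aiff", ".flac", ".mov", ".mp3",
    ".mp4", ".ogg", ".opus", ".wav", ".webm",
}

# Assumption: decimal megabytes. The service may allow 100 * 1024 * 1024 bytes instead.
MAX_UPLOAD_BYTES = 100 * 1000 * 1000


def check_upload(path: str) -> None:
    """Raise ValueError if the file would violate a documented upload constraint."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported format: {extension or '(no extension)'}")
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("file is empty")
    if size > MAX_UPLOAD_BYTES:
        raise ValueError(f"file is {size} bytes; limit is {MAX_UPLOAD_BYTES}")
```

The function returns nothing on success, so it can be called immediately before the upload and allowed to raise.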

Response

Transcription completed successfully.

text (string, required)

The complete transcribed text from the audio file. Text includes automatic capitalization and punctuation. May be an empty string if no speech was recognized.

Example:

"Hello, how are you doing today?"

duration_ms (integer, required)

The total duration of the processed audio in milliseconds.

Required range: x >= 0

Example:

14253
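
A client may want to check a response against the two fields described above before using it. The sketch below is one way to do that; the `Transcription` dataclass and `parse_transcription` function are illustrative names, not part of the API. A missing field surfaces as a `KeyError`, and a wrong type or negative duration as a `ValueError`.

```python
import json
from dataclasses import dataclass


@dataclass
class Transcription:
    text: str          # may be empty if no speech was recognized
    duration_ms: int   # documented range: >= 0


def parse_transcription(raw: str) -> Transcription:
    """Decode a response body and check the documented field types and range."""
    payload = json.loads(raw)
    text = payload["text"]
    duration_ms = payload["duration_ms"]
    if not isinstance(text, str):
        raise ValueError("text must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(duration_ms, bool) or not isinstance(duration_ms, int):
        raise ValueError("duration_ms must be an integer")
    if duration_ms < 0:
        raise ValueError("duration_ms must be >= 0")
    return Transcription(text=text, duration_ms=duration_ms)
```

An empty `text` is accepted on purpose, since the field description allows it when no speech is recognized.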
