POST /api/velma-2-stt-batch-english-vfast
Transcribe English audio file
curl --request POST \
  --url https://modulate-developer-apis.com/api/velma-2-stt-batch-english-vfast \
  --header 'Content-Type: multipart/form-data' \
  --header 'X-API-Key: <api-key>' \
  --form 'upload_file=@example-file.wav;type=audio/wav'
{
  "text": "Good morning, everyone.",
  "duration_ms": 2000
}

Authorizations

X-API-Key
string
header
required

API key for authentication. Your API key must be included in the X-API-Key header for all requests. API keys are tied to your organization and determine your access to models and usage limits.
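As a minimal sketch of the authentication requirement, the snippet below builds (but does not send) a POST request that carries the key in the X-API-Key header, using only Python's standard library. The endpoint URL is copied from the curl example on this page; the helper name `build_request` is hypothetical and not part of any official client.

```python
import urllib.request

# Endpoint URL taken from the curl example on this page.
ENDPOINT = "https://modulate-developer-apis.com/api/velma-2-stt-batch-english-vfast"


def build_request(body: bytes, content_type: str, api_key: str) -> urllib.request.Request:
    """Build an authenticated POST request without sending it.

    `build_request` is a hypothetical helper name used for illustration.
    """
    if not api_key:
        # The documentation states the key is required on every request.
        raise ValueError("an API key is required for every request")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",
        headers={
            "X-API-Key": api_key,
            # For multipart uploads this value must include the boundary.
            "Content-Type": content_type,
        },
    )
```

Passing the resulting object to `urllib.request.urlopen` would send it. Reading the key from an environment variable rather than hard-coding it keeps it out of source control.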

Body

multipart/form-data
upload_file
file
required

Audio file to transcribe. Supported formats: AAC, AIFF, FLAC, MP3, MP4, MOV, OGG, Opus, WAV, WebM. Maximum file size: 100 MB. Empty files are rejected.

Content-Type Requirement: The MIME type for this part SHOULD match the audio format being uploaded. Using application/octet-stream is strongly discouraged: the server uses the content type to select the audio decoder, so a generic binary type may cause intermittent decoding failures or empty responses.

Correct MIME types by format:

  • AAC: audio/aac
  • AIFF: audio/aiff
  • FLAC: audio/flac
  • MP3: audio/mpeg
  • MP4: video/mp4
  • MOV: video/quicktime
  • OGG: audio/ogg
  • Opus: audio/opus
  • WAV: audio/wav
  • WebM: video/webm
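The format list above can be turned into a small client-side check that picks the part's Content-Type from the file name and rejects uploads the server would refuse. This is a sketch under stated assumptions: the file extensions are guesses (the documentation names formats, not extensions), the 100 MB limit is read as decimal megabytes, and `mime_type_for` and `encode_upload` are hypothetical helper names.

```python
import uuid
from pathlib import Path

# MIME types copied from the list above. The extensions are assumptions.
MIME_BY_EXTENSION = {
    ".aac": "audio/aac",
    ".aiff": "audio/aiff",
    ".flac": "audio/flac",
    ".mp3": "audio/mpeg",
    ".mp4": "video/mp4",
    ".mov": "video/quicktime",
    ".ogg": "audio/ogg",
    ".opus": "audio/opus",
    ".wav": "audio/wav",
    ".webm": "video/webm",
}

# Assumption: "100MB" is read as 100 * 10^6 bytes, the stricter interpretation.
MAX_UPLOAD_BYTES = 100 * 1000 * 1000


def mime_type_for(filename: str) -> str:
    """Return the documented MIME type for a file name, by extension."""
    suffix = Path(filename).suffix.lower()
    if suffix not in MIME_BY_EXTENSION:
        raise ValueError(f"unsupported audio format: {suffix or filename!r}")
    return MIME_BY_EXTENSION[suffix]


def encode_upload(filename: str, audio: bytes) -> tuple[bytes, str]:
    """Return (body, content_type_header) for a one-part multipart upload."""
    if not audio:
        raise ValueError("empty files are rejected")
    if len(audio) > MAX_UPLOAD_BYTES:
        raise ValueError("file exceeds the 100 MB limit")
    boundary = uuid.uuid4().hex
    # Note: file names containing double quotes would need escaping here.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="upload_file"; '
        f'filename="{Path(filename).name}"\r\n'
        f"Content-Type: {mime_type_for(filename)}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + audio + tail, f"multipart/form-data; boundary={boundary}"
```

The returned header value carries the boundary, so it should be sent as the request's Content-Type unchanged. The per-part Content-Type inside the body is the one the server is described as using for decoder selection.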

Response

Transcription completed successfully

text
string
required

The complete transcribed text from the audio file, with automatic capitalization and punctuation applied.

Example:

"Hello, this is a sample transcription. It includes proper capitalization and punctuation."

duration_ms
integer<int32>
required

The total duration of the processed audio in milliseconds. This value represents the actual audio duration and is used for usage tracking and billing purposes.

Required range: x >= 0

Example:

12500
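The two response fields described above can be checked on the client before use. The following sketch parses a response body and enforces the documented constraints (both fields required, `duration_ms` a non-negative int32). The `Transcription` class and `parse_transcription` function are hypothetical names, not part of the API.

```python
import json
from dataclasses import dataclass

INT32_MAX = 2**31 - 1  # upper bound implied by the integer<int32> type


@dataclass(frozen=True)
class Transcription:
    text: str
    duration_ms: int

    @property
    def duration_seconds(self) -> float:
        """Audio duration in seconds, derived from duration_ms."""
        return self.duration_ms / 1000


def parse_transcription(raw: str) -> Transcription:
    """Parse a successful response body and validate both required fields."""
    payload = json.loads(raw)
    if not isinstance(payload, dict):
        raise ValueError("response body must be a JSON object")
    text = payload.get("text")
    duration_ms = payload.get("duration_ms")
    if not isinstance(text, str):
        raise ValueError("'text' is required and must be a string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(duration_ms, bool) or not isinstance(duration_ms, int):
        raise ValueError("'duration_ms' is required and must be an integer")
    if not 0 <= duration_ms <= INT32_MAX:
        raise ValueError("'duration_ms' must be a non-negative int32")
    return Transcription(text=text, duration_ms=duration_ms)
```

For the example response on this page, `parse_transcription` yields a `Transcription` whose `duration_seconds` is 2.0.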