Velma-2 offers three speech-to-text endpoints. Pick the one that matches your latency, language, and feature needs.
| | Batch (multilingual) | Batch English VFast | Streaming |
|---|---|---|---|
| Use case | Transcription with rich metadata | Fast English-only transcription | Real-time transcription |
| Protocol | HTTP POST | HTTP POST | WebSocket |
| Languages | Multilingual | English only | Multilingual |
| Speaker diarization | | | |
| Emotion / accent detection | | | |
| PII/PHI tagging | | | |
For a side-by-side comparison with the other Velma-2 capabilities, see the "Which API should I use?" page.

Authentication

Batch endpoints use the X-API-Key header. Streaming uses an api_key query parameter at connection time. See Authentication and rate limits.
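
The two authentication styles can be illustrated with a minimal Python sketch. Only the X-API-Key header name and the api_key query parameter name come from this page. The URLs and the helper names batch_headers and streaming_url are hypothetical placeholders, not part of the Velma-2 API, so substitute the real endpoint URLs from the API reference.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical placeholder URLs (assumptions, not real Velma-2 endpoints).
BATCH_URL = "https://api.example.com/batch"
STREAMING_URL = "wss://api.example.com/streaming"


def batch_headers(api_key: str) -> dict[str, str]:
    """Headers for a batch (HTTP POST) request: the key goes in X-API-Key."""
    return {"X-API-Key": api_key}


def streaming_url(base_url: str, api_key: str) -> str:
    """WebSocket URL with the key appended as the api_key query parameter."""
    parts = urlsplit(base_url)
    # Preserve any query parameters already present on the base URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("api_key", api_key))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Pass the result of batch_headers to whatever HTTP client sends the POST, and open the WebSocket connection against the string returned by streaming_url. Because the streaming key travels in the URL, it is worth keeping that URL out of logs.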