Velma-2 offers three speech-to-text endpoints. Pick the one that matches your latency, language, and feature needs.

| | Batch (multilingual) | Batch English VFast | Streaming |
|---|---|---|---|
| Use case | Transcription with rich metadata | Fast English-only transcription | Real-time transcription |
| Protocol | HTTP POST | HTTP POST | WebSocket |
| Languages | Multilingual | English only | Multilingual |
| Speaker diarization | ✓ | — | ✓ |
| Emotion / accent detection | ✓ | — | ✓ |
| PII/PHI tagging | ✓ | — | ✓ |
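
The comparison above can be read as a simple decision rule. The sketch below is one possible reading, not an official selection algorithm: the function name `choose_endpoint` and its three boolean parameters are hypothetical, and it assumes that "metadata" covers speaker diarization, emotion/accent detection, and PII/PHI tagging.

```python
def choose_endpoint(realtime: bool, english_only: bool, needs_metadata: bool) -> str:
    """Return the name of the table column that fits the stated needs.

    Hypothetical helper: the parameters are illustrative and do not
    correspond to any documented API option.
    """
    if realtime:
        # Streaming is the only real-time (WebSocket) option in the table.
        return "Streaming"
    if english_only and not needs_metadata:
        # English VFast has no diarization, emotion/accent, or PII/PHI tagging.
        return "Batch English VFast"
    # Everything else falls back to the multilingual batch endpoint.
    return "Batch (multilingual)"


print(choose_endpoint(realtime=False, english_only=True, needs_metadata=False))
```

English-only audio that still needs diarization or PII/PHI tagging falls through to the multilingual batch endpoint, since the table lists those features as unavailable on English VFast.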
Authentication
Batch endpoints use the X-API-Key header. Streaming uses an api_key query parameter at connection time. See Authentication and rate limits.
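
To illustrate the difference between the two authentication styles, here is a minimal sketch using only the Python standard library. The header name `X-API-Key` and the query parameter `api_key` come from the text above; the helper names, the `wss://example.invalid/stream` URL, and the `YOUR_API_KEY` value are placeholders, not real endpoints or credentials. The code only builds the header and URL and makes no network calls.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def batch_headers(api_key: str) -> dict[str, str]:
    """Headers for a batch (HTTP POST) request: the key goes in X-API-Key."""
    return {"X-API-Key": api_key}


def streaming_url(base_url: str, api_key: str) -> str:
    """Return base_url with api_key added as a query parameter.

    Existing query parameters on base_url are preserved.
    """
    parts = urlsplit(base_url)
    query = dict(parse_qsl(parts.query))
    query["api_key"] = api_key
    return urlunsplit(parts._replace(query=urlencode(query)))


# Placeholder URL and key for illustration only.
print(batch_headers("YOUR_API_KEY"))
print(streaming_url("wss://example.invalid/stream", "YOUR_API_KEY"))
```

Because the streaming key travels in the URL, it can end up in connection logs; treating the full WebSocket URL as a secret is a reasonable precaution.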