# Modulate

> Build with Modulate's Velma-2 models: real-time speech-to-text, synthetic voice detection, and PII/PHI redaction at scale.

## Docs

- [PII/PHI Redaction Batch](https://docs.modulate.ai/api-reference/redaction/batch.md): Transcribe a pre-recorded audio file and redact PII/PHI from both the transcript text and the returned audio.
- [PII/PHI Redaction](https://docs.modulate.ai/api-reference/redaction/overview.md): Velma-2 PII/PHI redaction — transcribe audio, replace sensitive spans with entity-type tags, and silence the matching audio ranges.
- [PII/PHI Redaction Streaming](https://docs.modulate.ai/api-reference/redaction/streaming.md): Real-time PII/PHI redaction over WebSocket — receive a redacted transcript and a redacted MP3 clip per utterance.
- [Speech-to-Text Transcription Batch Multilingual](https://docs.modulate.ai/api-reference/stt/batch.md): Multilingual batch transcription with automatic language detection, speaker diarization, emotion and accent detection, and PII/PHI tagging.
- [Speech-to-Text Transcription Batch English VFast](https://docs.modulate.ai/api-reference/stt/batch-english-vfast.md): Fast English-only batch transcription. Trades enrichment features for the lowest possible turnaround.
- [Speech-to-Text Transcription](https://docs.modulate.ai/api-reference/stt/overview.md): Velma-2 speech-to-text APIs: multilingual batch transcription, fast English-only batch, and real-time streaming over WebSocket.
- [Speech-to-Text Transcription Streaming Multilingual](https://docs.modulate.ai/api-reference/stt/streaming.md): Real-time speech-to-text over WebSocket, with optional speaker diarization, emotion detection, accent detection, and PII/PHI tagging.
- [Deepfake Detection Batch](https://docs.modulate.ai/api-reference/svd/batch.md): Detect synthetic (AI-generated) voice in a pre-recorded audio file. Returns per-frame deepfake scores.
- [Deepfake Detection](https://docs.modulate.ai/api-reference/svd/overview.md): Velma-2 synthetic voice detection — deepfake detection on pre-recorded files (batch) or live audio (streaming).
- [Deepfake Detection Streaming](https://docs.modulate.ai/api-reference/svd/streaming.md): Real-time deepfake detection over WebSocket, with per-frame verdicts and confidence scores delivered as analysis windows complete.
- [FAQ](https://docs.modulate.ai/faq.md): Frequently asked questions about authentication, models, audio formats, pricing, rate limits, streaming, errors, privacy, and support.
- [Audio formats and preprocessing](https://docs.modulate.ai/guides/audio-formats.md): Supported audio formats across all Velma-2 endpoints, with guidance on format selection, conversion, and the special requirements of the streaming SVD endpoint.
- [Authentication and rate limits](https://docs.modulate.ai/guides/authentication.md): How to authenticate Modulate API requests, what rate limits apply, and how to handle auth and rate limit errors.
- [Code examples by language](https://docs.modulate.ai/guides/code-examples.md): Working integration patterns in cURL, Python (sync, async, concurrent), and JavaScript / Node.js with WebSocket support.
- [STT enrichment features](https://docs.modulate.ai/guides/stt-enrichment-features.md): Optional metadata you can request alongside transcription — speaker diarization, emotion, accent, PII/PHI tagging, and synthetic voice scoring.
- [How synthetic voice detection works](https://docs.modulate.ai/guides/synthetic-voice-detection.md): The mechanics behind Velma-2's synthetic voice detection — windowing, silence trimming, confidence scoring, and the no-content verdict.
- [Troubleshooting](https://docs.modulate.ai/guides/troubleshooting.md): Common errors organized by category, with causes and fixes — auth, rate limits, audio validation, timeouts, and server errors.
- [Which API should I use?](https://docs.modulate.ai/guides/which-api.md): Pick the right Velma-2 endpoint based on your latency needs, language requirements, audio format constraints, and required features.
- [Modulate developer docs](https://docs.modulate.ai/index.md): Build with the Velma-2 API — real-time speech-to-text, synthetic voice detection, and PII/PHI redaction at scale.
- [Quick start](https://docs.modulate.ai/quickstart.md): Get to your first successful API call for each Velma-2 model in about 10 minutes per section.
- [Support](https://docs.modulate.ai/support.md): How to reach the Modulate team for technical questions, bug reports, feature requests, or limit increases.

## OpenAPI Specs

- [velma-2-synthetic-voice-detection-batch-openapi](https://docs.modulate.ai/api/velma-2-synthetic-voice-detection-batch-openapi.yaml)
- [velma-2-stt-batch-openapi](https://docs.modulate.ai/api/velma-2-stt-batch-openapi.yaml)
- [velma-2-stt-batch-english-vfast-openapi](https://docs.modulate.ai/api/velma-2-stt-batch-english-vfast-openapi.yaml)
- [velma-2-pii-phi-redaction-batch-openapi](https://docs.modulate.ai/api/velma-2-pii-phi-redaction-batch-openapi.yaml)
- [openapi](https://docs.modulate.ai/api/openapi.json)

## AsyncAPI Specs

- [velma-2-synthetic-voice-detection-streaming-openapi](https://docs.modulate.ai/api/velma-2-synthetic-voice-detection-streaming-openapi.yaml)