Documentation Index

Fetch the complete documentation index at: https://docs.modulate.ai/llms.txt

Use this file to discover all available pages before exploring further.

Modulate’s Velma-2 platform delivers production-grade voice AI: multilingual transcription, synthetic-voice detection, and PII/PHI redaction — over both REST and WebSocket.

Quick start

Make your first API call in minutes, with runnable Python examples for every endpoint.

Guides

How-to articles covering audio formats, authentication, enrichment features, and choosing the right API.

API reference

Full request and response schemas for every endpoint — STT batch, streaming, deepfake detection, and redaction.

FAQ

Authentication, pricing, audio requirements, rate limits, privacy, and data handling.

Support

Get help, report a bug, or request a feature — we respond within one business day.

What you can build

Transcription

Multilingual speech-to-text with speaker diarization, emotion, accent, and PII/PHI tagging.

Real-time captions

Stream audio over WebSocket and receive utterances as they’re spoken.

Deepfake detection

Per-frame synthetic-voice detection, available in both batch and streaming modes.

PII/PHI redaction

Replace sensitive spans in transcripts and silence the matching audio ranges.

Voice authentication

Real-time anti-spoofing checks during a voice login flow.

Compliance archives

Shareable recordings with sensitive content removed — text and audio.
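
To make the PII/PHI redaction idea above concrete, here is a minimal local sketch of its two halves: replacing sensitive spans in a transcript and silencing the matching audio ranges. It does not call the Velma-2 API. The `SensitiveSpan` record, its field names, and both helper functions are hypothetical and chosen only for illustration; the real request and response schemas are in the API reference.

```python
from dataclasses import dataclass


@dataclass
class SensitiveSpan:
    """Hypothetical span record; field names are illustrative, not Velma-2's schema."""
    char_start: int  # inclusive character offset into the transcript
    char_end: int    # exclusive character offset into the transcript
    start_s: float   # start of the matching audio, in seconds
    end_s: float     # end of the matching audio, in seconds
    label: str       # category of the sensitive content, e.g. "PHONE"


def redact_transcript(text: str, spans: list[SensitiveSpan]) -> str:
    """Replace each sensitive span in the text with a bracketed label."""
    # Process spans right to left so earlier offsets stay valid after each edit.
    for span in sorted(spans, key=lambda s: s.char_start, reverse=True):
        text = text[:span.char_start] + f"[{span.label}]" + text[span.char_end:]
    return text


def silence_audio(samples: list[int], sample_rate: int,
                  spans: list[SensitiveSpan]) -> list[int]:
    """Return a copy of mono PCM samples with each span's time range zeroed."""
    out = list(samples)
    for span in spans:
        # Convert seconds to sample indices and clamp to the buffer bounds.
        first = max(0, int(span.start_s * sample_rate))
        last = min(len(out), int(span.end_s * sample_rate))
        for i in range(first, last):
            out[i] = 0
    return out
```

Both helpers take the same span list, so the text and the audio are redacted from one source of truth. A real pipeline would more likely operate on encoded audio files than on an in-memory list of samples.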