It analyzes audio signals alongside the words to surface behaviors and risks in voice conversations, such as fraud, customer churn, and compliance violations. Configure it with 150+ pre-built behaviors or define your own in plain language. The configuration is either the built-in default or a JSON BatchConfig, and behaviors can be pulled from a catalog of ready-made presets.
| | Batch | Streaming |
|---|---|---|
| Use case | Analyze a complete recording | Analyze a live conversation in real time |
| Protocol | HTTP POST (multipart upload) | WebSocket |
| Configuration | config form field: default or a JSON BatchConfig | First text frame: default or a JSON BatchConfig |
| Output | Full BatchResponse after processing | Discrete events emitted as results are produced |
| Max file size | 100 MB | Not applicable (streaming) |
Transcription + diarization
Conversation-type & participant-role inference
Behavior detection (with presets)
Topics, topic sentiment, summary
For a side-by-side comparison with the other Velma-2 capabilities, see Which API should I use?.

Configuration

Both endpoints take the same configuration: either the literal string default to use the built-in configuration, or a JSON BatchConfig describing the conversation types, participant roles, behaviors, STT options, and which aggregate outputs (topics, sentiments, summary) to produce. The full BatchConfig schema is rendered on the Batch reference. Behaviors can be specified inline or referenced from a catalog of presets using the preset:<identifier> syntax. List the available presets with List behavior presets.
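
As a rough illustration of the two configuration forms described above, the following sketch serializes either the literal string default or a JSON object, and builds preset references in the preset:<identifier> syntax. The key names in example_config ("behaviors", "summary"), the inline behavior text, and the preset identifier "example-preset" are placeholders assumed for illustration only; the actual field names are defined by the BatchConfig schema on the Batch reference.

```python
import json


def preset_ref(identifier: str) -> str:
    """Build a preset reference using the documented preset:<identifier> syntax."""
    if not identifier:
        raise ValueError("preset identifier must be a non-empty string")
    return f"preset:{identifier}"


def encode_config(config="default") -> str:
    """Return the string to send as the config form field (Batch)
    or as the first text frame (Streaming)."""
    if config == "default":
        # The literal string selects the built-in configuration.
        return "default"
    if not isinstance(config, dict):
        raise TypeError("config must be the string 'default' or a dict")
    return json.dumps(config)


# Hypothetical BatchConfig: these keys and values are NOT taken from the
# real schema; they only show the mix of preset and inline behaviors.
example_config = {
    "behaviors": [
        preset_ref("example-preset"),            # placeholder identifier
        "Customer threatens to cancel service",  # assumed inline form
    ],
    "summary": True,                             # assumed output toggle
}

payload = encode_config(example_config)
```

The same payload string would be usable with both endpoints, since they take the same configuration.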

Authentication

Batch uses the X-API-Key header. Streaming uses an api_key query parameter at connection time. See Authentication and rate limits.
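
The difference between the two authentication styles can be sketched as follows. Only the X-API-Key header name and the api_key query parameter come from the text above; the URLs are hypothetical placeholders, and the helper names are made up for this example. Nothing is sent over the network here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical endpoints; substitute the URLs from the API reference.
BATCH_URL = "https://api.example.com/v1/batch"
STREAM_URL = "wss://api.example.com/v1/stream"


def batch_headers(api_key: str) -> dict:
    """Headers for a Batch request: the key travels in X-API-Key."""
    return {"X-API-Key": api_key}


def streaming_url(base_url: str, api_key: str) -> str:
    """URL for a Streaming connection: the key travels as the
    api_key query parameter, supplied when the connection is opened."""
    parts = urlsplit(base_url)
    # Preserve any query parameters already present on the base URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("api_key", api_key))
    return urlunsplit(parts._replace(query=urlencode(query)))


headers = batch_headers("YOUR_API_KEY")
ws_url = streaming_url(STREAM_URL, "YOUR_API_KEY")
```

Because the Streaming key is part of the URL, it may be worth keeping connection URLs out of logs.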