Velma-2 music detection classifies audio as music, speech, or neither, returning frame-level probabilities across the clip.

| | Batch | Streaming |
|---|---|---|
| Use case | Classify a complete audio file | Real-time frame-by-frame classification |
| Protocol | HTTP POST | WebSocket |
| Output | Full response after processing | Frames emitted progressively as audio arrives |
| Latency | Proportional to file length | ~192 ms per frame |

For a side-by-side comparison with the other Velma-2 capabilities, see "Which API should I use?"

Authentication

Batch uses the X-API-Key header. Streaming uses an api_key query parameter at connection time. See Authentication and rate limits.
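
As a minimal sketch of the two credential placements described above, the Python helpers below build the `X-API-Key` header for a batch request and append the `api_key` query parameter to a streaming URL. The endpoint URLs and the function names `batch_headers` and `streaming_url` are hypothetical placeholders, not part of the documented API; only the header name and the query parameter name come from this page.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Hypothetical endpoints for illustration only; use the real URLs
# from the API reference.
BATCH_URL = "https://api.example.com/music-detection"
STREAM_URL = "wss://api.example.com/music-detection/stream"


def batch_headers(api_key: str) -> dict[str, str]:
    """Return headers for a batch HTTP POST: the key goes in X-API-Key."""
    return {"X-API-Key": api_key}


def streaming_url(base_url: str, api_key: str) -> str:
    """Return a WebSocket URL carrying the key as the api_key query parameter.

    Any query string already present on base_url is preserved.
    """
    parts = urlsplit(base_url)
    query = urlencode({"api_key": api_key})  # URL-encodes the key value
    if parts.query:
        query = parts.query + "&" + query
    return urlunsplit(parts._replace(query=query))


if __name__ == "__main__":
    print(batch_headers("YOUR_API_KEY"))
    print(streaming_url(STREAM_URL, "YOUR_API_KEY"))
```

Because the streaming key travels in the URL, it may end up in proxy or server logs; keeping it out of client-side code and shared logs is a reasonable precaution.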