AI Music Detection

AI music detection determines whether a clip contains AI-generated music. Each window is classified by its vocal and instrumental content, then aggregated into a clip-level primary_verdict of ai-vocal-music, ai-instrumental, or not-ai-music.
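
As a minimal sketch of how a client might act on the clip-level result, the helper below maps the three documented `primary_verdict` values to a yes/no answer. The function name `is_ai_generated` and the choice to raise on an unrecognized value are illustrative assumptions, not part of the API.

```python
# The three clip-level labels named in the text above.
AI_VERDICTS = {"ai-vocal-music", "ai-instrumental"}
KNOWN_VERDICTS = AI_VERDICTS | {"not-ai-music"}


def is_ai_generated(primary_verdict: str) -> bool:
    """Return True when the clip-level verdict indicates AI-generated music.

    Hypothetical helper: raising on unknown labels is a design choice made
    here so that a new or misspelled verdict is not silently treated as
    "not AI".
    """
    if primary_verdict not in KNOWN_VERDICTS:
        raise ValueError(f"unrecognized primary_verdict: {primary_verdict!r}")
    return primary_verdict in AI_VERDICTS
```

Failing loudly on an unexpected label keeps downstream logic from misreading a future verdict value as a negative result.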

This is distinct from music detection, which classifies audio as music, speech, or neither. AI music detection answers a different question: is this music AI-generated?

Aspect	Batch	Streaming
Use case	Classify a complete audio file	Real-time per-window classification
Protocol	HTTP POST	WebSocket
Output	Clip-level verdict plus per-window breakdown	Per-window vocal AI results emitted progressively, final clip-level summary on completion
Instrumental AI detection	Included per window and clip-level	Clip-level only, in the final `done` message

For a side-by-side comparison with the other Modulate capabilities, see "Which API should I use?".

Authentication

Batch uses the X-API-Key header. Streaming uses an api_key query parameter at connection time. See Authentication and rate limits.
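
The difference between the two credential placements can be sketched as follows. This is only an illustration using the Python standard library: the endpoint URLs are hypothetical placeholders, the raw-bytes request body is an assumption, and nothing is sent over the network. Only the `X-API-Key` header name and the `api_key` query parameter come from the text above.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical placeholder endpoints; take the real URLs from the API reference.
BATCH_URL = "https://api.example.com/ai-music-detection/batch"
STREAM_URL = "wss://api.example.com/ai-music-detection/stream"


def build_batch_request(api_key: str, audio_bytes: bytes) -> Request:
    """Prepare (without sending) an HTTP POST that carries the key in a header.

    Assumption: the audio is passed as the raw request body. The real
    endpoint may expect a different encoding such as multipart form data.
    """
    return Request(
        BATCH_URL,
        data=audio_bytes,
        method="POST",
        headers={"X-API-Key": api_key},
    )


def build_streaming_url(api_key: str) -> str:
    """Build a WebSocket URL that carries the key as a query parameter."""
    return f"{STREAM_URL}?{urlencode({'api_key': api_key})}"
```

Because the streaming key travels in the URL, it is worth keeping that URL out of logs and error messages.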

Performance notes

Per-window results can be less accurate than the clip-level verdict and its confidence. Rely on the clip-level result when judging a whole song or segment.
Heavily processed or high-production tracks are sometimes mislabeled as AI-generated. This is a known gap targeted by future model updates.
