Velma-2 language detection identifies the spoken language of an audio file and returns a confidence score alongside the result. It analyzes up to the first 30 seconds of audio and responds synchronously — one POST in, one JSON response out.
Only the first 30 seconds of audio are analyzed. Longer files are accepted but the additional audio is ignored. For best results, ensure at least 3–5 seconds of clear speech in the first 30 seconds.

Make a request

curl -X POST https://modulate-developer-apis.com/api/velma-2-language-detection-batch \
  -H "X-API-Key: $MODULATE_API_KEY" \
  -F "upload_file=@audio.mp3"

Example response:

{
  "predicted_language": "English",
  "predicted_language_code": "en",
  "confidence": 0.9847,
  "duration_ms": 14253
}

Field | Type | Description
predicted_language | string | Human-readable language name, suitable for display to end users.
predicted_language_code | string | Lowercase ISO 639-1 code (e.g. "en", "fr", "zh"), suitable for routing or locale switching.
confidence | float | Probability for the predicted language, 0.0–1.0. Higher means more certain.
duration_ms | integer | Total audio duration in milliseconds. Only the first 30 seconds are analyzed regardless of this value.
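
For callers working in Python rather than curl, the same request might be sketched as follows using only the standard library. The endpoint, the X-API-Key header, and the upload_file form field come from the curl example; the helper names (build_multipart, detect_language, analyzed_ms), the 60-second timeout, and the guessed per-file Content-Type are assumptions of this sketch, not part of the documented API.

```python
import json
import mimetypes
import os
import urllib.request
import uuid

# Endpoint, header name, and form field name are copied from the curl example.
ENDPOINT = "https://modulate-developer-apis.com/api/velma-2-language-detection-batch"
ANALYSIS_WINDOW_MS = 30_000  # the docs state only the first 30 seconds are analyzed


def build_multipart(field_name, filename, payload):
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    # Assumption: send a guessed media type, falling back to a generic one.
    file_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {file_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def detect_language(audio_path, api_key, timeout=60):
    """POST an audio file and return the parsed JSON response as a dict."""
    with open(audio_path, "rb") as f:
        body, content_type = build_multipart(
            "upload_file", os.path.basename(audio_path), f.read()
        )
    request = urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": content_type},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def analyzed_ms(result):
    """Milliseconds of audio that were actually analyzed for this result."""
    return min(result["duration_ms"], ANALYSIS_WINDOW_MS)
```

A call such as detect_language("audio.mp3", os.environ["MODULATE_API_KEY"]) would return a dict with the four fields listed above. For the example response, analyzed_ms gives 14253, since the file is shorter than the 30-second window.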

Working with confidence scores

The confidence field is a probability — values close to 1.0 mean the model is highly certain, values close to 0.0 mean it could not commit to any language. A common pattern is to set a threshold below which you fall back to a default behavior:
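# `response` is the HTTP response to the POST request
# (for example, a requests.Response object).
result = response.json()

CONFIDENCE_THRESHOLD = 0.5  # example value; tune for your application

if result["confidence"] < CONFIDENCE_THRESHOLD:
    # Low confidence: fall back to a default or prompt the user.
    handle_unknown_language()  # placeholder for your fallback logic
else:
    # Placeholder for your per-language routing.
    route_to_language_pipeline(result["predicted_language_code"])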
There is no universally correct threshold — tune it based on your application’s tolerance for misclassification. If an audio file contains a language outside the supported set, the model returns the closest supported match, typically with low confidence.
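
One way to tune the threshold is to run a set of hand-labeled clips through the API and compare candidate thresholds. The sketch below is illustrative only: evaluate_thresholds is a hypothetical helper, and the sample data is invented, not measured output from the model. For each threshold it reports coverage (the share of clips accepted, meaning confidence at or above the threshold) and accuracy among the accepted clips.

```python
def evaluate_thresholds(samples, thresholds):
    """Report coverage and accepted-clip accuracy for each candidate threshold.

    samples: list of (true_code, predicted_code, confidence) tuples.
    """
    rows = []
    for t in thresholds:
        # A clip is accepted when its confidence is not below the threshold,
        # mirroring the `confidence < threshold` fallback check.
        accepted = [(true, pred) for true, pred, conf in samples if conf >= t]
        correct = sum(1 for true, pred in accepted if true == pred)
        coverage = len(accepted) / len(samples) if samples else 0.0
        accuracy = correct / len(accepted) if accepted else None
        rows.append({"threshold": t, "coverage": coverage, "accuracy": accuracy})
    return rows


# Hypothetical labeled validation data, invented for illustration.
samples = [
    ("en", "en", 0.98),
    ("fr", "fr", 0.91),
    ("es", "pt", 0.42),
    ("de", "de", 0.77),
    ("nl", "af", 0.55),
    ("ja", "ja", 0.35),
]

for row in evaluate_thresholds(samples, [0.3, 0.5, 0.7]):
    print(row)
```

On this invented data, raising the threshold from 0.5 to 0.7 lifts accuracy on accepted clips from 0.75 to 1.0 while coverage drops from about 0.67 to 0.5. That trade-off between misclassification and fallback rate is the one to weigh for your application.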

Supported languages

100 spoken languages are recognized: Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Cantonese, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Sanskrit, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Yiddish, Yoruba.
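
An application usually handles only a few of these languages, so a confident prediction can still name a language it has no pipeline for. A minimal routing sketch under that assumption follows; pick_pipeline, the pipeline names, and the default values are hypothetical. It treats the code as an opaque string rather than assuming a fixed length, since some listed languages (Cantonese and Hawaiian, for example) have no two-letter ISO 639-1 code.

```python
def pick_pipeline(result, pipelines, default_code="en", min_confidence=0.5):
    """Return the pipeline for the detected language, or the default one.

    pipelines: dict mapping language codes the application handles to a
    pipeline identifier. It must contain default_code.
    """
    code = result["predicted_language_code"]
    if result["confidence"] >= min_confidence and code in pipelines:
        return pipelines[code]
    # Low confidence, or a language the application does not handle.
    return pipelines[default_code]


# Hypothetical pipelines for an application that handles two languages.
pipelines = {"en": "english-asr", "fr": "french-asr"}

example = {
    "predicted_language": "English",
    "predicted_language_code": "en",
    "confidence": 0.9847,
    "duration_ms": 14253,
}
print(pick_pipeline(example, pipelines))
```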

API reference