Language Detection

Language Detection identifies the spoken language of an audio file and returns a confidence score alongside the result. It analyzes up to the first 30 seconds of audio and responds synchronously — one POST in, one JSON response out.

Only the first 30 seconds of audio are analyzed. Longer files are accepted, but audio beyond the 30-second mark is ignored. For best results, make sure the first 30 seconds contain at least 3–5 seconds of clear speech.
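Because anything past the 30-second mark is ignored, it can be worth trimming long recordings before upload to save bandwidth. The following is a minimal sketch using only Python's standard `wave` module. It assumes the source is an uncompressed PCM WAV file and that the endpoint accepts WAV uploads, which this guide does not confirm (the examples here use MP3). The helper name `trim_wav` is hypothetical.

```python
import wave

MAX_ANALYZED_SECONDS = 30  # analysis window stated in this guide


def trim_wav(src_path: str, dst_path: str, seconds: float = MAX_ANALYZED_SECONDS) -> int:
    """Copy at most the first `seconds` of a PCM WAV file; return frames written.

    Hypothetical helper: assumes uncompressed PCM WAV input.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        # Never request more frames than the file contains.
        frames_to_keep = min(src.getnframes(), int(seconds * src.getframerate()))
        audio = src.readframes(frames_to_keep)

    with wave.open(dst_path, "wb") as dst:
        # Reuse channel count, sample width, and sample rate from the source.
        # The frame count in the header is corrected when the data is written.
        dst.setparams(params)
        dst.writeframes(audio)

    return frames_to_keep
```

Files already shorter than the window are copied unchanged, so the helper can be applied to every upload without checking the duration first.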

Make a request

curl -X POST https://platform.modulate.ai/api/velma-2-language-detection-batch \
  -H "X-API-Key: $MODULATE_API_KEY" \
  -F "upload_file=@audio.mp3"

import os

import requests

# Open the file in a context manager so the handle is closed after the upload.
with open("audio.mp3", "rb") as audio_file:
    response = requests.post(
        "https://platform.modulate.ai/api/velma-2-language-detection-batch",
        headers={"X-API-Key": os.environ["MODULATE_API_KEY"]},
        files={"upload_file": audio_file},
    )
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
result = response.json()

print(f"Language: {result['predicted_language']} ({result['predicted_language_code']})")
print(f"Confidence: {result['confidence']:.4f}")

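import { readFile } from "node:fs/promises";

// Node 18+ provides fetch, FormData, and Blob as globals. The built-in fetch
// does not reliably serialize the "form-data" package, so build the body natively.
const form = new FormData();
form.append("upload_file", new Blob([await readFile("audio.mp3")]), "audio.mp3");

const response = await fetch(
  "https://platform.modulate.ai/api/velma-2-language-detection-batch",
  {
    method: "POST",
    // fetch derives the multipart Content-Type (with boundary) from the FormData body.
    headers: { "X-API-Key": process.env.MODULATE_API_KEY },
    body: form,
  }
);
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

const result = await response.json();
console.log(`Language: ${result.predicted_language} (${result.predicted_language_code})`);
console.log(`Confidence: ${result.confidence.toFixed(4)}`);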

Expected response

{
  "predicted_language": "English",
  "predicted_language_code": "en",
  "confidence": 0.9847,
  "duration_ms": 14253
}

Field	Type	Description
`predicted_language`	string	Human-readable language name — suitable for display to end users.
`predicted_language_code`	string	Lowercase ISO 639-1 code (e.g. `"en"`, `"fr"`, `"zh"`) — suitable for routing or locale switching.
`confidence`	float	Probability for the predicted language, 0.0–1.0. Higher means more certain.
`duration_ms`	integer	Total audio duration in milliseconds. Only the first 30 seconds are analyzed regardless of this value.
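To keep field names and types in one place, the response can be parsed into a small typed object. This is a sketch rather than part of any official SDK: the class name `DetectionResult`, the `analyzed_ms` property, and the range check are assumptions layered on the four fields documented above.

```python
from dataclasses import dataclass

ANALYSIS_WINDOW_MS = 30_000  # only the first 30 seconds are analyzed


@dataclass(frozen=True)
class DetectionResult:
    """Hypothetical wrapper around the documented response fields."""

    language: str        # predicted_language
    language_code: str   # predicted_language_code
    confidence: float    # confidence, expected in 0.0-1.0
    duration_ms: int     # total audio duration

    @classmethod
    def from_json(cls, payload: dict) -> "DetectionResult":
        confidence = float(payload["confidence"])
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence out of range: {confidence}")
        return cls(
            language=payload["predicted_language"],
            language_code=payload["predicted_language_code"],
            confidence=confidence,
            duration_ms=int(payload["duration_ms"]),
        )

    @property
    def analyzed_ms(self) -> int:
        """Portion of the audio that fell inside the analysis window."""
        return min(self.duration_ms, ANALYSIS_WINDOW_MS)


# Example using the sample response shown above.
sample = {
    "predicted_language": "English",
    "predicted_language_code": "en",
    "confidence": 0.9847,
    "duration_ms": 14253,
}
result = DetectionResult.from_json(sample)
print(result.language_code, result.analyzed_ms)
```

Comparing `analyzed_ms` with `duration_ms` is a quick way to see whether a file was longer than the analysis window.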

Working with confidence scores

The `confidence` field is a probability: values close to 1.0 mean the model is highly certain, and values close to 0.0 mean it could not commit to any language. A common pattern is to set a threshold below which you fall back to a default behavior:

result = response.json()

# handle_unknown_language and route_to_language_pipeline stand in for your own application code.
if result["confidence"] < 0.5:
    # Low confidence: fall back to a default or prompt the user
    handle_unknown_language()
else:
    route_to_language_pipeline(result["predicted_language_code"])

There is no universally correct threshold — tune it based on your application’s tolerance for misclassification. If an audio file contains a language outside the supported set, the model returns the closest supported match, typically with low confidence.
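One way to tune the threshold is to run a set of clips whose language you already know through the API, then check how many results each candidate threshold would accept and how many of those accepted results are wrong. The sketch below assumes you have collected such results yourself. The `Sample` type, the `sweep` function, and the sample values are illustrative placeholders, not measurements from the service.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Sample(NamedTuple):
    true_code: str        # language you know the clip is in
    predicted_code: str   # predicted_language_code returned by the API
    confidence: float     # confidence returned by the API


def sweep(samples: Iterable[Sample], thresholds: Iterable[float]) -> List[Tuple[float, float, float]]:
    """Return (threshold, accepted_fraction, error_rate_among_accepted) rows."""
    samples = list(samples)
    rows = []
    for t in thresholds:
        # Same rule as the snippet above: below the threshold means fall back.
        accepted = [s for s in samples if s.confidence >= t]
        wrong = sum(s.predicted_code != s.true_code for s in accepted)
        coverage = len(accepted) / len(samples) if samples else 0.0
        error_rate = wrong / len(accepted) if accepted else 0.0
        rows.append((t, coverage, error_rate))
    return rows


# Placeholder validation data, invented for illustration only.
validation = [
    Sample("en", "en", 0.98),
    Sample("fr", "fr", 0.91),
    Sample("es", "pt", 0.62),
    Sample("de", "de", 0.55),
    Sample("sw", "so", 0.31),
]

for threshold, coverage, error_rate in sweep(validation, [0.3, 0.5, 0.7]):
    print(f"threshold={threshold:.1f} accepted={coverage:.0%} errors={error_rate:.0%}")
```

Raising the threshold trades coverage for accuracy: fewer results are routed automatically, but fewer of the routed ones are misclassified. Pick the point on that curve that matches your application's tolerance.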

Supported languages

100 spoken languages are recognized: Afrikaans, Albanian, Amharic, Arabic, Armenian, Assamese, Azerbaijani, Bashkir, Basque, Belarusian, Bengali, Bosnian, Breton, Bulgarian, Cantonese, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Lao, Latin, Latvian, Lingala, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Myanmar, Nepali, Norwegian, Nynorsk, Occitan, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Sanskrit, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Spanish, Sundanese, Swahili, Swedish, Tagalog, Tajik, Tamil, Tatar, Telugu, Thai, Tibetan, Turkish, Turkmen, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, Yiddish, Yoruba.

API reference

Language Detection Batch — full parameter and response schema
