How to authenticate Modulate API requests, what rate limits apply, and how to handle auth and rate limit errors.
Modulate exposes two API surfaces with different authentication schemes. Most users will use the Models API; the Modulate Platform API is a higher-level orchestration surface covered at the bottom of this page.
All Models API endpoints require authentication via an API key. This section covers how to pass your key correctly for each endpoint type, what errors to expect when authentication fails, and how rate limiting works.
API keys can be generated in the web interface. API keys are tied to your organization and determine your access to models and your usage limits. These settings cannot be changed once an API key is generated.
Valid key, but model access is not enabled for your organization, or a usage limit has been exceeded
The Deepfake Detection batch endpoint returns 403 for both unauthorized keys and exceeded usage limits, rather than a separate 401. Check the detail field in the error response body to distinguish the cause.
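Because the status code alone does not identify the cause, it can help to surface the detail field in logs or error messages. The helper below is a minimal sketch: it assumes the error body is a JSON object with a top-level "detail" string, which this page does not spell out, and the function name is hypothetical. It makes no attempt to classify the message, since the exact wording is not documented here.

```python
import json
from typing import Optional


def read_error_detail(body_text: str) -> Optional[str]:
    """Return the "detail" field of an error response body, or None.

    Assumption: the body is a JSON object with a top-level "detail" string.
    """
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        # Body is not JSON (for example, an HTML error page from a proxy).
        return None
    if not isinstance(payload, dict):
        return None
    detail = payload.get("detail")
    return detail if isinstance(detail, str) else None


def describe_403(body_text: str) -> str:
    """Build a log-friendly message for a 403 response."""
    detail = read_error_detail(body_text)
    if detail is None:
        return "403 received; no detail field found in the response body"
    return f"403 received; detail: {detail}"
```

Log the returned message next to the request that triggered it, then compare the detail text with your organization's model access and usage settings.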
Two independent limits apply to each endpoint per organization:
Concurrent request limit — the number of requests or connections that can be in-flight simultaneously. Submitting a new request beyond this limit results in an immediate error rather than queuing.
Monthly usage limit — measured in audio hours processed. Once the monthly limit is reached, further requests are rejected until the limit resets.
Limit values are set per organization and per model. Contact your administrator to review or raise your limits.
Concurrent limit hit — wait a short interval (a few seconds) and retry. The limit frees up as in-flight requests complete.
Monthly limit hit — no retry will succeed until the monthly period resets. Contact your administrator, or update the limit in Organization settings.
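The two cases above call for different handling, which can be sketched as a small retry wrapper. This is an illustration under stated assumptions, not part of any Modulate SDK: ConcurrentLimitError and MonthlyLimitError are hypothetical exception types that your own client code would raise after inspecting the error response, and the delay values are arbitrary starting points.

```python
import time


class ConcurrentLimitError(Exception):
    """Hypothetical: raised by your client code when the concurrent limit is hit."""


class MonthlyLimitError(Exception):
    """Hypothetical: raised by your client code when the monthly limit is hit."""


def call_with_retry(send, max_attempts=5, base_delay=2.0, sleep=time.sleep):
    """Call send(), retrying only when the concurrent limit was hit.

    The wait grows linearly (base_delay, 2 * base_delay, ...) so that
    in-flight requests have time to complete. MonthlyLimitError is not
    caught: no retry can succeed before the monthly period resets.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except ConcurrentLimitError:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            sleep(base_delay * attempt)
```

Passing sleep as a parameter keeps the wrapper easy to test without real delays.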
For batch workloads where you control concurrency (e.g., processing a large file backlog), use a semaphore or connection pool to stay within your concurrent limit rather than relying on retry loops.
Python semaphore pattern
    import asyncio
    import aiohttp

    MAX_CONCURRENT = 5  # set to your organization's concurrent limit
    semaphore = asyncio.Semaphore(MAX_CONCURRENT)

    async def transcribe_file(session, filepath):
        async with semaphore:
            # ... your request here
            pass
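To see the pattern working end to end, the sketch below runs a whole backlog through one semaphore. It uses only the standard library: fake_request is a stand-in for the real HTTP call, and process_all, the file names, and the default limit of 5 are illustrative assumptions rather than Modulate-provided values.

```python
import asyncio

MAX_CONCURRENT = 5  # assumption: replace with your organization's concurrent limit


async def process_all(filepaths, handler, limit=MAX_CONCURRENT):
    """Run handler(path) for every path with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(path):
        # Each task waits here until one of the `limit` slots is free.
        async with semaphore:
            return await handler(path)

    # gather() returns results in the same order as the input paths.
    return await asyncio.gather(*(guarded(path) for path in filepaths))


async def fake_request(path):
    """Stand-in for the real API request; replace with your HTTP call."""
    await asyncio.sleep(0.01)
    return f"done:{path}"


if __name__ == "__main__":
    backlog = [f"clip_{i}.wav" for i in range(12)]
    results = asyncio.run(process_all(backlog, fake_request))
    print(len(results), "files processed")
```

Creating the semaphore inside process_all, rather than at import time, keeps it tied to the event loop that actually runs the tasks.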
The Modulate Platform API (cloud-processing-api.modulate.ai) is the orchestration surface for submitting jobs that combine transcription with optional analysis features (emotion, demographics, deepfake detection, behavioral insights). It uses a different auth scheme from the Models API.
The Platform API is currently in early access. Significant changes are planned before 1.0.0 — expect any part of the contract to change. See the API Reference → Modulate Platform API tab for the current schema.
The Platform API supports three job submission patterns through a single POST /api_service endpoint:
Real-time WebSocket — submit with submission_type: "realtime_websocket" and no files. The response returns a realtime_url for streaming via the Pipecat Client SDK.
Single-file batch — upload one audio file (≤ 5 MB) in a single POST.
Multi-file batch — POST each file in turn with the same job_id; set finalize_job: true on the final upload.
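For the multi-file pattern, the only bookkeeping is making sure every upload carries the same job_id and that only the last one sets finalize_job. A minimal sketch of that bookkeeping follows. The function name and the "file" key are hypothetical, how job_id is obtained is not covered here, and sending finalize_job as false on earlier uploads (rather than omitting it) is an assumption to check against the API Reference.

```python
def plan_multi_file_uploads(job_id, filepaths):
    """Return one dict of form fields per upload, in submission order.

    Only the final entry has finalize_job set to True.
    """
    if not filepaths:
        raise ValueError("at least one file is required")
    last_index = len(filepaths) - 1
    return [
        {
            "file": path,  # hypothetical key: the audio file to attach
            "job_id": job_id,
            "finalize_job": index == last_index,
        }
        for index, path in enumerate(filepaths)
    ]
```

Each returned entry corresponds to one POST /api_service call, sent in order.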
For all three patterns, poll GET /api_service/job_status/{job_id} until status='completed', then retrieve results from the response. See the API Reference → Modulate Platform API tab for full request/response schemas.
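The polling step can be sketched as a loop with a fixed interval and an overall timeout. This is a minimal illustration, not a definitive client: fetch_status is a hypothetical callable that you supply, which performs the authenticated GET on /api_service/job_status/{job_id} and returns the parsed JSON body as a dict. The interval and timeout defaults are arbitrary, and the sketch only recognizes the documented 'completed' status. If the API reports failure states, handle them as described in the API Reference.

```python
import time


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=300.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reports status == "completed", then return the payload.

    fetch_status(job_id) is supplied by the caller (hypothetical helper).
    Raises TimeoutError if the job does not complete within `timeout` seconds.
    """
    deadline = clock() + timeout
    while True:
        payload = fetch_status(job_id)
        if payload.get("status") == "completed":
            return payload
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} not completed after {timeout} seconds")
        sleep(interval)
```

Keeping the HTTP call behind fetch_status means the loop does not depend on any particular HTTP library or auth scheme.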