PII/PHI Redaction

Modulate offers two approaches to PII/PHI handling depending on what you need:

	PII/PHI Tagging	PII/PHI Redaction
How to use	`pii_phi_tagging=true` on any Transcription endpoint	Dedicated Redaction API
Transcript	Sensitive spans wrapped in entity tags	Sensitive spans replaced with empty marker tags
Audio	Original audio unchanged	Sensitive audio ranges silenced
Use when	You need sensitive spans marked in the transcript but the original audio kept	You need both transcript and audio sanitized
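
To make the transcript difference in the table concrete, the sketch below converts a tagged transcript into the empty-marker form. It assumes, purely for illustration, that tagging wraps spans in the same tag names the Redaction API uses for its markers (for example <pii:name>Jane Doe</pii:name>). The sample sentence and the helper name to_marker_form are hypothetical, and this is not a substitute for the Redaction API, which also silences the audio.

```python
import re

# Assumed tag grammar: <phi>...</phi> or <pii:CATEGORY>...</pii:CATEGORY>.
ENTITY_TAG = re.compile(r"<(phi|pii:[\w-]+)>.*?</\1>", re.DOTALL)


def to_marker_form(tagged_text: str) -> str:
    """Empty every entity tag, leaving the surrounding text untouched."""
    return ENTITY_TAG.sub(r"<\1></\1>", tagged_text)


# Hypothetical tagged transcript, invented for this example.
tagged = (
    "My name is <pii:name>Jane Doe</pii:name> "
    "and my SSN is <pii:ssn>123-45-6789</pii:ssn>."
)
print(to_marker_form(tagged))
# My name is <pii:name></pii:name> and my SSN is <pii:ssn></pii:ssn>.
```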

The examples on this page use the Redaction APIs.

Batch

Send a complete audio file. The response is multipart/form-data with two parts: metadata (JSON transcript) and audio (redacted MP3).

curl -X POST https://platform.modulate.ai/api/velma-2-pii-phi-redaction-batch \
  -H "X-API-Key: $MODULATE_API_KEY" \
  -F "upload_file=@audio.mp3" \
  -F "speaker_diarization=true" \
  -o response.multipart \
  -D response_headers.txt

import os, json, requests
from requests_toolbelt.multipart.decoder import MultipartDecoder

# Open the file in a context manager so the handle is closed after the upload.
with open("audio.mp3", "rb") as audio_file:
    response = requests.post(
        "https://platform.modulate.ai/api/velma-2-pii-phi-redaction-batch",
        headers={"X-API-Key": os.environ["MODULATE_API_KEY"]},
        data={"speaker_diarization": "true"},
        files={"upload_file": audio_file},
    )
response.raise_for_status()

# The body is multipart: a "metadata" part (JSON) and an "audio" part (MP3).
decoder = MultipartDecoder.from_response(response)
for part in decoder.parts:
    disposition = part.headers.get(b"Content-Disposition", b"").decode()
    if 'name="metadata"' in disposition:
        metadata = json.loads(part.content)
        print(metadata["text"])
    elif 'name="audio"' in disposition:
        with open("redacted.mp3", "wb") as f:
            f.write(part.content)
        print(f"Saved redacted audio ({len(part.content)} bytes)")

Install requests-toolbelt to decode the multipart response: pip install requests-toolbelt.
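
The curl command above writes the raw body to response.multipart and the headers to response_headers.txt. If requests-toolbelt is not available, the body can be split by hand. The following is a minimal standard-library sketch, not an official client: the helper names boundary_from_headers and split_multipart are hypothetical, and it assumes CRLF line endings (as multipart bodies normally use) and the part names metadata and audio described above. The demo runs on a tiny in-memory body rather than a real response.

```python
import json
import re


def boundary_from_headers(header_text: str) -> bytes:
    """Pull the multipart boundary out of a saved header dump (curl -D)."""
    match = re.search(r'(?im)^content-type:.*?boundary="?([^";\r\n]+)', header_text)
    if match is None:
        raise ValueError("no multipart boundary found in headers")
    return match.group(1).strip().encode()


def split_multipart(body: bytes, boundary: bytes) -> dict[str, bytes]:
    """Return {part name: raw content} for a multipart/form-data body."""
    delimiter = b"\r\n--" + boundary
    parts = {}
    # Prefix CRLF so the first delimiter matches the same way as the others.
    for chunk in (b"\r\n" + body).split(delimiter)[1:]:
        if chunk.startswith(b"--"):
            break  # closing delimiter: "--boundary--"
        head, _, content = chunk.partition(b"\r\n\r\n")
        name = re.search(rb'\bname="([^"]+)"', head)
        if name:
            parts[name.group(1).decode()] = content
    return parts


# In-memory stand-ins for response_headers.txt and response.multipart.
sample_headers = "HTTP/1.1 200 OK\nContent-Type: multipart/form-data; boundary=demo123\n"
sample_body = (
    b'--demo123\r\nContent-Disposition: form-data; name="metadata"\r\n\r\n'
    b'{"text": "My name is <pii:name></pii:name>."}\r\n'
    b'--demo123\r\nContent-Disposition: form-data; name="audio"\r\n\r\n'
    b"\xff\xfbfake-mp3-bytes\r\n--demo123--\r\n"
)

sample_parts = split_multipart(sample_body, boundary_from_headers(sample_headers))
print(json.loads(sample_parts["metadata"])["text"])
print(len(sample_parts["audio"]), "audio bytes")
```

To use it on the curl output, read response_headers.txt as text and response.multipart as bytes, then pass them to the two helpers in the same way.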

Expected metadata response

{
  "text": "My name is <pii:name></pii:name> and my SSN is <pii:ssn></pii:ssn>.",
  "duration_ms": 5600,
  "redaction_ranges": [[1100, 1800], [3200, 4400]],
  "utterances": [
    {
      "utterance_uuid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
      "start_ms": 0,
      "duration_ms": 5600,
      "speaker": 1,
      "language": "en",
      "text": "My name is <pii:name></pii:name> and my SSN is <pii:ssn></pii:ssn>."
    }
  ]
}

Each detected span in the transcript is replaced with an empty marker tag, <phi></phi> for health information or <pii:CATEGORY></pii:CATEGORY> for personal information (for example <pii:name></pii:name> or <pii:ssn></pii:ssn>), and the surrounding text is preserved. The marker tags correspond directly to the redaction_ranges that are silenced in the audio.

Batch accepts common audio formats (MP3, WAV, FLAC, MP4, OGG, and more); see Audio formats.
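
As a small illustration of how the transcript markers and redaction_ranges line up, the sketch below lists the markers in the sample metadata and totals the silenced time. It assumes each range is a [start_ms, end_ms] pair, which is consistent with the sample response but not stated explicitly on this page, and that markers and ranges appear in the same order. The helper names are hypothetical.

```python
import re

# Empty marker tags: <phi></phi> or <pii:CATEGORY></pii:CATEGORY>.
MARKER = re.compile(r"<(phi|pii:[\w-]+)></\1>")

# Trimmed copy of the sample metadata response.
metadata = {
    "text": "My name is <pii:name></pii:name> and my SSN is <pii:ssn></pii:ssn>.",
    "duration_ms": 5600,
    "redaction_ranges": [[1100, 1800], [3200, 4400]],
}


def marker_categories(text: str) -> list[str]:
    """Return the marker names in transcript order, e.g. 'pii:name'."""
    return MARKER.findall(text)


def redacted_ms(ranges: list[list[int]]) -> int:
    """Total silenced time, assuming [start_ms, end_ms] pairs."""
    return sum(end - start for start, end in ranges)


categories = marker_categories(metadata["text"])
total = redacted_ms(metadata["redaction_ranges"])

# Assumption: the n-th marker corresponds to the n-th range.
for category, (start, end) in zip(categories, metadata["redaction_ranges"]):
    print(f"{category}: silenced {start}-{end} ms")
print(f"{total} ms of {metadata['duration_ms']} ms silenced")
# pii:name: silenced 1100-1800 ms
# pii:ssn: silenced 3200-4400 ms
# 1900 ms of 5600 ms silenced
```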

Streaming (WebSocket)

Connect over WebSocket and receive redacted utterances and silenced MP3 clips as each utterance completes. The stream delivers two message types interleaved:

JSON text frames — utterance messages with the redacted transcript text
Binary frames — MP3 clips with the silenced audio for each utterance

websocat "wss://platform.modulate.ai/api/velma-2-pii-phi-redaction-streaming?api_key=$MODULATE_API_KEY&speaker_diarization=true" \
  --binary - < audio.mp3

import os, asyncio, json, websockets

API_KEY = os.environ["MODULATE_API_KEY"]
AUDIO_FILE = "audio.mp3"
CHUNK_SIZE = 4096

async def redact():
    url = (
        f"wss://platform.modulate.ai/api/velma-2-pii-phi-redaction-streaming"
        f"?api_key={API_KEY}&speaker_diarization=true"
    )
    audio_clips = []

    async with websockets.connect(url) as ws:
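        async def send():
            # Stream the file in fixed-size binary frames.
            with open(AUDIO_FILE, "rb") as f:
                while chunk := f.read(CHUNK_SIZE):
                    await ws.send(chunk)
            # An empty text frame follows the last chunk to mark the end of the audio.
            await ws.send("")

        async def receive():
            is_done = False
            async for message in ws:
                # Binary frames are MP3 clips; text frames are JSON messages.
                if isinstance(message, bytes):
                    audio_clips.append(message)
                    # After "done", the only clip still expected is the trailing one.
                    if is_done:
                        break
                    continue
                msg = json.loads(message)
                if msg["type"] == "utterance":
                    u = msg["utterance"]
                    print(f"[{u['start_ms']}ms] Speaker {u['speaker']}: {u['text']}")
                elif msg["type"] == "done":
                    is_done = True
                    # Stop now unless a trailing clip was announced.
                    if not msg.get("trailing_redacted_audio"):
                        break
                elif msg["type"] == "error":
                    raise RuntimeError(msg["error"])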

        await asyncio.gather(send(), receive())

    if audio_clips:
        with open("redacted.mp3", "wb") as f:
            for clip in audio_clips:
                f.write(clip)

asyncio.run(redact())

Example messages received

{ "type": "utterance", "utterance": { "utterance_uuid": "a1b2c3d4-e5f6-7890-abcd-ef1234567890", "start_ms": 0, "duration_ms": 5600, "speaker": 1, "language": "en", "text": "My name is <pii:name></pii:name> and my SSN is <pii:ssn></pii:ssn>." }, "redacted_audio": { "start_ms": 0, "duration_ms": 5600 } }
<binary MP3 clip>
{ "type": "done", "duration_ms": 5600, "trailing_redacted_audio": null }
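
The message sequence above can be handled with a small pairing step that attaches each binary clip to the utterance announced just before it. The sketch below works on an in-memory list of frames, so it runs without a connection. It assumes, based only on the example messages, that an utterance's clip arrives as the next binary frame after its JSON message; pair_clips is a hypothetical helper name.

```python
import json


def pair_clips(frames):
    """Pair each utterance message with the binary frame that follows it.

    `frames` is an iterable of str (JSON text frames) and bytes (MP3 clips).
    Returns a list of (utterance dict, clip bytes or None) tuples.
    """
    paired = []
    pending = None  # utterance still waiting for its clip
    for frame in frames:
        if isinstance(frame, bytes):
            if pending is not None:
                paired.append((pending, frame))
                pending = None
            continue  # clips with no pending utterance are ignored in this sketch
        msg = json.loads(frame)
        if msg["type"] == "utterance":
            if pending is not None:
                paired.append((pending, None))  # previous utterance had no clip
            pending = msg["utterance"]
        elif msg["type"] == "error":
            raise RuntimeError(msg["error"])
        elif msg["type"] == "done":
            break
    if pending is not None:
        paired.append((pending, None))
    return paired


# Frames modelled on the example messages; the clip bytes are placeholders.
frames = [
    json.dumps({
        "type": "utterance",
        "utterance": {"start_ms": 0, "duration_ms": 5600, "speaker": 1,
                      "text": "My name is <pii:name></pii:name>."},
        "redacted_audio": {"start_ms": 0, "duration_ms": 5600},
    }),
    b"\xff\xfbplaceholder-mp3-clip",
    json.dumps({"type": "done", "duration_ms": 5600, "trailing_redacted_audio": None}),
]

pairs = pair_clips(frames)
for utterance, clip in pairs:
    size = len(clip) if clip is not None else 0
    print(f"[{utterance['start_ms']}ms] {utterance['text']} ({size} clip bytes)")
```

Keeping clips paired with their utterances makes it possible to store or review audio per utterance instead of only as one concatenated file.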

API reference

PII/PHI Redaction Batch — full parameter and response schema
PII/PHI Redaction Streaming — WebSocket protocol, message format, close codes
