Real-time trust & safety for live social audio

Voice chat in games, social apps, and live audio platforms generates more audio than any trust and safety team can review. The incidents that cause the most harm — harassment, hate speech, and child safety violations — happen in real time and require real-time detection to be actionable. This playbook deploys Velma’s full safety behavior catalog against a live audio stream, adds platform-specific custom behaviors on top, and shows how to route detections to the right response — automated mute, human escalation, or incident logging.

What this configuration detects

Identity-based harm

Preset	What it catches
`misogyny`	Language propagating negative stereotypes about gender identity or presentation
`racism`	Language targeting racial, national, or ethnic identity
`homophobia`	Language marginalizing people based on sexuality
`transphobia`	Language marginalizing people based on transgender identity
`xenophobia`	Hostility or exclusion based on nationality, culture, or religion
`ableism`	Marginalization based on disability, impairment, or neurodivergence
`sizeism`	Stigmatization based on body size or weight

Harassment and violence

Preset	What it catches
`harassment`	Targeted, repeated hostile behavior toward a participant
`hate`	Broad hate speech not captured by the identity-specific behaviors
`hateful-or-violent-ideology-propagation`	Advocacy for hateful or violent ideologies
`sexual-harassment`	Unwanted sexualized speech or advances
`violent-graphic-material`	Graphic descriptions of violence
`sexually-graphic-material`	Explicit sexual content

Welfare and child safety

Preset	What it catches
`child-safety-violation`	Content or behavior that puts minors at risk
`suicidal-and-self-injurious-ideation`	Expression of suicidal or self-harming intent
`self-harm-and-self-injury-glorification`	Content that normalizes or encourages self-harm

Configuration

{
  "conversation_types": [
    {
      "conversation_type_uuid": "11111111-1111-4111-8111-111111111016",
      "name": "Social Audio Session",
      "short_description": "A live voice conversation between participants on a social or gaming platform.",
      "detailed_description": "Participants are users of an online platform communicating in real time. They may know each other or be matched strangers. There is no professional relationship and no moderation present during the conversation."
    }
  ],
  "participant_roles": [
    {
      "participant_role_uuid": "22222222-2222-4222-8222-222222222009",
      "name": "Participant",
      "short_description": "A user in the session.",
      "detailed_description": "Any user participating in the voice session. All participants are treated equally — there is no host, moderator, or authority figure present in the conversation."
    }
  ],
  "behaviors": [
    "preset:harassment",
    "preset:hate",
    "preset:hateful-or-violent-ideology-propagation",
    "preset:sexual-harassment",
    "preset:violent-graphic-material",
    "preset:sexually-graphic-material",
    "preset:child-safety-violation",
    "preset:suicidal-and-self-injurious-ideation",
    "preset:self-harm-and-self-injury-glorification",
    "preset:misogyny",
    "preset:racism",
    "preset:homophobia",
    "preset:transphobia",
    "preset:xenophobia",
    "preset:ableism",
    "preset:sizeism",
    {
      "behavior_uuid": "<generate-a-uuid>",
      "name": "Cheating Discussion",
      "short_description": "Participants discuss using exploits or unauthorized tools to gain advantage.",
      "detailed_description": "This behavior is present if the speech meets all of the following criteria: the speech contains a reference to using third-party software, aimbots, wallhacks, speed hacks, or other unauthorized modifications to gain a competitive advantage; the reference is in the first person or is directed at another participant as a suggestion or instruction. Do not flag if the speaker is reporting another player's behavior to a moderator or describing the cheating in clearly hypothetical or educational terms."
    },
    {
      "behavior_uuid": "<generate-a-uuid>",
      "name": "Personal Information Solicitation",
      "short_description": "A participant attempts to collect another participant's real-world personal information.",
      "detailed_description": "This behavior is present if the speech contains a direct request for information that would identify a participant outside the platform, including full name, home address, phone number, school name, or social media handles not already shared publicly. The request must be directed at another participant in the second person. Do not flag requests for in-game usernames, platform display names, or game-related identifiers."
    }
  ],
  "stt": {
    "speaker_diarization": true,
    "pii_phi_tagging": true
  },
  "produce_topics": false,
  "produce_topic_sentiments": false,
  "produce_summary": false
}

`produce_topics`, `produce_topic_sentiments`, and `produce_summary` are disabled here to reduce latency on live streams where those outputs are not needed. Enable them if you want session-level reporting alongside real-time moderation.
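
The two custom behaviors in the configuration carry a `<generate-a-uuid>` placeholder. A minimal sketch of filling those placeholders before sending the config, assuming any version 4 UUID is acceptable as a `behavior_uuid` (this page does not state the required UUID version); `fill_behavior_uuids` is a hypothetical helper, not part of any Velma SDK:

```python
import copy
import uuid

PLACEHOLDER = "<generate-a-uuid>"  # the literal placeholder used in the config above


def fill_behavior_uuids(config: dict) -> dict:
    """Return a copy of the config with placeholder behavior UUIDs replaced."""
    filled = copy.deepcopy(config)
    for behavior in filled.get("behaviors", []):
        # Preset behaviors are plain strings such as "preset:harassment"; skip them.
        if isinstance(behavior, dict) and behavior.get("behavior_uuid") == PLACEHOLDER:
            # Assumption: a random (version 4) UUID is a valid behavior_uuid.
            behavior["behavior_uuid"] = str(uuid.uuid4())
    return filled


example = {
    "behaviors": [
        "preset:harassment",
        {"behavior_uuid": PLACEHOLDER, "name": "Cheating Discussion"},
    ]
}
ready = fill_behavior_uuids(example)
```

If you want detections for a custom behavior to line up across sessions, it is probably worth generating each UUID once and storing it with your config rather than regenerating it on every connection.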

Code example

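This example routes detections to three response tiers: automated action for the highest-severity behaviors, a human review queue for moderate signals, and log-only for everything else. Every detection is recorded as an incident regardless of tier, and any behavior missing from the routing table (for example, the custom Cheating Discussion behavior) falls back to log-only.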

Python

import os, json, asyncio, websockets
from enum import Enum

class ResponseTier(Enum):
    AUTOMATED = "automated"   # mute, kick, or escalate immediately
    REVIEW = "review"         # queue for human moderator
    LOG = "log"               # record for pattern analysis

RESPONSE_ROUTING = {
    "Child Safety Violation": ResponseTier.AUTOMATED,
    "Suicidal and Self Injurious Ideation": ResponseTier.AUTOMATED,
    "Harassment": ResponseTier.AUTOMATED,
    "Racism": ResponseTier.REVIEW,
    "Misogyny": ResponseTier.REVIEW,
    "Homophobia": ResponseTier.REVIEW,
    "Transphobia": ResponseTier.REVIEW,
    "Xenophobia": ResponseTier.REVIEW,
    "Ableism": ResponseTier.REVIEW,
    "Sizeism": ResponseTier.REVIEW,
    "Hate": ResponseTier.REVIEW,
    "Sexual Harassment": ResponseTier.AUTOMATED,
    "Personal Information Solicitation": ResponseTier.AUTOMATED,
}

config = { ... }  # paste your BatchConfig here

async def moderate_session(session_id: str, audio_source):
    url = f"wss://modulate-developer-apis.com/api/velma-2-streaming?api_key={os.environ['MODULATE_API_KEY']}"
    incidents = []

    async with websockets.connect(url) as ws:
        await ws.send(json.dumps(config))

        async def send_audio():
            async for chunk in audio_source:
                await ws.send(chunk)
            await ws.send("")

        send_task = asyncio.create_task(send_audio())

        try:
            async for message in ws:
                event = json.loads(message)

                if event["type"] == "behavior_detection":
                    d = event["detection"]
                    if not d["detected"]:
                        continue

                    tier = RESPONSE_ROUTING.get(d["behavior_name"], ResponseTier.LOG)
                    incident = {
                        "session_id": session_id,
                        "behavior": d["behavior_name"],
                        "speaker": d["speaker_label"],
                        "confidence": d["confidence"],
                        "evidence_clips": d["evidence_clip_uuids"],
                        "tier": tier.value,
                    }
                    incidents.append(incident)

                    if tier == ResponseTier.AUTOMATED:
                        print(f"[AUTO ACTION] {d['behavior_name']} — speaker {d['speaker_label']} in session {session_id}")
                        # trigger mute, kick, or escalation via your platform API

                    elif tier == ResponseTier.REVIEW:
                        print(f"[REVIEW QUEUE] {d['behavior_name']} — session {session_id}")
                        # enqueue for human moderator

                elif event["type"] == "done":
                    print(f"[DONE] Session {session_id} complete — {len(incidents)} incidents logged")
                    break

        finally:
            if not send_task.done():
                send_task.cancel()

    return incidents
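
`moderate_session` expects `audio_source` to be an async iterable of byte chunks. A minimal sketch of one such source for local testing, reading a file in fixed-size pieces; `file_audio_source`, the chunk size, and the pacing delay are all hypothetical, and this page does not state which audio formats or chunk sizes the streaming endpoint accepts:

```python
import asyncio
from typing import AsyncIterator


async def file_audio_source(
    path: str, chunk_size: int = 4096, delay: float = 0.0
) -> AsyncIterator[bytes]:
    """Yield a file's bytes in fixed-size chunks, optionally pausing between them."""
    # The file read is blocking; that is acceptable for a local test,
    # but a production source would normally read from a live audio pipeline.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk
            # A non-zero delay roughly imitates real-time pacing of a live stream.
            await asyncio.sleep(delay)
```

With a source like this, a test run could look like `asyncio.run(moderate_session("session-123", file_audio_source("sample.wav")))`, where the session ID and file name are placeholders.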

Reading the output

`detected: true` with a null `confidence` score means Velma made a definitive determination without a probabilistic model. Treat this the same as a high-confidence detection.

`skipped: true` means Velma did not attempt detection for that behavior in this session. This is normal for behaviors that require a minimum amount of audio or a specific conversational context. Check `skip_reason` to understand why.

`evidence_clip_uuids` gives you the specific clips to surface in a moderator review or appeals process. Store these alongside the incident record so reviewers can listen to exactly what triggered the detection.

`pii_phi_tagging` is enabled in this config. Any real-world personal information detected in the transcript is wrapped in entity tags rather than appearing in plain text in clip events, which is useful if you are storing transcripts.
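
The rules for `detected`, `confidence`, and `skipped` can be folded into one helper. This is a sketch under assumptions: that `skipped` appears on the same detection object as `detected`, and that `confidence` is a number between 0 and 1. Neither is confirmed on this page, and `effective_confidence`, `should_act`, and the 0.8 threshold are hypothetical.

```python
from typing import Optional


def effective_confidence(detection: dict) -> Optional[float]:
    """Return a score to threshold on, or None if there is nothing to act on."""
    if detection.get("skipped"):
        # Velma did not attempt this behavior; inspect skip_reason separately.
        return None
    if not detection.get("detected"):
        return None
    confidence = detection.get("confidence")
    # A null confidence on a positive detection is treated as high confidence.
    return 1.0 if confidence is None else float(confidence)


def should_act(detection: dict, threshold: float = 0.8) -> bool:
    """True when the detection is positive and meets the (assumed) threshold."""
    score = effective_confidence(detection)
    return score is not None and score >= threshold
```

A check like `should_act(d)` could gate the automated tier in the routing loop, so that lower-confidence detections fall through to review or logging instead.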

Adding platform-specific behaviors

The pre-built catalog covers the broad safety signals. The behaviors most specific to your platform, the ones that reflect your community standards, your game’s context, or your TOS, need to be custom. A few patterns that work well for this use case:

Scope behaviors to the conversation type. If a behavior only applies to competitive multiplayer sessions and not casual lobbies, set `applies_to_conversation_type_uuids` to scope it.

Describe what the behavior looks like in your context. “Hate speech” is broad. “References to a player’s in-game nation or faction as a stand-in for real-world ethnicity” is specific to your game and far more precise.

Write negation criteria for in-game language. Many games have violent themes, dark humor, or in-universe slurs. Define those explicitly in negation criteria so Velma doesn’t conflate in-character speech with real-world harm.
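
A minimal sketch of the scoping and negation patterns, assuming `applies_to_conversation_type_uuids` sits at the top level of a custom behavior object and takes a list of conversation type UUIDs (this page names the field but does not show its placement). `build_scoped_behavior` and the example wording are hypothetical:

```python
import uuid
from typing import List

# The conversation type UUID defined in the configuration on this page.
SOCIAL_AUDIO_SESSION_UUID = "11111111-1111-4111-8111-111111111016"


def build_scoped_behavior(
    name: str,
    short_description: str,
    criteria: str,
    negations: List[str],
    conversation_type_uuids: List[str],
) -> dict:
    """Assemble a custom behavior with explicit negation criteria and a scope."""
    detailed = criteria.strip()
    if negations:
        # Mirror the "Do not flag if ..." phrasing used by the behaviors in the config.
        detailed += " Do not flag if " + " or ".join(negations) + "."
    return {
        "behavior_uuid": str(uuid.uuid4()),
        "name": name,
        "short_description": short_description,
        "detailed_description": detailed,
        # Assumed placement and shape of the scoping field.
        "applies_to_conversation_type_uuids": list(conversation_type_uuids),
    }


behavior = build_scoped_behavior(
    name="Faction Used As Ethnic Stand-In",
    short_description="A participant uses an in-game faction name to demean a real-world ethnicity.",
    criteria="This behavior is present if the speech uses an in-game nation or faction name as a stand-in for a real-world ethnicity in a demeaning way.",
    negations=[
        "the faction name is used only to describe in-game events",
        "the speaker is quoting in-universe dialogue",
    ],
    conversation_type_uuids=[SOCIAL_AUDIO_SESSION_UUID],
)
```

The resulting dictionary could be appended to the `behaviors` array alongside the presets. Keeping the negations as a separate list makes it easier to review the in-game exceptions on their own.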

Related

Custom behaviors: adding platform-specific rules
Best practices: writing criteria that hold up across diverse speech
Capabilities: full event schema including skip and error handling
