The quality of your behavior detection is directly tied to the quality of your detailed_description. Velma reads your description as instructions — the more precisely you define the signal, the more reliably it identifies it.

Start with the title and short description

Before writing your detailed_description, settle the name (the behavior’s title) and the short_description. These are more than labels: they set the frame for everything that follows.
  • name — keep it under five words. It should be specific enough that someone unfamiliar with your setup understands immediately what the behavior catches. “Aggressive Language” is okay. “Customer Threat Toward Agent” is better.
  • short_description — one sentence, the dictionary definition of the behavior. If you can’t summarize it in one sentence, the behavior is probably trying to do too much. Consider splitting it.
Once these are locked, treat them as the reference definition: as you write the detailed_description, check each criterion against them.
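The two constraints above (a name under five words, a one-sentence short_description) are mechanical enough to check before you submit a behavior. The following is a minimal Python sketch. The function name and the sentence-splitting heuristic are assumptions made for illustration and are not part of Velma's API, and the sample short_description is invented.

```python
import re


def check_name_and_short_description(name: str, short_description: str) -> list[str]:
    """Return a list of problems; an empty list means both fields pass."""
    problems = []

    # "Under five words" means at most four whitespace-separated tokens.
    if len(name.split()) >= 5:
        problems.append("name should be under five words")

    # Heuristic: split after ., ! or ? when followed by whitespace.
    # Abbreviations such as "e.g." will be miscounted as sentence ends.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", short_description.strip()) if s]
    if len(sentences) != 1:
        problems.append("short_description should be exactly one sentence")

    return problems


print(check_name_and_short_description(
    "Customer Threat Toward Agent",
    "Customer threatens a negative action against the agent unless a demand is met.",
))  # prints []
```

A check like this only catches length problems. Whether the name is specific enough still needs a human read.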

Write criteria, not examples

Your detailed_description should be a list of explicit conditions that must be met for the behavior to fire. This is not a list of examples. Examples narrow down what the system looks for and produce unreliable results when the audio doesn’t match them closely. Criteria define the underlying structure the examples would satisfy.
Do this:
This behavior is present if the speech meets ALL of the following criteria:
- The speech features a conditional statement where a negative action is tied to compliance with a demand.
- The demand is made in the first or second person voice.
- The speech is not a mutual negotiation where the other party is invited to propose an alternative.
Not this:
Examples of this behavior include: "If you don't fix this I'll leave a one-star review", 
"I'll tell all my friends to cancel", "You'll be hearing from my lawyer."

Ground criteria in concrete language

Vague terms like “insults,” “hateful language,” or “inappropriate content” require Velma to make its own interpretive judgment, and that judgment may not match what you intended. Ground your criteria in concrete linguistic elements: sentence structure, subject-verb relationships, tense, person, and specific word types.
Vague:
The speech must contain hateful or insulting language.
Concrete:
The speech must contain at least one sentence where the subject is described using 
adjectives with clearly negative connotations, or where the subject is the target 
of a threatened harmful action expressed in future tense.
The closer your criteria come to a grammar or structure check, and the further they move from a social-context judgment, the more consistent your results will be.

Don’t restate the title in the description

If your behavior is named “Coercion Manipulation” and your description says “identify patterns of harassment and intimidation,” you haven’t told Velma anything it didn’t already infer from the title. The detailed_description must go deeper — into specific structural and linguistic markers — not sideways into synonyms. Whenever you find yourself using words that mean the same thing as the behavior name, that is a signal the criteria need more work.

Specify who can trigger a behavior

Many behaviors are only meaningful when a specific participant exhibits them. A question like “Can you confirm the account number?” is routine from an agent, but a potential fraud signal from a customer. Without scoping, you will generate false positives. Use applies_to_participant_role_uuids to restrict detection to specific roles:
{
  "behavior_uuid": "your-uuid-here",
  "name": "Account Verification Reversal",
  "short_description": "Customer attempts to solicit account verification from the agent.",
  "detailed_description": "...",
  "applies_to_participant_role_uuids": ["22222222-2222-4222-8222-222222222017"]
}
If role-based scoping alone isn’t sufficient, you can also encode the role context directly in the detailed_description:
The speech must feature a request for personal account information (such as account number, 
date of birth, or security question answers) made by the customer — not the agent. 
Do not flag if the request originates from an agent following a verification procedure.
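Role scoping depends on the UUIDs in applies_to_participant_role_uuids being well formed, so a local sanity check before submission can catch typos early. The sketch below assumes the behavior is held as a Python dictionary shaped like the JSON above. malformed_role_uuids is a hypothetical helper, not part of Velma's API, and it only checks format, not whether the role exists in your account.

```python
import uuid


def malformed_role_uuids(behavior: dict) -> list:
    """Return entries of applies_to_participant_role_uuids that do not parse as UUIDs."""
    malformed = []
    for value in behavior.get("applies_to_participant_role_uuids", []):
        try:
            # uuid.UUID is lenient: it also accepts braces and unhyphenated hex.
            uuid.UUID(value)
        except (ValueError, TypeError, AttributeError):
            malformed.append(value)
    return malformed


behavior = {
    "behavior_uuid": "your-uuid-here",
    "name": "Account Verification Reversal",
    "short_description": "Customer attempts to solicit account verification from the agent.",
    "detailed_description": "...",
    "applies_to_participant_role_uuids": ["22222222-2222-4222-8222-222222222017"],
}

print(malformed_role_uuids(behavior))  # prints []
```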

Include negation criteria

Once you have defined what the behavior catches, spend equal time on what it should not catch. Negation criteria prevent false positives from content that superficially resembles the target signal but isn’t what you’re after. A useful technique: think of the most common thing that looks like the behavior but clearly isn’t, then write a rule that excludes it.
This behavior should NOT be flagged if:
- The statement is in the past tense and references a third party not present in the conversation.
- The statement is a scripted disclosure read by the agent at the start of the call.
- The speaker explicitly acknowledges the refusal and disengages.
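Keeping positive criteria and negation criteria as separate lists makes it easier to review each side on its own before combining them into one detailed_description. A minimal Python sketch, assuming the two headings used in this guide; render_detailed_description is a hypothetical helper, and pairing these particular criteria with these negations is only illustrative.

```python
def render_detailed_description(criteria: list[str], negations: list[str]) -> str:
    """Combine positive criteria and negation criteria into one description string."""
    lines = ["This behavior is present if the speech meets ALL of the following criteria:"]
    lines += [f"- {criterion}" for criterion in criteria]
    if negations:
        lines.append("This behavior should NOT be flagged if:")
        lines += [f"- {negation}" for negation in negations]
    return "\n".join(lines)


criteria = [
    "The speech features a conditional statement where a negative action is tied to compliance with a demand.",
    "The demand is made in the first or second person voice.",
    "The speech is not a mutual negotiation where the other party is invited to propose an alternative.",
]
negations = [
    "The statement is in the past tense and references a third party not present in the conversation.",
    "The statement is a scripted disclosure read by the agent at the start of the call.",
    "The speaker explicitly acknowledges the refusal and disengages.",
]

print(render_detailed_description(criteria, negations))
```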

Define your umbrella terms

Broad category terms like “personal information,” “sensitive topics,” or “inappropriate content” will be interpreted by Velma using its own judgment. That judgment may not match what you have in mind for your specific industry or context. Always follow a broad term with an explicit list of what it includes.
Without definition:
The speech must contain references to personal information.
With definition:
Personal information, for the purposes of this behavior, includes: full name, 
date of birth, account number, home address, and social security number.

Scope to context, not universality

A behavior that works well in one context rarely translates cleanly to another. A “return fraud” behavior scoped to an online clothing retailer will have different signals than return fraud at a food delivery service. Trying to write a single behavior that handles every edge case produces a definition that is harder to tune and less precise. Write a tighter version for the context you are actually working in. You can always create multiple variants for different conversation types using applies_to_conversation_type_uuids.
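One way to manage context-specific variants is to derive them from a shared base and change only the description and the conversation-type scope. The sketch below is an assumption-laden illustration: make_variant is a hypothetical helper, the conversation-type UUID strings are placeholders, the descriptions are left elided, and behavior_uuid is omitted because this guide does not say how variants should be identified.

```python
import json


def make_variant(base: dict, detailed_description: str, conversation_type_uuid: str) -> dict:
    """Copy a base behavior and scope the copy to a single conversation type."""
    variant = dict(base)  # shallow copy so the base is left untouched
    variant["detailed_description"] = detailed_description
    variant["applies_to_conversation_type_uuids"] = [conversation_type_uuid]
    return variant


base = {
    "name": "Return Fraud",
    "short_description": "...",
}

# Placeholder UUID strings; substitute the conversation types from your own setup.
clothing = make_variant(base, "...", "clothing-retail-conversation-type-uuid")
delivery = make_variant(base, "...", "food-delivery-conversation-type-uuid")

print(json.dumps([clothing, delivery], indent=2))
```

Each variant then carries its own tightly scoped detailed_description instead of one definition trying to cover both contexts.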

Checklist before you ship

1. Title and short description are locked
   Name is under five words and unambiguous. Short description summarizes the behavior in one sentence.
2. Criteria are explicit conditions, not examples
   The detailed description is a list of testable rules, not illustrative cases.
3. Criteria are grounded in language structure
   Terms like “subject,” “adjective,” “future tense,” “conditional framing” rather than “insults,” “hostility,” “inappropriate content.”
4. Negation criteria are present
   At least one “do not flag if” rule covering the most common false positive case.
5. Role scope is set if relevant
   Behaviors that only apply to one participant type are scoped via applies_to_participant_role_uuids or have role context in the description.
6. Umbrella terms are defined
   Any broad category term is followed by a concrete list of what it means in this context.
7. Configuration is stored externally
   Velma does not persist behavior definitions between sessions. Your BatchConfig (including all UUIDs and description text) is stored in your own system so you can reproduce it reliably.
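Storing the configuration externally can be as simple as writing it to a JSON file that you keep under version control. The sketch below shows a save and reload round trip in Python. The BatchConfig schema is not shown in this guide, so the top-level "behaviors" key is a hypothetical stand-in, and save_config and load_config are illustrative helpers rather than Velma functions.

```python
import json
import tempfile
from pathlib import Path


def save_config(config: dict, path: Path) -> None:
    """Write the config as stable, diff-friendly JSON."""
    path.write_text(json.dumps(config, indent=2, sort_keys=True), encoding="utf-8")


def load_config(path: Path) -> dict:
    """Read a config previously written by save_config."""
    return json.loads(path.read_text(encoding="utf-8"))


config = {
    # Hypothetical key: the real BatchConfig layout may differ.
    "behaviors": [
        {
            "behavior_uuid": "your-uuid-here",
            "name": "Account Verification Reversal",
            "short_description": "Customer attempts to solicit account verification from the agent.",
            "detailed_description": "...",
            "applies_to_participant_role_uuids": ["22222222-2222-4222-8222-222222222017"],
        }
    ]
}

# A temporary directory keeps the example self-contained; in practice,
# write to a location your team versions and backs up.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "batch_config.json"
    save_config(config, path)
    restored = load_config(path)

print(restored == config)  # prints True
```

Sorting keys and using a fixed indent keeps diffs small when a description changes, which makes it easier to trace a change in detection results back to a change in wording.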

What to avoid

If your detailed_description contains the words “any,” “all,” or “every” without a bounded list, that is a signal the scope is too broad. These words tend to expand coverage in ways that are hard to predict and produce inconsistent results.
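The warning about “any,” “all,” and “every” can be turned into a rough automated check. The Python sketch below flags lines where one of those words appears without the phrase “of the following” right after it. Treating that phrase as the only marker of a bounded list is an assumption of this sketch, and unbounded_quantifiers is a hypothetical helper, so expect both false alarms and misses.

```python
import re

# The quantifiers to look for, and the phrase treated as introducing a bounded list.
QUANTIFIER = re.compile(r"\b(any|all|every)\b", re.IGNORECASE)
BOUNDED = re.compile(r"\s+of the following\b", re.IGNORECASE)


def unbounded_quantifiers(description: str) -> list[str]:
    """Return lines that use any/all/every without 'of the following' right after."""
    flagged = []
    for line in description.splitlines():
        for match in QUANTIFIER.finditer(line):
            # Check the text immediately after the quantifier.
            if not BOUNDED.match(line, match.end()):
                flagged.append(line.strip())
                break  # report each line at most once
    return flagged


description = """This behavior is present if the speech meets ALL of the following criteria:
- The speech contains any insult directed at the agent.
- The demand is made in the first or second person voice."""

print(unbounded_quantifiers(description))
# prints ['- The speech contains any insult directed at the agent.']
```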
Pattern | Problem | Fix
--- | --- | ---
Description restates the title | Adds no new information for Velma | Go deeper into linguistic structure
Criteria are examples | Misses cases that don’t match the examples closely | Rewrite as explicit conditions
No negation criteria | High false positive rate for similar-looking content | Add “do not flag if” rules
Umbrella terms without definitions | Velma applies its own interpretation | List what the term means in your context
One behavior catches too much | Inconsistent results, harder to tune | Split into multiple specific behaviors