How Your Voice Agent Thinks

A tour of what goes into every phone call: the three layers of context that always load, the optional tools that may or may not be installed, and the simple rule that decides which file you should edit.

The big picture

Every time someone calls your number, the TTMA Voice Gateway opens a new session and assembles a fresh prompt from a handful of files on your server. Three layers always load. A fourth optional layer plugs in only if you've enabled it.

TTMA Voice Gateway (assembled on every call)

  1. Identity (always loaded): SOUL.md · IDENTITY.md · MEMORY.md
  2. Operational rules (always loaded): dojo-voice-agent-playbook.md
  3. This-call brief (per call only): purposeTemplate · firstMessage
  • Optional tools (off by default on lean setups): kb_search · CRM lookup · openclaw_query · custom HTTP tools

Most operators only ever touch one of these layers at a time. Knowing which layer holds which knob is the entire game.
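
The assembly described above can be pictured with a minimal sketch. This is not the gateway's actual implementation: the function name `assemble_prompt`, the section order, and the blank-line separator are assumptions made for illustration. Only the file names and the playbook's `protocols/` location come from this page.

```python
from pathlib import Path
from typing import Mapping, Sequence

# File names come from this page; everything else here is illustrative.
IDENTITY_FILES = ("SOUL.md", "IDENTITY.md", "MEMORY.md")
PLAYBOOK = Path("protocols") / "dojo-voice-agent-playbook.md"


def assemble_prompt(
    workspace: Path,
    brief: Mapping[str, str],
    tool_definitions: Sequence[str] = (),
) -> str:
    """Build one call's prompt by reading the layer files fresh from disk."""
    sections = []

    # Layer 1: identity files, read again on every call.
    for name in IDENTITY_FILES:
        sections.append((workspace / name).read_text(encoding="utf-8").strip())

    # Layer 2: the operational playbook, also read again on every call.
    sections.append((workspace / PLAYBOOK).read_text(encoding="utf-8").strip())

    # Layer 3: the per-call brief, present only for this session.
    purpose = brief.get("purposeTemplate", "").strip()
    if purpose:
        sections.append(purpose)

    # Optional layer: one definition per enabled tool (none on a lean setup).
    sections.extend(tool_definitions)

    return "\n\n".join(sections)
```

Because nothing is cached between calls in this sketch, an edit to any of the four files shows up the next time `assemble_prompt` runs, which mirrors the "no restart needed" behavior described for Layers 1 and 2.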

Layer 1 - Identity (always loaded, every call)

Who the agent is. Three Markdown files in your OpenClaw workspace, read on every call. If you change them, the next call uses the new version. No restart needed.

| File | Role | What goes in it |
|---|---|---|
| SOUL.md | Voice & personality | Tone, pacing, conversational style. The bot's vibe. |
| IDENTITY.md | Name & role | Who the agent is, what it represents, which number it answers on. |
| MEMORY.md | Long-term memory | Known contacts, learned facts, recurring preferences. The bot's accumulated context. |

These three files are shared with your OpenClaw agent. Edit them once and both the chat agent and the voice agent inherit the changes.

Layer 2 - Operational rules (always loaded, every call)

How the agent handles calls. One Markdown playbook at:

~/.openclaw/workspace/protocols/dojo-voice-agent-playbook.md

This is where you write:

  • Hard rules (reply length, when to escalate, what never to say)
  • Pricing facts and other things the bot must always know
  • Conversation flow (how to open, how to discover the caller's goal, how to close)
  • Examples of good answers - short, contextual, mapped to caller intent

Like Layer 1, the playbook is re-read on every call. Edit it, hang up, call back - the new version is live.

Layer 3 - This-call brief (per-call only)

The first two layers are about the bot itself. This third layer is about this specific call. It's injected only for the duration of one session, then thrown away.

For outbound calls

  • purposeTemplate - "You're calling Sarah at Acme about her overdue invoice"
  • firstMessage - the exact opening line the bot speaks first
  • Any per-call data your app passes in (lead score, account state, last-touch date)
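
As a rough illustration of an outbound brief, the sketch below builds a request body containing the fields listed above. Only `purposeTemplate` and `firstMessage` are named on this page; the helper `build_outbound_brief`, the `to` and `data` keys, the phone number, and the sample values are hypothetical, so check your Voice API reference for the real request shape.

```python
import json
from typing import Any, Dict, Mapping, Optional


def build_outbound_brief(
    to_number: str,
    purpose: str,
    first_message: str,
    call_data: Optional[Mapping[str, Any]] = None,
) -> Dict[str, Any]:
    """Return the per-call (Layer 3) body for one outbound call."""
    if not purpose.strip():
        raise ValueError("purposeTemplate must not be empty")
    if not first_message.strip():
        raise ValueError("firstMessage must not be empty")
    return {
        "to": to_number,                 # hypothetical field name
        "purposeTemplate": purpose,      # field named in this guide
        "firstMessage": first_message,   # field named in this guide
        "data": dict(call_data or {}),   # hypothetical container for per-call data
    }


brief = build_outbound_brief(
    "+15550100",
    "You're calling Sarah at Acme about her overdue invoice",
    "Hi Sarah, I'm calling about your Acme invoice.",
    {"lead_score": 72, "account_state": "overdue", "last_touch": "2024-05-01"},
)
print(json.dumps(brief, indent=2))
```

Keeping the caller-specific details in this body, rather than in MEMORY.md or the playbook, is what keeps them from leaking into other calls.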

For inbound calls

  • Caller phone number (and, if known, their CRM record)
  • Whether they're the owner (private mode) or a stranger (public mode)
  • Which playbook section to enter on
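
For inbound calls, the three items above could be derived along these lines. This is a sketch under assumptions: the function name, the dictionary keys, and the section labels `owner` and `public-greeting` are invented for illustration, and the gateway may determine owner status differently.

```python
from typing import Any, Dict, Mapping, Optional, Set


def build_inbound_brief(
    caller_number: str,
    owner_numbers: Set[str],
    crm_records: Mapping[str, Dict[str, Any]],
) -> Dict[str, Optional[Any]]:
    """Derive the per-call brief for one inbound call."""
    is_owner = caller_number in owner_numbers
    return {
        "callerNumber": caller_number,
        # None when the caller has no CRM record.
        "crmRecord": crm_records.get(caller_number),
        # Owner calls run in private mode, everyone else in public mode.
        "mode": "private" if is_owner else "public",
        # Hypothetical playbook section labels.
        "entrySection": "owner" if is_owner else "public-greeting",
    }
```

A call from a number in `owner_numbers` yields `mode == "private"`; any other number yields `"public"`, with `crmRecord` filled in only when a record exists.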

Layer 3 is the only layer your code touches at runtime. Layers 1 and 2 live on disk and change only when you edit a file; the next call picks up the edit.

Optional tools (only if you enable them; off by default on lean setups)

The three layers above are enough for a fully functional voice agent - answering questions, discovering goals, closing the call. Tools add the ability to do things mid-call: look something up, save a contact, hand off to an external system.

Each tool you enable adds a tool definition to every call's system prompt (a few hundred tokens) and gives the model the option to call it. More tools = more capability, but also more latency and more chances the model picks the wrong action. Turn them on as you need them.
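
To get a feel for the prompt cost, here is a back-of-the-envelope estimate. The figure of 300 tokens per tool is an assumed stand-in for "a few hundred"; real definitions vary with the length of each tool's description and schema.

```python
from typing import Sequence


def tool_prompt_overhead(enabled_tools: Sequence[str], tokens_per_tool: int = 300) -> int:
    """Estimate the extra system-prompt tokens added to every call by enabled tools."""
    return len(enabled_tools) * tokens_per_tool


# Lean setup: no tools, no overhead.
print(tool_prompt_overhead([]))  # 0

# All four tool types enabled, at the assumed 300 tokens each.
print(tool_prompt_overhead(["kb_search", "crm_lookup", "openclaw_query", "custom_http"]))  # 1200
```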

| Tool | What it does | When to enable |
|---|---|---|
| kb_search | Search your knowledge base mid-call | Public-mode bots fielding FAQ-style questions from an indexed KB. |
| CRM lookup | Pull caller record at call start, save new ones | Outbound sales/collections, or inbound to known customers. |
| openclaw_query | Ask your OpenClaw agent anything | Private-mode owner calls - email, calendar, complex tasks. |
| Custom HTTP tools | Define your own JSON-schema tools | Anything else: bookings, checkout, internal APIs. |
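
A custom tool described with JSON Schema might look roughly like the sketch below. The exact format the gateway expects is not shown on this page, so the tool name `book_appointment`, the `name` / `description` / `parameters` layout, and the helper `missing_required` are all assumptions for illustration.

```python
from typing import Any, Dict, List

# Hypothetical tool definition; the "parameters" value is ordinary JSON Schema.
book_appointment_tool: Dict[str, Any] = {
    "name": "book_appointment",
    "description": "Book an appointment for the caller on a given date.",
    "parameters": {
        "type": "object",
        "properties": {
            "date": {"type": "string", "description": "Date in ISO 8601 form"},
            "party_size": {"type": "integer", "minimum": 1},
        },
        "required": ["date"],
    },
}


def missing_required(tool: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """List required parameters the model left out of a tool call."""
    required = tool["parameters"].get("required", [])
    return [key for key in required if key not in arguments]


print(missing_required(book_appointment_tool, {"party_size": 2}))  # ['date']
```

Checking for missing arguments before calling your internal API is one cheap way to catch the "model picks the wrong action" failures mentioned above.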

A clean info-only bot like our public demo runs with zero tools enabled. Every answer comes from Layers 1 and 2. Zero tool latency, zero tool-call errors, and more predictable conversations.

The optimization rule

When you want to change something about how your agent behaves, this table tells you which file to open.

| If it changes… | Edit here | Why |
|---|---|---|
| Every call (tone, persona, role) | SOUL.md / IDENTITY.md | Identity is stable across the deployment |
| Every call (rules, scripts, pricing) | dojo-voice-agent-playbook.md | The bot's operational playbook is one file |
| Recurring facts the bot must always know | MEMORY.md | Long-term memory survives across all calls |
| One particular call | purposeTemplate / firstMessage in your API call | Layer 3 is per-session and doesn't leak into other calls |
| What the bot can do (capabilities) | voice-tools.json on the server | Tools are gateway-side feature toggles, not prompt edits |
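
The same rule can be written as a small lookup, which some teams may find handy in internal tooling or review checklists. The category keys such as `tone` and `single_call` are made-up labels for illustration; the targets are the ones from the table.

```python
# Maps a kind of change (hypothetical labels) to the place the table says to edit.
EDIT_TARGETS = {
    "tone": "SOUL.md / IDENTITY.md",
    "persona": "SOUL.md / IDENTITY.md",
    "role": "SOUL.md / IDENTITY.md",
    "rules": "dojo-voice-agent-playbook.md",
    "scripts": "dojo-voice-agent-playbook.md",
    "pricing": "dojo-voice-agent-playbook.md",
    "recurring_fact": "MEMORY.md",
    "single_call": "purposeTemplate / firstMessage in your API call",
    "capability": "voice-tools.json",
}


def where_to_edit(change_kind: str) -> str:
    """Return the file or field to edit for a given kind of change."""
    try:
        return EDIT_TARGETS[change_kind]
    except KeyError:
        known = ", ".join(sorted(EDIT_TARGETS))
        raise ValueError(f"unknown change kind {change_kind!r}; expected one of: {known}") from None


print(where_to_edit("single_call"))  # purposeTemplate / firstMessage in your API call
```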

Common mistakes

  • Hardcoding a single caller's name in MEMORY.md
    Fix: Pass that data per-call via the Voice API (Layer 3). MEMORY.md is for facts that should be true on every call, not just the next one.
  • Putting per-call instructions in the playbook
    Fix: Use the purposeTemplate field in the API call. The playbook should describe "how the agent works in general," never "what to do on this specific call."
  • Enabling every tool by default
    Fix: Start with all tools off. Turn each one on only when you have a concrete use case. Fewer tools = lower latency, fewer surprises.
  • Editing config.json when you really wanted a prompt change
    Fix: config.json is for audio knobs, timeouts, and feature flags. Words the bot says live in the playbook and the three Layer-1 files.

Quick setup checklist

New deployment? Walk through these in order. Most operators finish in 15 minutes.

  1. Write IDENTITY.md
     Name, role, what the agent represents.
  2. Write SOUL.md
     Tone, pacing, conversational style.
  3. Seed MEMORY.md with canonical facts
     Pricing, hours, the things the bot must always know.
  4. Customize the playbook
     Hard rules, opening flow, discovery questions, end-of-call behavior.
  5. Decide which tools you need
     Default to off. Turn each on only with a clear reason.
  6. Place a test call from your owner phone
     Confirm the bot says the right thing first and follows your rules.
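
Before placing the test call, a quick pre-flight check along these lines can catch a missing or empty file. It is a sketch, not a tool shipped with the gateway: the function name `preflight` is made up, and it assumes the three identity files sit at the root of `~/.openclaw/workspace`, next to the `protocols/` folder that holds the playbook.

```python
from pathlib import Path
from typing import List, Union

REQUIRED_FILES = (
    "IDENTITY.md",
    "SOUL.md",
    "MEMORY.md",
    "protocols/dojo-voice-agent-playbook.md",
)


def preflight(workspace: Union[str, Path]) -> List[str]:
    """Return a list of problems; an empty list means all four files are in place."""
    problems = []
    for rel in REQUIRED_FILES:
        path = Path(workspace) / rel
        if not path.is_file():
            problems.append(f"missing: {rel}")
        elif not path.read_text(encoding="utf-8").strip():
            problems.append(f"empty: {rel}")
    return problems


if __name__ == "__main__":
    # Assumed workspace location, based on the playbook path given earlier.
    issues = preflight(Path.home() / ".openclaw" / "workspace")
    print("\n".join(issues) if issues else "All layer files present.")
```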

TL;DR

  • Always loaded: SOUL, IDENTITY, MEMORY, and the playbook.
  • Per call: purposeTemplate + firstMessage.
  • Optional: tools, only if you enabled them.
  • Edit where the change belongs. Identity in SOUL/IDENTITY, rules in the playbook, this-call data in the API.
  • Less is more. Fewer tools, shorter prompts, faster calls.

Questions? Reach out at hello@talktomyagent.io.