How Your Voice Agent Thinks

A tour of what goes into every phone call: the three layers of context that always load, the optional tools that may or may not be installed, and the simple rule that decides which file you should edit.

The big picture

Every time someone calls your number, the TTMA Voice Gateway opens a new session and assembles a fresh prompt from a handful of files on your server. Three layers always load. A fourth optional layer plugs in only if you've enabled it.

TTMA Voice Gateway (prompt assembled on every call)

Layer	When it loads	Contents
Identity	Always loaded	`SOUL.md` · `IDENTITY.md` · `MEMORY.md`
Operational rules	Always loaded	`dojo-voice-agent-playbook.md`
This-call brief	Per call only	`purposeTemplate` · `firstMessage`
Optional tools	Off by default on lean setups	`kb_search` · CRM lookup · `openclaw_query` · custom HTTP tools

Most operators only ever touch one of these layers at a time. Knowing which layer holds which knob is the entire game.
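To make the layering concrete, here is a minimal sketch of how a prompt could be assembled from these files. It is an illustration, not the gateway's actual code: the function names are made up, the workspace root and the location of the three identity files are assumptions (the docs only state the playbook path), and the real gateway may order or format the sections differently.

```python
from pathlib import Path
from typing import Optional

# Assumed layout: only the playbook path is stated in the docs.
WORKSPACE = Path.home() / ".openclaw" / "workspace"
IDENTITY_FILES = ["SOUL.md", "IDENTITY.md", "MEMORY.md"]
PLAYBOOK = Path("protocols") / "dojo-voice-agent-playbook.md"


def read_if_present(path: Path) -> str:
    """Return the file's text, or an empty string if the file is missing."""
    return path.read_text(encoding="utf-8") if path.is_file() else ""


def assemble_prompt(workspace: Path, call_brief: str,
                    tool_definitions: Optional[list] = None) -> str:
    """Join the layers in order, reading every file fresh on each call."""
    sections = [read_if_present(workspace / name) for name in IDENTITY_FILES]  # Layer 1
    sections.append(read_if_present(workspace / PLAYBOOK))                     # Layer 2
    sections.append(call_brief)                                                # Layer 3
    sections.extend(tool_definitions or [])                                    # optional tools
    # Skip empty sections so a missing file does not leave blank gaps.
    return "\n\n".join(s.strip() for s in sections if s.strip())
```

Because nothing is cached between calls, an edit to any file shows up the next time `assemble_prompt` runs, which mirrors the "no restart needed" behavior described for Layers 1 and 2.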

Layer 1 - Identity (always loaded, every call)

Who the agent is. Three Markdown files in your OpenClaw workspace, read on every call. If you change them, the next call uses the new version. No restart needed.

File	Role	What goes in it
`SOUL.md`	Voice & personality	Tone, pacing, conversational style. The bot's vibe.
`IDENTITY.md`	Name & role	Who the agent is, what it represents, which number it answers on.
`MEMORY.md`	Long-term memory	Known contacts, learned facts, recurring preferences. The bot's accumulated context.

These three files are shared with your OpenClaw agent. Edit them once and both the chat agent and the voice agent inherit the changes.

Layer 2 - Operational rules (always loaded, every call)

How the agent handles calls. One Markdown playbook at:

~/.openclaw/workspace/protocols/dojo-voice-agent-playbook.md

This is where you write:

Hard rules (reply length, when to escalate, what never to say)
Pricing facts and other things the bot must always know
Conversation flow (how to open, how to discover the caller's goal, how to close)
Examples of good answers - short, contextual, mapped to caller intent

Like Layer 1, the playbook is re-read on every call. Edit it, hang up, call back - the new version is live.

Layer 3 - This-call brief (per-call only)

The first two layers are about the bot itself. This third layer is about this specific call. It's injected only for the duration of one session, then thrown away.

For outbound calls

purposeTemplate - "You're calling Sarah at Acme about her overdue invoice"
firstMessage - the exact opening line the bot speaks first
Any per-call data your app passes in (lead score, account state, last-touch date)
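As a rough sketch, an outbound brief might be packaged like this before it is sent to the Voice API. Only `purposeTemplate` and `firstMessage` come from this guide; the `to` and `data` field names, the helper function, and the sample values are hypothetical, so check the API reference for the real request shape.

```python
import json
from typing import Optional


def build_outbound_call(to_number: str, purpose: str, first_message: str,
                        call_data: Optional[dict] = None) -> str:
    """Serialize one call's Layer 3 brief as a JSON request body."""
    payload = {
        "to": to_number,                # hypothetical field name
        "purposeTemplate": purpose,     # field named in this guide
        "firstMessage": first_message,  # field named in this guide
        "data": call_data or {},        # hypothetical field name
    }
    return json.dumps(payload, indent=2)


body = build_outbound_call(
    "+15550100",
    "You're calling Sarah at Acme about her overdue invoice",
    "Hi Sarah, do you have a minute to talk about your invoice?",
    {"lead_score": 72, "account_state": "overdue", "last_touch": "2024-05-01"},
)
print(body)
```

Everything specific to Sarah lives in this payload and nowhere on disk, so the next call starts clean.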

For inbound calls

Caller phone number (and, if known, their CRM record)
Whether they're the owner (private mode) or a stranger (public mode)
Which playbook section to enter on
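The inbound case can be pictured as a small lookup done when the call connects. The sketch below is an assumption about how such a brief could be built, not the gateway's implementation: the function names, the `owner_numbers` set, and the in-memory `crm` dictionary are all illustrative, and choosing the playbook entry section is left out.

```python
def normalize(number: str) -> str:
    """Keep digits only so differently formatted numbers compare equal."""
    return "".join(ch for ch in number if ch.isdigit())


def inbound_brief(caller: str, owner_numbers: set, crm: dict) -> dict:
    """Build the per-call brief for an inbound call.

    `crm` maps digit-only phone numbers to contact records (hypothetical).
    """
    digits = normalize(caller)
    is_owner = digits in {normalize(n) for n in owner_numbers}
    return {
        "caller": caller,
        "mode": "private" if is_owner else "public",
        "crm_record": crm.get(digits),  # None when the caller is unknown
    }


brief = inbound_brief(
    "+1 (555) 010-0000",
    owner_numbers={"+15550100000"},
    crm={"15550100000": {"name": "Owner"}},
)
print(brief["mode"])
```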

Layer 3 is the only layer your code touches at runtime. Everything else lives on disk and changes only when you edit a file.

Optional tools (only if you enable them; off by default on lean setups)

The three layers above are enough for a fully functional voice agent - answering questions, discovering goals, closing the call. Tools add the ability to do things mid-call: look something up, save a contact, hand off to an external system.

Each tool you enable adds a tool definition to every call's system prompt (a few hundred tokens) and gives the model the option to call it. More tools = more capability, but also more latency and more chances the model picks the wrong action. Turn them on as you need them.

kb_search

Search your knowledge base mid-call

When to enable: Public-mode bots fielding FAQ-style questions from an indexed KB.

CRM lookup

Pull caller record at call start, save new ones

When to enable: Outbound sales/collections, or inbound to known customers.

openclaw_query

Ask your OpenClaw agent anything

When to enable: Private-mode owner calls - email, calendar, complex tasks.

Custom HTTP tools

Define your own JSON-schema tools

When to enable: Anything else: bookings, checkout, internal APIs.
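For a sense of what a custom tool adds to the prompt, here is a hypothetical definition in the JSON-schema style that function-calling models commonly use. The tool name, its fields, and both helpers are illustrative, and the exact format the gateway expects may differ. The token figure uses a rough rule of thumb of about four characters per token, so treat it as an order-of-magnitude estimate only.

```python
import json

# Hypothetical custom tool: the name and fields are illustrative only.
BOOK_APPOINTMENT_TOOL = {
    "name": "book_appointment",
    "description": "Book an appointment slot for the caller.",
    "parameters": {
        "type": "object",
        "properties": {
            "caller_name": {"type": "string"},
            "start_time": {"type": "string", "description": "ISO 8601 start time"},
        },
        "required": ["caller_name", "start_time"],
    },
}


def estimate_tokens(tool: dict, chars_per_token: float = 4.0) -> int:
    """Rough prompt overhead of one tool definition (heuristic, not exact)."""
    return round(len(json.dumps(tool)) / chars_per_token)


def missing_arguments(tool: dict, arguments: dict) -> list:
    """List required parameters the model left out of a tool call."""
    required = tool["parameters"].get("required", [])
    return [name for name in required if name not in arguments]


print(estimate_tokens(BOOK_APPOINTMENT_TOOL))
print(missing_arguments(BOOK_APPOINTMENT_TOOL, {"caller_name": "Sam"}))
```

Summing `estimate_tokens` over every enabled tool gives a quick feel for how much each call's prompt grows, and `missing_arguments` shows one of the tool-call errors that a zero-tool setup never has to handle.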

A clean info-only bot like our public demo runs with zero tools enabled. Every answer comes from Layers 1 and 2. No tool latency, no tool-call errors, and more predictable conversations.

The optimization rule

When you want to change something about how your agent behaves, this table tells you which file to open.

If it changes…	Edit here	Why
Every call (tone, persona, role)	`SOUL.md` / `IDENTITY.md`	Identity is stable across the deployment
Every call (rules, scripts, pricing)	`dojo-voice-agent-playbook.md`	The bot's operational playbook is one file
Recurring facts the bot must always know	`MEMORY.md`	Long-term memory survives across all calls
One particular call	`purposeTemplate` / `firstMessage` in your API call	Layer 3 is per-session and doesn't leak into other calls
What the bot can do (capabilities)	`voice-tools.json` on the server	Tools are gateway-side feature toggles, not prompt edits
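The same table can be kept next to your deployment scripts as a small lookup, for example to route change requests to the right file. This is only a sketch: the category keys and the function are invented for illustration, while the targets are copied from the table.

```python
# Keys are hypothetical labels; values come from the table above.
EDIT_TARGETS = {
    "persona": "SOUL.md / IDENTITY.md",
    "rules": "dojo-voice-agent-playbook.md",
    "recurring_fact": "MEMORY.md",
    "single_call": "purposeTemplate / firstMessage in your API call",
    "capability": "voice-tools.json on the server",
}


def where_to_edit(change_kind: str) -> str:
    """Return the place to edit for a given kind of change."""
    if change_kind not in EDIT_TARGETS:
        known = ", ".join(sorted(EDIT_TARGETS))
        raise ValueError(f"unknown change kind {change_kind!r}; expected one of: {known}")
    return EDIT_TARGETS[change_kind]


print(where_to_edit("single_call"))
```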

Common mistakes

Hardcoding a single caller's name in MEMORY.md

Fix: Pass that data per-call via the Voice API (Layer 3). MEMORY.md is for facts that should be true on every call, not just the next one.

Putting per-call instructions in the playbook

Fix: Use the purposeTemplate field in the API call. The playbook should describe "how the agent works in general," never "what to do on this specific call."

Enabling every tool by default

Fix: Start with all tools off. Turn each one on only when you have a concrete use case. Fewer tools = lower latency, fewer surprises.

Editing config.json when you really wanted a prompt change

Fix: config.json is for audio knobs, timeouts, and feature flags. Words the bot says live in the playbook and the three Layer-1 files.

Quick setup checklist

New deployment? Walk through these in order. Most operators finish in 15 minutes.

1. Write IDENTITY.md
   Name, role, what the agent represents.
2. Write SOUL.md
   Tone, pacing, conversational style.
3. Seed MEMORY.md with canonical facts
   Pricing, hours, the things the bot must always know.
4. Customize the playbook
   Hard rules, opening flow, discovery questions, end-of-call behavior.
5. Decide which tools you need
   Default to off. Turn each on only with a clear reason.
6. Place a test call from your owner phone
   Confirm the bot says the right thing first and follows your rules.
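Before placing the test call, a short preflight script can confirm that steps 1 to 4 left the expected files in place. The sketch below assumes the three identity files sit at the root of `~/.openclaw/workspace` (the docs only spell out the playbook path) and the function name is illustrative. It checks that each file exists and is not empty, and says nothing about the quality of the content.

```python
from pathlib import Path

# Assumed layout: identity files at the workspace root, playbook under protocols/.
REQUIRED = [
    "SOUL.md",
    "IDENTITY.md",
    "MEMORY.md",
    "protocols/dojo-voice-agent-playbook.md",
]


def missing_files(workspace: Path) -> list:
    """Return the required files that are absent or contain only whitespace."""
    missing = []
    for rel in REQUIRED:
        path = workspace / rel
        if not path.is_file() or not path.read_text(encoding="utf-8").strip():
            missing.append(rel)
    return missing


if __name__ == "__main__":
    problems = missing_files(Path.home() / ".openclaw" / "workspace")
    if problems:
        print("missing or empty:", ", ".join(problems))
    else:
        print("all four files present; ready for a test call")
```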

TL;DR

Always loaded: SOUL, IDENTITY, MEMORY, and the playbook.
Per call: purposeTemplate + firstMessage.
Optional: tools, only if you enabled them.
Edit where the change belongs. Identity in SOUL/IDENTITY, rules in the playbook, this-call data in the API.
Less is more. Fewer tools, shorter prompts, faster calls.

Keep reading

Setup & Settings - every knob in config.json, where it lives, and how to change it.
Playbook Customization - section-by-section walkthrough of the operational rules file.
Tools & Skills - when to enable each tool and what it costs you in latency.

Questions? Reach out at hello@talktomyagent.io.