Lightning-Fast Voice Tools

The three ways your voice agent can reach a capability - Query, Tool, and Dispatch - and how to add a new one (or wrap an existing one) so a phone call gets the answer in under two seconds instead of waiting 20 to 90 seconds.

The one idea everything rests on

During a call, the voice brain (Gemini) can only do one thing: emit a tool call. It never runs your script or calls OpenClaw on its own - it calls a function you gave it, and the gateway decides what happens behind that function.

So the question is never whether something should be a tool call. It always is. The only question is: what does that tool call do behind the scenes?

You have three completely different backends to choose from. Picking the right one is most of getting a fast, reliable call.

Query vs Tool vs Dispatch

These are three tool calls the voice model can make. They look similar to the caller, but underneath they are different machines.

1. OpenClaw Query - openclaw_query

What it is: the gateway hands your request, in plain English, to the full OpenClaw agent. The agent thinks - it reads your skills, decides which to use, runs scripts, and writes back a sentence.

Speed: slow. A simple answer is a few seconds; anything that runs a tool is 17 to 90 seconds (the model reasons before and after every step).

Use it for: open-ended “figure it out” requests where a pause is acceptable - not a known lookup mid-call.

Behind the scenes
caller -> openclaw_query("do I have new email?") -> OpenClaw agent turn ->
reads skill -> runs script -> reasons -> "You have 3 new..."
                                          [flexible, but seconds-to-minutes]

2. Tool use - a registered tool via /tools/invoke

What it is: the gateway calls one specific, pre-registered OpenClaw tool directly. No model turn. The tool runs, returns structured data, done.

Speed: lightning. A local lookup is ~50-200ms; a tool that hits the network (email over IMAP) is ~0.4-1.3s. No agent reasoning in the loop.

Why it is reliable: the tool returns real structured data (sender, subject, date), so the model is handed the facts and cannot fabricate a result. This is the path this guide is about.

Use it for: known, repeatable reads you want fast - “what was my last email,” “what is on my calendar,” “look up this customer.”

Behind the scenes
caller -> search_email tool -> /tools/invoke {tool:"email_search"} ->
script -> JSON -> spoken
                  [no model, sub-second, can't lie]

3. Dispatch - openclaw_dispatch

What it is: fire-and-forget. The gateway hands a job to the agent to run in the background, gets an instant acknowledgement, and the result is delivered later to a channel (e.g. Telegram), not spoken on the call.

Speed: the call does not wait at all - the ack is immediate; the work finishes after you hang up.

Use it for: slow or heavy work that should not hold the line - “research X and send me a summary,” “draft and email me that report.” Anything 15s+ that the caller does not need read back live.

Behind the scenes
caller -> openclaw_dispatch("research X, email me") -> instant "On it" ->
[agent works in background] -> result arrives in Telegram later

The one-line rule of thumb

You want                            | Use                | Why
An open-ended answer, pause is OK   | openclaw_query     | flexible; the agent figures it out
A known lookup, fast, on the call   | /tools/invoke      | no model turn, deterministic
A long job, answer can come later   | openclaw_dispatch  | does not block the call

Choosing the transport for a fast tool

Once a capability deserves to be a fast Tool, three questions pick exactly how to wire it:

  1. Does it run during a live call? Yes - then it must be a fast transport, never openclaw_query.
  2. Read or write? Read - safe to make fast. Write or send - never on the raw fast path (see the safety rule below).
  3. Where do the credentials live? On the host (mailbox password in ~/.openclaw/.env, a local CRM script) - use /tools/invoke. In the cloud (an OAuth token, a SaaS key) - use a voicebridge (signed HTTPS to a small cloud service), like Wix or Stripe.

Email is the textbook host-local case: live + read + the IMAP password is on the host. So: a /tools/invoke tool. The rest of this guide builds exactly that.
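The three questions can also be written down as a small decision function. This is only a sketch of the rule in this guide, not gateway code: the function name and the returned labels are made up for illustration, and the branch for work outside a live call is an assumption, since the guide only states the rule for live calls.

```python
def pick_transport(on_live_call: bool, is_read: bool, creds_on_host: bool) -> str:
    """Apply the three questions in order and name the wiring to use."""
    if not on_live_call:
        # Assumption: nothing is holding the line, so the agent paths are acceptable.
        return "openclaw_query or openclaw_dispatch"
    if not is_read:
        # Writes and sends never go on the raw fast path.
        return "purpose-built guarded tool or approval-gated agent path"
    # Live call + read: where the credentials live picks the transport.
    return "/tools/invoke" if creds_on_host else "voicebridge"


# Email: live call, read-only, IMAP password on the host.
print(pick_transport(on_live_call=True, is_read=True, creds_on_host=True))
# -> /tools/invoke
```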

Add a fast tool in four steps

One-time rail (set up once): a small gateway transport (auth.type: "openclaw-tool") and a host “tool-bridge” plugin must exist first. They are built once; after that, every new tool is just Steps 1, 2 and 4 below - a script, a manifest line, and a config entry.

Step 1 - Have a script that prints JSON (or reuse one)

The tool runs an ordinary script. It must print its result as JSON to stdout, be read-only, and read its own credentials (the existing skill scripts already load ~/.openclaw/.env). You almost never write this from scratch - you wrap an existing one. Email already ships moltbot_email.py, which prints clean JSON:

Terminal
python3 ~/.openclaw/workspace/tools/moltbot_email.py search --query=proposal --since=7 --limit=10
# -> {"count": 2, "messages": [{"from": "...", "subject": "...", "date": "...", "snippet": "..."}]}
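If you do need a new script, the shape is small. The sketch below is a hypothetical notes_search.py, not one of the shipped skill scripts: the file name, the in-memory NOTES list, and the output field names are all invented for illustration. It shows the three properties that matter here: a subcommand with --flag options, a read-only lookup, and nothing but JSON on stdout.

```python
import argparse
import json
import sys

# Hypothetical in-memory data standing in for a real read-only source.
NOTES = [
    {"id": 1, "title": "Proposal draft", "age_days": 2},
    {"id": 2, "title": "Lunch order", "age_days": 1},
    {"id": 3, "title": "Old proposal", "age_days": 30},
]


def search(query: str, since: int, limit: int) -> dict:
    """Read-only lookup; returns a JSON-serialisable dict."""
    hits = [
        n for n in NOTES
        if query.lower() in n["title"].lower() and n["age_days"] <= since
    ]
    hits = hits[:limit]
    return {"count": len(hits), "items": hits}


def main(argv: list) -> int:
    parser = argparse.ArgumentParser(prog="notes_search.py")
    sub = parser.add_subparsers(dest="command", required=True)
    s = sub.add_parser("search")
    s.add_argument("--query", default="")
    s.add_argument("--since", type=int, default=7)
    s.add_argument("--limit", type=int, default=10)
    args = parser.parse_args(argv)
    # Only the JSON result goes to stdout; any logging belongs on stderr.
    json.dump(search(args.query, args.since, args.limit), sys.stdout)
    sys.stdout.write("\n")
    return 0


if __name__ == "__main__":
    # A real script would pass sys.argv[1:]; fixed arguments keep the sketch self-running.
    main(["search", "--query=proposal", "--since=7", "--limit=10"])
```

Keeping the lookup in its own function (search) makes it easy to test without spawning a process.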

Step 2 - Register the script as a tool (the bridge manifest)

The tool-bridge turns a manifest entry into a real, registered OpenClaw tool. Add one entry - no new code:

voice-tool-bridge.json
// host: in the agent workspace data dir
{
  "tool": "email_search",
  "description": "Search the owner's mailbox. Returns sender, subject, date, snippet.",
  "interpreter": "python3",
  "script": "tools/moltbot_email.py",
  "subcommand": "search",
  "readOnly": true,                 // hard gate - non-read entries are refused
  "argMap": { "query": "--query", "since_days": "--since", "limit": "--limit" },
  "parameters": { "type": "object", "properties": {
    "query": { "type": "string" }, "since_days": { "type": "number" }, "limit": { "type": "number" }
  }, "required": [] },
  "timeoutMs": 4000
}

The OpenClaw gateway must then restart so the plugin re-registers the tool. The fleet's plugin tooling does this for you (the gateway's service name is deployment-specific, so there is no single command to hardcode). Under the hood the bridge spawns your script safely: it passes an argument array with no shell, and each value as --flag=value, so a value like --id cannot be misread as an option. You do not write this per tool.
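To make "argument array, no shell, --flag=value" concrete, here is a minimal Python sketch of that spawning step. It is not the bridge's actual code: build_argv and run_tool are hypothetical names, ENTRY is a trimmed copy of the manifest entry above, and how the real bridge resolves the script path or treats unknown arguments is not described in this guide, so those parts are assumptions.

```python
import subprocess

# Trimmed copy of the manifest entry shown above.
ENTRY = {
    "interpreter": "python3",
    "script": "tools/moltbot_email.py",
    "subcommand": "search",
    "argMap": {"query": "--query", "since_days": "--since", "limit": "--limit"},
    "timeoutMs": 4000,
}


def build_argv(entry: dict, args: dict) -> list:
    """Turn typed tool arguments into an argument array, one --flag=value each."""
    argv = [entry["interpreter"], entry["script"], entry["subcommand"]]
    for name, value in args.items():
        flag = entry["argMap"].get(name)
        if flag is None:
            # Assumption: arguments missing from argMap are rejected.
            raise ValueError(f"unknown argument: {name}")
        # The equals form binds the value to its flag, even if it starts with a dash.
        argv.append(f"{flag}={value}")
    return argv


def run_tool(entry: dict, args: dict) -> str:
    """Spawn the script without a shell and return its stdout."""
    done = subprocess.run(
        build_argv(entry, args),
        shell=False,                      # argument array, never a shell string
        capture_output=True,
        text=True,
        timeout=entry["timeoutMs"] / 1000,
        check=True,
    )
    return done.stdout


print(build_argv(ENTRY, {"query": "--id", "since_days": 7}))
# -> ['python3', 'tools/moltbot_email.py', 'search', '--query=--id', '--since=7']
```

Because no shell ever sees the values, a query containing quotes, semicolons, or backticks stays a plain string inside one argument.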

Step 3 - Verify it directly (no voice needed)

/tools/invoke is OpenClaw's always-on endpoint for running one tool with no model. Prove the tool works before touching the phone:

Terminal
TOK=$(jq -r '.gateway.auth.token' ~/.openclaw/openclaw.json)
curl -sS http://127.0.0.1:18789/tools/invoke \
  -H "Authorization: Bearer $TOK" -H 'Content-Type: application/json' \
  -d '{"tool":"email_search","args":{"query":"proposal","since_days":7}}'
# -> {"ok":true,"result":{"content":[{"type":"text","text":"{\"count\":2,...}"}]}}
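The same check can be scripted. The Python sketch below mirrors the curl call and then unwraps the reply: as the sample output shows, the script's JSON arrives as a string inside result.content[0].text, so it needs a second decode. invoke_tool and unwrap are hypothetical helper names, and the handling of a failed call (an envelope without "ok": true) is an assumption, since the guide only shows the success shape.

```python
import json
import urllib.request


def invoke_tool(token: str, tool: str, args: dict,
                base_url: str = "http://127.0.0.1:18789") -> dict:
    """POST one tool call to /tools/invoke and return the decoded envelope."""
    req = urllib.request.Request(
        f"{base_url}/tools/invoke",
        data=json.dumps({"tool": tool, "args": args}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return json.load(resp)


def unwrap(envelope: dict) -> dict:
    """Pull the script's JSON out of the text item in the envelope."""
    if not envelope.get("ok"):
        # Assumption: anything without "ok": true is treated as a failure.
        raise RuntimeError(f"tool call failed: {envelope}")
    text = envelope["result"]["content"][0]["text"]
    return json.loads(text)


# Offline check with an envelope shaped like the curl output above.
sample = {"ok": True, "result": {"content": [{"type": "text", "text": "{\"count\": 2}"}]}}
print(unwrap(sample)["count"])
# -> 2
```

Against a live gateway the two compose as unwrap(invoke_tool(token, "email_search", {"query": "proposal", "since_days": 7})).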

Step 4 - Let the voice agent use it (config, no release)

Add a custom-tool entry that routes the voice model's tool call to /tools/invoke. This is delivered by config pull - no gateway rebuild:

voice-tools.json
// add to customTools[]
{
  "name": "search_email",                 // the name the voice model sees (one prompt line)
  "description": "Search the owner's email by sender/keyword/date.",
  "parameters": { "type": "object", "properties": {
    "query": { "type": "string", "description": "keyword or sender" },
    "since_days": { "type": "number", "description": "only emails newer than N days" }
  }, "required": [] },
  "auth": { "type": "openclaw-tool", "toolName": "email_search" },   // the fast host-local transport
  "timeoutMs": 4000
}

Now call the bot and ask “what was my last email?” - it answers from real data, in about a second, with no fabrication. Adding the next fast tool is the same three edits: a JSON-printing script, a manifest line, a voice-tools.json entry.
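Because the voice-tools.json entry and the manifest entry are edited separately, a quick consistency check can catch a mismatched name or parameter before you pick up the phone. The sketch below is a hypothetical helper, not part of OpenClaw; check_wiring and its rules are assumptions drawn from the two example entries above.

```python
def check_wiring(voice_tool: dict, manifest_entry: dict) -> list:
    """Return a list of human-readable problems; empty means the pair lines up."""
    problems = []
    auth = voice_tool.get("auth", {})
    if auth.get("type") != "openclaw-tool":
        problems.append("auth.type is not 'openclaw-tool'")
    if auth.get("toolName") != manifest_entry.get("tool"):
        problems.append("auth.toolName does not match the manifest's tool name")
    if manifest_entry.get("readOnly") is not True:
        problems.append("manifest entry is not marked readOnly")
    arg_map = manifest_entry.get("argMap", {})
    for param in voice_tool.get("parameters", {}).get("properties", {}):
        if param not in arg_map:
            problems.append(f"parameter '{param}' has no flag in argMap")
    return problems


# Trimmed copies of the two entries from Steps 2 and 4.
voice_tool = {
    "name": "search_email",
    "parameters": {"type": "object", "properties": {
        "query": {"type": "string"}, "since_days": {"type": "number"}}},
    "auth": {"type": "openclaw-tool", "toolName": "email_search"},
}
manifest_entry = {
    "tool": "email_search",
    "readOnly": True,
    "argMap": {"query": "--query", "since_days": "--since", "limit": "--limit"},
}

print(check_wiring(voice_tool, manifest_entry))
# -> []
```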

Wrap, don't rewrite

You rarely build a new capability - you expose one you already have. The bridge entry is a thin adapter over the existing script:

  • The script (moltbot_email.py, crm_search.py, a gog call) stays exactly as it is.
  • The manifest maps the voice tool's typed parameters to the script's existing flags.
  • Nothing about the agent's own use of that skill changes.

This keeps one implementation of “read email” that both the agent and the voice fast-path share, instead of two copies drifting apart.

Reads are fast; writes are special

This is the one rule you must not get wrong. /tools/invoke runs as a full operator and skips every approval prompt. That is fine for a read. It is not fine for a write or send - putting an email-send tool on the raw fast path would let a phone caller send mail with owner privileges and no approval gate at all.

  • Reads (search email, check calendar, look up a contact, order status): fast Tool via /tools/invoke. Safe.
  • Writes / sends (send email, publish a post, book on the owner's behalf): never raw /tools/invoke. Use a purpose-built tool that enforces safety server-side - the send_whatsapp_text pattern, where the recipient is fixed to the caller's own number (injected server-side from the call context, never taken from the model) and a 24-hour messaging-window guard is enforced server-side - or the existing approval-gated agent path.

The bridge refuses to register any non-read-only entry, so a write cannot accidentally land on the fast path. The dangerous primitives (exec, shell, spawn, file writes) are blocked over /tools/invoke by default - leave them blocked.
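The server-side guard in the send_whatsapp_text pattern can be sketched as follows. This illustrates the idea only and is not the real tool: build_send_request, the call_context keys, and measuring the 24-hour window from the caller's last inbound message are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

MESSAGING_WINDOW = timedelta(hours=24)


def build_send_request(model_args: dict, call_context: dict, now: datetime) -> dict:
    """Build a send request whose recipient comes only from the call context."""
    # Assumption: the window is measured from the caller's last inbound message.
    if now - call_context["last_inbound_at"] > MESSAGING_WINDOW:
        raise PermissionError("outside the 24-hour messaging window")
    return {
        # Any recipient the model supplied is ignored on purpose.
        "to": call_context["caller_number"],
        "text": str(model_args.get("text", "")),
    }


now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
ctx = {"caller_number": "+15550100", "last_inbound_at": now - timedelta(hours=3)}

# The model tries to name a different recipient; the server-side value wins.
req = build_send_request({"to": "+15550199", "text": "hi"}, ctx, now)
print(req["to"])
# -> +15550100
```

The point of the design is that the model can influence only the message text, never who receives it or whether the window applies.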

Cheat sheet

Pick the backend

Open-ended, pause OK ........ openclaw_query     (seconds-minutes, flexible)
Known read, fast on-call .... /tools/invoke tool (sub-second, deterministic)  <- build these
Long job, reply can wait .... openclaw_dispatch  (instant ack, result later)

Pick the transport for a fast tool

creds on the host  -> /tools/invoke + a bridge tool   (email, CRM, gog-calendar)
creds in the cloud -> voicebridge (signed HTTPS)        (Wix, Stripe, OAuth calendar)

Add a host-local fast tool

1. a script that prints JSON (reuse an existing one)   read-only
2. one manifest line in voice-tool-bridge.json  +  gateway restart
3. curl /tools/invoke to verify
4. one entry in voice-tools.json  (auth.type: "openclaw-tool")   no release

Never put a send or write on raw /tools/invoke. Never allow exec over HTTP.

FAQ & gotchas

How fast is it, really?

A pure local read is ~50-200ms. A network read (email over IMAP) is ~0.4-1.3s, because the IMAP login dominates - the transport adds almost nothing. Against the 17 to 90 seconds of an openclaw_query that runs a tool, the network read is roughly 15-70x faster and the local read faster still. For repeat asks within a call, add a short-TTL cache so the second ask is instant.
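A short-TTL cache of this kind fits in a few lines. The sketch below is one possible shape, not something the gateway or bridge is known to ship: the TTLCache class, the 30-second TTL, and the key format are assumptions, and where the cache should live (script, bridge, or gateway) is left open.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Remember each result for ttl_seconds, then fetch it again."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]              # still fresh: skip the slow lookup
        value = fetch()
        self._store[key] = (now, value)
        return value


# Demo with a fake clock so the expiry is visible without sleeping.
fake_now = [0.0]
calls = []


def slow_lookup() -> dict:
    calls.append(1)                    # count how often the real lookup runs
    return {"count": 2}


cache = TTLCache(ttl_seconds=30, clock=lambda: fake_now[0])
cache.get_or_fetch("email_search:proposal", slow_lookup)   # miss, runs the lookup
cache.get_or_fetch("email_search:proposal", slow_lookup)   # hit, served from memory
fake_now[0] = 31.0
cache.get_or_fetch("email_search:proposal", slow_lookup)   # expired, runs again
print(len(calls))
# -> 2
```

Keep the TTL short: the caller expects "my last email" to reflect mail that arrived a minute ago.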

Why not just preload the data into the prompt?

That is what the CRM does, and it is great for small, known-at-call-start data. It does not scale to many capabilities - every preloaded tool bloats the prompt. The fast Tool path keeps the prompt lean (one line per tool) and fetches on demand.

My tool returns not_found from /tools/invoke

Two causes: (a) it is not registered - check the manifest and that you restarted the gateway; (b) a restrictive tool policy is filtering it. The fleet runs OpenClaw's default permissive policy, so registered tools are reachable. If a bot ever sets a restrictive allowlist, add the tool to that agent's allow list - a known fix, not a bug.

Why pass args as --flag=value and not --flag value?

Some scripts use argparse; a value that starts with a dash (e.g. a query of --id) gets mis-parsed as an option in the space form. The equals form binds it as a value. The bridge does this for you.
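The difference is easy to reproduce with argparse itself. In this small demo the parser and its two options are made up for the example; the behaviour shown is standard argparse.

```python
import argparse


def parse_query(argv: list):
    """Return the parsed --query value, or None if argparse rejects the input."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--query")
    parser.add_argument("--id")
    try:
        return parser.parse_args(argv).query
    except SystemExit:
        # argparse prints a usage error to stderr and exits when parsing fails.
        return None


print(parse_query(["--query=--id"]))      # equals form -> --id
print(parse_query(["--query", "--id"]))   # space form  -> None (usage error)
```

In the space form argparse sees --id as the next option, so --query is left without a value and parsing fails.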

Does this work on every bot?

Yes for OpenClaw bots - the same rail and manifest format work everywhere; per-bot differences (Gmail vs Titan IMAP) are handled by the script reading that host's ~/.openclaw/.env. Hermes-based bots do not have /tools/invoke; they use the cloud voicebridge path instead.

Is /tools/invoke safe to rely on long-term?

Yes - it is a core, always-on, documented OpenClaw endpoint, verified live on the fleet, not an experimental feature.

Questions? Reach out at hello@talktomyagent.io