Connect a self-hosted Hermes agent

Run Talk To My Agent (TTMA) voice with a Hermes agent (NousResearch) as the brain instead of the default backend. When you finish, your number answers with a real-time voice agent powered by Hermes: it converses, remembers the call, uses your Hermes persona, and records and reports calls. Budget 30 to 45 minutes.

How it fits together

Caller -> Telephony -> TTMA voice gateway -> session shim -> Hermes API server
                       (speaks /v1/responses)  (tiny proxy)   (port 8642)

The TTMA gateway talks to its brain over the OpenAI Responses API (POST /v1/responses), and Hermes speaks that API too. There is exactly one mismatch to bridge: conversation memory.

The gateway tracks a call by sending a stable x-openclaw-session-key header each turn. Hermes is stateless per request, ignores unknown headers, and threads a conversation via a conversation field in the body. So you run a tiny session shim that copies the header into Hermes's conversation field. That gives full in-call memory and per-caller isolation. Everything else is configuration.
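
The mapping itself is small. The sketch below shows it as a pure function, separate from any proxying. The function name withConversation is illustrative only; the header name and the conversation field come from the description above.

```javascript
// Sketch: copy the gateway's session header into the request body's
// 'conversation' field. withConversation is an illustrative name only.
function withConversation(headers, body) {
  const key = headers['x-openclaw-session-key'];
  // Leave the body alone when there is no key or a conversation is already set.
  if (!key || body.conversation != null) return body;
  return Object.assign({}, body, { conversation: String(key) });
}

const before = { model: 'openclaw', input: 'What is my name?' };
const after = withConversation({ 'x-openclaw-session-key': 'call-42' }, before);
console.log(JSON.stringify(after));
// prints {"model":"openclaw","input":"What is my name?","conversation":"call-42"}
```

Because every turn of one call carries the same key, every turn lands in the same Hermes conversation, and two callers with different keys never share one.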

Prerequisites

- A TTMA Voice subscription with a provisioned number and an install token (from your dashboard).

- A Linux host already running your Hermes agent, with Node.js installed.

Step 1: Enable the Hermes API server

Set these in Hermes's environment (or the corresponding settings in its config.yaml; the lines below use environment-variable syntax) and restart Hermes:

API_SERVER_ENABLED=true
API_SERVER_KEY=<choose-a-long-secret>
# defaults are fine: API_SERVER_HOST=127.0.0.1, API_SERVER_PORT=8642

Verify it is up:

curl -s http://127.0.0.1:8642/v1/health
# expect: {"status":"ok"}

Keep API_SERVER_KEY handy; you give it to the gateway in Step 4.
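
If you would rather script the health check, for example in a startup script that waits for Hermes, a minimal Node sketch could look like this. It assumes the default host and port from above and the {"status":"ok"} body shown in the curl check; the function names are hypothetical.

```javascript
const http = require('http');

// True only when the body is JSON with status "ok", the shape the curl check expects.
function isHealthyBody(text) {
  try {
    const parsed = JSON.parse(text);
    return parsed !== null && typeof parsed === 'object' && parsed.status === 'ok';
  } catch (e) {
    return false;
  }
}

// Resolves to true or false and never rejects, so callers need no error handling.
function probeHealth(host, port) {
  return new Promise(resolve => {
    const req = http.get({ host, port, path: '/v1/health', timeout: 3000 }, res => {
      let text = '';
      res.setEncoding('utf8');
      res.on('data', chunk => { text += chunk; });
      res.on('end', () => resolve(res.statusCode === 200 && isHealthyBody(text)));
    });
    req.on('timeout', () => req.destroy()); // destroy() surfaces as an 'error' event
    req.on('error', () => resolve(false));
  });
}

// Usage:
// probeHealth('127.0.0.1', 8642).then(ok => console.log(ok ? 'Hermes is up' : 'Hermes is not reachable'));
```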

Step 2: Add the session shim (required for in-call memory)

Without the shim, the bot forgets each turn as soon as it ends. Save the code below as ttma-hermes-shim.js. It has no dependencies and forwards every request to Hermes, copying the gateway's session key into Hermes's conversation field so turns chain and callers stay isolated:

const http = require('http');
const HERMES_HOST = '127.0.0.1', HERMES_PORT = 8642, LISTEN_PORT = 8788;

http.createServer((req, res) => {
  const chunks = [];
  req.on('data', c => chunks.push(c));
  req.on('end', () => {
    let body = Buffer.concat(chunks);
    const key = req.headers['x-openclaw-session-key'];
    // Map the gateway's session key -> Hermes 'conversation' so turns chain.
    if (req.url === '/v1/responses' && key && body.length) {
      try {
        const o = JSON.parse(body.toString('utf8'));
        if (o && typeof o === 'object' && o.conversation == null) o.conversation = String(key);
        body = Buffer.from(JSON.stringify(o), 'utf8');
      } catch (e) { /* not JSON: forward unchanged */ }
    }
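    const headers = Object.assign({}, req.headers, { host: HERMES_HOST + ':' + HERMES_PORT });
    if (key) headers['x-hermes-session-key'] = String(key);
    // The body is re-sent whole with an explicit length, so drop any chunked framing header.
    delete headers['transfer-encoding'];
    headers['content-length'] = Buffer.byteLength(body);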
    const up = http.request(
      { host: HERMES_HOST, port: HERMES_PORT, method: req.method, path: req.url, headers },
      r => { res.writeHead(r.statusCode, r.headers); r.pipe(res); });
    up.on('error', e => {
      if (res.headersSent) return res.destroy(); // upstream failed mid-response
      res.writeHead(502); res.end('shim error: ' + e.message);
    });
    up.end(body);
  });
}).listen(LISTEN_PORT, '127.0.0.1', () => console.log('shim on ' + LISTEN_PORT));

Run it so it stays up (pick one):

# quick test
node ttma-hermes-shim.js

# or keep it running (example with pm2)
pm2 start ttma-hermes-shim.js --name ttma-hermes-shim && pm2 save

The shim listens on 127.0.0.1:8788 and forwards to Hermes on 8642. It passes the Authorization header straight through, so the token you give the gateway in Step 4 must be the Hermes API_SERVER_KEY.
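
To see what the shim actually forwards without involving a model, you can temporarily stand a stub in for Hermes. The sketch below is a hypothetical test helper, not part of TTMA or Hermes: it echoes back the conversation value it received and whether an Authorization header arrived.

```javascript
const http = require('http');

// Summarize what a forwarded request carried, so a test can see the shim's effect.
function describeRequest(headers, bodyText) {
  let conversation = null;
  try {
    const parsed = JSON.parse(bodyText);
    if (parsed && typeof parsed === 'object' && parsed.conversation != null) {
      conversation = String(parsed.conversation);
    }
  } catch (e) { /* non-JSON body: report no conversation */ }
  return { conversation, authorized: Boolean(headers.authorization) };
}

// Hypothetical stand-in for Hermes: replies with the summary instead of a model answer.
function startStubHermes(port) {
  return http.createServer((req, res) => {
    const chunks = [];
    req.on('data', c => chunks.push(c));
    req.on('end', () => {
      const summary = describeRequest(req.headers, Buffer.concat(chunks).toString('utf8'));
      res.writeHead(200, { 'content-type': 'application/json' });
      res.end(JSON.stringify(summary));
    });
  }).listen(port, '127.0.0.1');
}

// Usage (with Hermes stopped, since the stub takes over its port):
// startStubHermes(8642);
```

With the stub and the shim both running, a POST to 127.0.0.1:8788/v1/responses that carries an x-openclaw-session-key header should come back with that key as the conversation value.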

Step 3: Install the TTMA voice gateway

Run the standard installer with your install token:

curl -sSL https://api.talktomyagent.io/install.sh | bash -s -- --token <YOUR_TOKEN> --accept-license

A warning like “could not read OpenClaw token” is expected on a Hermes box; you set the backend next. Note the install directory it prints (typically ~/ninja-talk/); you will edit two files there.

Step 4: Point the gateway at Hermes

Edit ~/ninja-talk/.env. The installer already set OPENCLAW_WORKSPACE (to ~/.openclaw/workspace) and OPENCLAW_HOME; the only two lines you change are:

OPENCLAW_URL='http://127.0.0.1:8788'            # point at the shim (NOT 8642)
OPENCLAW_TOKEN='<your Hermes API_SERVER_KEY>'   # from Step 1

Point OPENCLAW_URL at the shim (8788), not Hermes directly, or in-call memory will not work.
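
A quick way to catch the most common mistake here (pointing at 8642) is to lint the two lines before restarting. This is a rough sketch under the assumption that the shim keeps its default port 8788 and that .env uses plain KEY='value' lines; parseEnv and checkGatewayEnv are hypothetical helpers, not part of the installer.

```javascript
// Parse simple KEY=value lines (quoted or bare); trailing comments are ignored.
function parseEnv(text) {
  const vars = {};
  for (const line of text.split(/\r?\n/)) {
    const m = line.match(/^\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(?:'([^']*)'|"([^"]*)"|([^#\s]*))/);
    if (m) vars[m[1]] = m[2] !== undefined ? m[2] : m[3] !== undefined ? m[3] : m[4];
  }
  return vars;
}

// Return a list of problems; an empty list means both lines look right.
function checkGatewayEnv(text) {
  const vars = parseEnv(text);
  const problems = [];
  let port = null;
  try { port = new URL(vars.OPENCLAW_URL).port; } catch (e) { /* missing or malformed URL */ }
  if (port !== '8788') problems.push('OPENCLAW_URL should point at the shim on port 8788');
  if (!vars.OPENCLAW_TOKEN || vars.OPENCLAW_TOKEN.startsWith('<')) {
    problems.push('OPENCLAW_TOKEN should be set to the Hermes API_SERVER_KEY');
  }
  return problems;
}

// Usage:
// const fs = require('fs');
// console.log(checkGatewayEnv(fs.readFileSync(process.argv[2], 'utf8')));
```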

Step 5: Give the voice agent its persona

The gateway builds the spoken persona from SOUL.md plus the voice playbook in OPENCLAW_WORKSPACE. The installer already placed the playbook and context manifest there. Add your persona so the voice and the brain match (these files use the same format in Hermes and TTMA):

cp ~/.hermes/SOUL.md ~/.openclaw/workspace/SOUL.md
# optional, if you have it:
cp ~/.hermes/USER.md ~/.openclaw/workspace/USER.md

Hermes reads its own copy from ~/.hermes for the brain; the gateway reads this copy for the live voice.
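
Since a missing SOUL.md shows up later only as a generic persona, it can help to check the workspace right after copying. A minimal sketch, assuming the default workspace path from Step 4; missingPersonaFiles is a hypothetical helper name.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Return the names from `files` that are absent or empty in `workspaceDir`.
function missingPersonaFiles(workspaceDir, files) {
  return files.filter(name => {
    try {
      return fs.statSync(path.join(workspaceDir, name)).size === 0;
    } catch (e) {
      return true; // not found or unreadable
    }
  });
}

// Default workspace path from Step 4; adjust if your OPENCLAW_WORKSPACE differs.
const workspace = path.join(os.homedir(), '.openclaw', 'workspace');
console.log('missing:', missingPersonaFiles(workspace, ['SOUL.md', 'USER.md']));
```

USER.md is optional, so seeing only USER.md in the list is fine if you chose not to copy it.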

Step 6: Tune features for a non-OpenClaw backend

Edit ~/ninja-talk/config.json (change these keys; leave the rest of the file as the installer wrote it):

{
  "context": { "files": ["SOUL.md", "USER.md"] },
  "features": { "openclawQuery": true, "kbSearch": false, "crmCapture": false }
}

openclawQuery: true

Keep it on. Despite the name, this is the “ask the brain” tool that calls /v1/responses; it is what makes Hermes answer.

kbSearch: false

The built-in knowledge-base search reads OpenClaw's local memory database, which a Hermes box does not have. Turn it off so the agent relies on Hermes for knowledge instead.

crmCapture: false

The bundled CRM capture uses OpenClaw-only workspace tools that are not present on a Hermes box.

Leave dispatch off

Do not set the dispatch variables (OPENCLAW_HOOKS_TOKEN / OPENCLAW_DISPATCH_CHANNEL / OPENCLAW_DISPATCH_TO). That “long task, deliver to chat later” tool targets an OpenClaw-only endpoint and stays disabled by default.
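
If you prefer to script the Step 6 change rather than hand-edit, the sketch below applies only the keys listed above and keeps everything else the installer wrote. applyHermesOverrides is a hypothetical helper; the key names are the ones shown in this step.

```javascript
// Apply only the Step 6 keys; every other key in the installer's config is kept.
function applyHermesOverrides(config) {
  return Object.assign({}, config, {
    context: Object.assign({}, config.context, { files: ['SOUL.md', 'USER.md'] }),
    features: Object.assign({}, config.features, {
      openclawQuery: true,
      kbSearch: false,
      crmCapture: false,
    }),
  });
}

// Usage (back up config.json first):
// const fs = require('fs');
// const file = process.argv[2];
// const updated = applyHermesOverrides(JSON.parse(fs.readFileSync(file, 'utf8')));
// fs.writeFileSync(file, JSON.stringify(updated, null, 2) + '\n');
```

The nested Object.assign calls matter: replacing the whole features object would silently drop any feature flags the installer set that this guide does not mention.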

Step 7: Restart and verify

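Restart the gateway. The command below assumes the installer created a systemd user service named ninja-talk; if your install runs the gateway another way, restart it there instead: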

systemctl --user restart ninja-talk

First, confirm the path works without spending a call. This should return JSON whose output array contains a message item with content of type "output_text":

curl -s http://127.0.0.1:8788/v1/responses \
  -H "Authorization: Bearer <your Hermes API_SERVER_KEY>" \
  -H "Content-Type: application/json" \
  -H "x-openclaw-session-key: test-123" \
  -d '{"model":"openclaw","input":"Say hello in five words."}'

Then place a real call and run the memory check: tell the bot a fact (“my name is Dana”), then a turn later ask “what is my name?” If it answers correctly, the shim is working. Watch logs while you test:

journalctl --user -u ninja-talk -f
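
The memory check can also be run without a phone call by sending two turns through the shim with the same session key. The sketch below assumes Hermes returns the standard Responses shape (output items whose content holds output_text parts) and that the key is available in a HERMES_KEY environment variable; that variable name and the function names are hypothetical.

```javascript
const http = require('http');

// Collect every output_text part from a Responses-style payload.
function extractOutputText(response) {
  const parts = [];
  for (const item of (response && response.output) || []) {
    for (const part of (item && item.content) || []) {
      if (part.type === 'output_text' && typeof part.text === 'string') parts.push(part.text);
    }
  }
  return parts.join('');
}

// POST one turn through the shim and resolve with the reply text.
function ask(input, sessionKey, apiKey) {
  const body = JSON.stringify({ model: 'openclaw', input });
  return new Promise((resolve, reject) => {
    const req = http.request({
      host: '127.0.0.1', port: 8788, path: '/v1/responses', method: 'POST',
      headers: {
        'authorization': 'Bearer ' + apiKey,
        'content-type': 'application/json',
        'content-length': Buffer.byteLength(body),
        'x-openclaw-session-key': sessionKey,
      },
    }, res => {
      let text = '';
      res.setEncoding('utf8');
      res.on('data', c => { text += c; });
      res.on('end', () => {
        if (res.statusCode !== 200) return reject(new Error('HTTP ' + res.statusCode + ': ' + text));
        try { resolve(extractOutputText(JSON.parse(text))); } catch (e) { reject(e); }
      });
    });
    req.on('error', reject);
    req.end(body);
  });
}

// Usage: same key for both turns, so the second answer should mention Dana.
// ask('My name is Dana.', 'memcheck-1', process.env.HERMES_KEY)
//   .then(() => ask('What is my name?', 'memcheck-1', process.env.HERMES_KEY))
//   .then(console.log, console.error);
```

Repeating the second question with a different session key should not recall the name, which confirms per-caller isolation.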

What works, and what needs extra steps

Capability                                                 With Hermes
Real-time conversation with in-call memory                 Works (via the shim)
Persona / identity (SOUL.md etc.)                          Works (same format)
Recording, transcripts (pull + webhook), outbound calls    Works (backend-independent)
Knowledge from context files / playbook                    Works
Long-task dispatch to a chat channel                       OpenClaw-only; stays off
Bundled CRM-capture tools                                  Rebuild as Hermes skills

For the two unsupported items (long-task dispatch and the bundled CRM-capture tools), the work is on the Hermes side: implement the same behaviors as Hermes skills, which Hermes invokes when it answers /v1/responses.

Troubleshooting

“Bot forgets things mid-call”

The shim is not running, or OPENCLAW_URL does not point at the shim (:8788). Confirm the shim is up, then send the Step 7 curl twice with the same x-openclaw-session-key (state a fact first, then ask about it) and check that the second reply remembers the first.

“I could not reach the agent backend”

Hermes or the shim is down, or OPENCLAW_URL / OPENCLAW_TOKEN is wrong. Check /v1/health on 8642 and the Step 7 curl on 8788.

“Generic or wrong persona”

SOUL.md is missing from OPENCLAW_WORKSPACE, or OPENCLAW_WORKSPACE is not set in .env. Re-check Steps 4 and 5.

“401 from the backend”

OPENCLAW_TOKEN does not match Hermes's API_SERVER_KEY.
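
The checks above can be condensed into a small lookup for use in a test script. This is an illustrative sketch only: diagnose is a hypothetical helper, the 502 case relies on the shim's own error handler from Step 2, and ECONNREFUSED is the Node.js error code for a port with nothing listening.

```javascript
// Map an observed failure to the likely cause named in the troubleshooting list.
function diagnose(observation) {
  const { port, status, errorCode } = observation;
  if (errorCode === 'ECONNREFUSED') {
    if (port === 8788) return 'The shim is not running (Step 2).';
    if (port === 8642) return 'Hermes or its API server is not running (Step 1).';
    return 'Nothing is listening on port ' + port + '.';
  }
  if (status === 401) return "OPENCLAW_TOKEN does not match Hermes's API_SERVER_KEY (Step 4).";
  if (status === 502) return 'The shim is up but could not reach Hermes on 8642 (Step 1).';
  if (status === 200) return 'The backend path works; check OPENCLAW_URL and the session key if memory fails.';
  return 'Unrecognized result; check the gateway logs.';
}

console.log(diagnose({ port: 8788, status: 401 }));
```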