Hermes Agent Tracing with Litefuse
Hermes Agent is Nous Research’s terminal-based AI assistant with a pluggable tool-calling and gateway architecture. This integration installs as a Hermes plugin under ~/.hermes/plugins/litefuse/. The plugin runs in-process inside Hermes’ Python interpreter and subscribes to lifecycle hook events (pre_llm_call, post_api_request, pre_tool_call, post_tool_call, post_llm_call, etc.), emitting one Litefuse trace per user turn with real wall-clock per-event timestamps.
The plugin works for every Hermes surface — CLI (hermes chat -q), Gateway (Feishu, Discord, Slack, Telegram, WhatsApp, IRC), and (with PR #24237) the TUI.
For AI — automated install
If you’re chatting with Hermes Agent right now, paste this prompt and the agent will handle the whole install end-to-end:
Read https://litefuse.ai/SKILL.md and follow the instructions to install and configure Litefuse for Hermes Agent.
The skill will ask for your Litefuse API keys (or walk you through signing up if you don’t have an account yet), then configure everything in place. For step-by-step manual setup instead, continue below.
What gets captured
| Data | Captured as | Notes |
|---|---|---|
| User prompt | trace input + user message event | flushed at turn start so Litefuse shows trace name / session / input immediately |
| Each LLM API call | api: <model> #N generation observation | per-API-call latency + token usage + provider + base_url + finish_reason |
| Token usage (input_tokens, output_tokens, cache read/write) | usage_details on each api generation | from post_api_request.usage; Anthropic-style keys so Litefuse cost mapping works |
| Tool invocation | tool: <name> (#N) tool observation | input = tool args, output = tool result, duration_ms |
| Tool errors | tool observation with level=ERROR | heuristic: error / "error": / "success": false at the start of the result |
| Final assistant response | assistant response generation observation | also propagated to trace-level output |
| Model name | model on each generation, trace tag model:<name> | used by Litefuse for cost computation |
| Session grouping | trace session_id | Hermes’ YYYYMMDD_HHMMSS_<6or8hex> session id |
| User identity | trace user_id | $LITEFUSE_USER_ID, falls back to $USER |
| Turn number | trace name (Hermes Agent — Turn N) + hermes_agent.turn_number | 1-based, monotonically increasing per session |
| Hermes-specific metadata | metadata.hermes_agent.* | provider / base_url / api_mode / api_duration / finish_reason / message_count / step_index / platform / is_first_turn / history_messages |
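The tool-error heuristic in the table above can be sketched roughly as follows. This is an illustration, not the plugin's actual code: the function name `looks_like_tool_error` and the 200-character inspection window are assumptions, and only the three markers named in the table are checked.

```python
def looks_like_tool_error(result: str, window: int = 200) -> bool:
    """Guess whether a tool result string reports a failure.

    Only the first `window` characters are inspected; the window size
    is a hypothetical value chosen for this sketch.
    """
    head = result.lstrip()[:window].lower()
    # Marker 1: the result text begins with the word "error".
    if head.startswith("error"):
        return True
    # Markers 2 and 3: JSON-style keys near the start of the result.
    # Spaces are removed so '"success": false' and '"success":false' both match.
    compact = head.replace(" ", "")
    return '"error":' in compact or '"success":false' in compact
```

A result flagged this way would be recorded with level=ERROR. Because it is a heuristic, a successful result that happens to begin with one of these markers would be flagged too.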
Trace structure
A typical multi-tool turn produces a trace shaped like:
Hermes Agent — Turn 7 (SPAN, root, real duration)
├── user message (EVENT, flushes trace metadata immediately)
├── api: MiniMax-M2.7 #1 (GENERATION, usage_details, real latency)
├── tool: web_search (#1) (TOOL, real duration)
├── api: MiniMax-M2.7 #2 (GENERATION, usage_details)
├── tool: web_extract (#2) (TOOL, real duration)
├── api: MiniMax-M2.7 #3 (GENERATION, usage_details)
└── assistant response (GENERATION, final answer)
Notes on the design:
- Real wall-clock timestamps. Every observation’s start/end is set by the hook callback firing moment, so the Litefuse timeline reflects actual LLM latency, tool execution time, and inter-call gaps — no synthetic grid.
- Single root parent. The container span is the unique trace root; every api / tool / response observation nests underneath. The timeline view stays clean even for 50+ step turns.
- Trace header is stable. Trace name / input / session_id flush to Litefuse at turn START (via the user message event with real wall-clock end + lf.flush()) and never get overwritten mid-stream as new observations land.
- Tool numbering. The (#N) suffix on tool: observation names disambiguates repeated calls to the same tool within a turn.
- Namespaced metadata. All Hermes-specific fields live under metadata.hermes_agent.* to keep the top-level metadata dict clean for Litefuse-standard keys.
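As a rough illustration of the naming and namespacing notes above, the sketch below keeps one counter for API calls and one for tool calls within a turn, matching the example trace where tools are numbered (#1), (#2) across different tool names. `StepNamer` and `namespaced` are hypothetical helpers invented for this example; the plugin's real internals may differ.

```python
from collections import defaultdict


class StepNamer:
    """Hands out 1-based step numbers for observation names within one turn."""

    def __init__(self):
        self._counts = defaultdict(int)

    def api(self, model: str) -> str:
        # Produces names such as "api: MiniMax-M2.7 #1".
        self._counts["api"] += 1
        return f"api: {model} #{self._counts['api']}"

    def tool(self, name: str) -> str:
        # Produces names such as "tool: web_search (#1)".
        self._counts["tool"] += 1
        return f"tool: {name} (#{self._counts['tool']})"


def namespaced(**fields) -> dict:
    """Wrap Hermes-specific fields under a single hermes_agent key."""
    return {"hermes_agent": dict(fields)}
```

Creating a fresh `StepNamer` per turn would restart the numbering at 1, which is what the example trace suggests.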
Quick Start
Prerequisites
- A Litefuse project at https://litefuse.cloud with public + secret keys.
- Hermes Agent installed; check with hermes --version. The latest version with PR #24237 merged is recommended so plugin hooks fire in the TUI; otherwise the CLI and Gateway surfaces still work fully.
Install Langfuse SDK v4 into Hermes’ own venv
The plugin runs in-process inside Hermes’ Python interpreter, so the SDK has to live in Hermes’ venv (not a separate one):
~/.hermes/hermes-agent/venv/bin/pip install 'langfuse>=4,<5'
# Verify
~/.hermes/hermes-agent/venv/bin/python3 -c "import langfuse; print(langfuse.__version__)"
# Expect: 4.x.y
Drop the plugin files into ~/.hermes/plugins/litefuse/
mkdir -p ~/.hermes/plugins/litefuse
curl -fsSL https://litefuse.ai/integrations/hermes-agent/plugin.yaml \
-o ~/.hermes/plugins/litefuse/plugin.yaml
curl -fsSL https://litefuse.ai/integrations/hermes-agent/__init__.py \
-o ~/.hermes/plugins/litefuse/__init__.py
The plugin source is at the same URLs; read it before deploying if you want.
Add Litefuse credentials to ~/.hermes/.env
The plugin reads credentials from Hermes’ own .env file (which Hermes auto-loads at startup):
cat >> ~/.hermes/.env <<'EOF'
# Litefuse observability
LANGFUSE_PUBLIC_KEY=pk-lf-xxx
LANGFUSE_SECRET_KEY=sk-lf-xxx
LANGFUSE_HOST=https://litefuse.cloud
EOF
Replace the placeholder values. The plugin also accepts LANGFUSE_BASE_URL as an alias for LANGFUSE_HOST for compatibility with other Litefuse integrations.
Enable the plugin
Hermes opt-in is required:
hermes plugins enable litefuse
hermes plugins list # confirm: status = enabled
Restart any running gateway
The gateway daemon caches its plugin manager at startup, so a restart is needed for the new plugin to load:
hermes gateway restart # only if you have `hermes gateway` running
CLI invocations (hermes chat -q "...") load the plugin fresh on each run, so no restart is needed.
Verify
hermes chat -q "say hello in five words exactly"
tail -3 ~/.hermes/state/litefuse_plugin.log
# Expected: "turn complete session=YYYYMMDD_HHMMSS_xxxxxx turn=1 api_calls=1 tool_calls=0"
Open https://litefuse.cloud and look at the latest trace named Hermes Agent — Turn 1. It should have:
- correct name / sessionId / input populated immediately
- a child [EVENT] user message carrying the user prompt
- [GENERATION] api: <model> #N with token usage filled in
- (if your prompt used a tool) [TOOL] tool: <name> (#1) with input args + output result
- [GENERATION] assistant response with the final answer
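If you want to check the log from a script rather than by eye, the "turn complete" line shown above can be parsed with a small regular expression. This is a minimal sketch: `parse_turn_complete` is a hypothetical helper, and it assumes the line carries exactly the key=value fields shown in the expected output. Any prefix the logger adds (timestamp, level) is ignored because the pattern is searched rather than anchored.

```python
import re

# Matches the summary line the plugin writes at the end of each turn.
TURN_COMPLETE = re.compile(
    r"turn complete session=(?P<session>\S+) turn=(?P<turn>\d+) "
    r"api_calls=(?P<api_calls>\d+) tool_calls=(?P<tool_calls>\d+)"
)


def parse_turn_complete(line: str):
    """Return the turn summary as a dict, or None if the line is not a summary."""
    match = TURN_COMPLETE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "session": fields["session"],
        "turn": int(fields["turn"]),
        "api_calls": int(fields["api_calls"]),
        "tool_calls": int(fields["tool_calls"]),
    }
```

Feeding it the last few lines of ~/.hermes/state/litefuse_plugin.log and keeping the non-None results gives a quick per-turn count of API and tool calls.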
Environment variables
| Variable | Required | Description |
|---|---|---|
| LANGFUSE_PUBLIC_KEY | Yes | Litefuse project public key (pk-lf-...). |
| LANGFUSE_SECRET_KEY | Yes | Litefuse project secret key (sk-lf-...). |
| LANGFUSE_HOST | No | Defaults to https://cloud.langfuse.com. Set to https://litefuse.cloud (or your self-hosted URL). Aliased as LANGFUSE_BASE_URL. |
| LITEFUSE_USER_ID | No | Overrides the trace user_id. Falls back to $USER then "hermes-user". |
| HERMES_LITEFUSE_DEBUG | No | Set to "true" for verbose plugin logging at ~/.hermes/state/litefuse_plugin.log. |
| HERMES_LITEFUSE_MAX_CHARS | No | Truncation threshold (in characters) for span inputs/outputs. Default 1000000 (~1MB of text). |
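The fallbacks in the table can be summarized in a few lines of Python. This is a sketch under stated assumptions, not the plugin's code: `load_settings` and `truncate` are hypothetical names, and the choice to let LANGFUSE_HOST take precedence over LANGFUSE_BASE_URL when both are set is a guess, since the table only says the two are aliases.

```python
import os

DEFAULT_HOST = "https://cloud.langfuse.com"  # default named in the table above


def load_settings(env=None) -> dict:
    """Resolve plugin settings from a mapping of environment variables."""
    env = os.environ if env is None else env
    return {
        # Assumption: LANGFUSE_HOST wins when both it and the alias are set.
        "host": env.get("LANGFUSE_HOST") or env.get("LANGFUSE_BASE_URL") or DEFAULT_HOST,
        # Fallback chain from the table: LITEFUSE_USER_ID, then USER, then a fixed name.
        "user_id": env.get("LITEFUSE_USER_ID") or env.get("USER") or "hermes-user",
        "debug": env.get("HERMES_LITEFUSE_DEBUG", "").strip().lower() == "true",
        "max_chars": int(env.get("HERMES_LITEFUSE_MAX_CHARS") or 1_000_000),
    }


def truncate(text: str, max_chars: int) -> str:
    """Cut text to at most max_chars characters."""
    return text if len(text) <= max_chars else text[:max_chars]
```

Passing a plain dict instead of `os.environ` makes the resolution order easy to try out without touching your shell environment.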
How it works
The plugin subscribes to nine Hermes plugin-hook events via ctx.register_hook(...):
| Hook | Role |
|---|---|
| on_session_start | Caches session model / platform metadata. |
| pre_llm_call | Opens the long-lived container span for the turn. Emits a user message event that flushes trace metadata (name / session / user / input / tags) to Litefuse immediately so the trace header populates without waiting for the turn to end. |
| post_api_request | Emits one api: <model> #N generation per LLM API call, carrying usage_details (input_tokens / output_tokens / cache tokens) for Litefuse cost mapping. |
| pre_tool_call | Opens a tool: <name> (#N) span under the container. |
| post_tool_call | Closes the matching tool span with output + duration_ms. Uses a per-session FIFO queue because Hermes v0.12’s pre_tool_call call sites in run_agent.py don’t pass session_id / tool_call_id, only task_id. |
| post_llm_call | Emits the assistant response generation and closes the container with the final output. |
| on_session_end | Defensive cleanup for interrupted turns (Hermes fires this at every turn end, not just session end). |
| on_session_reset / on_session_finalize | Drop in-memory state when the session is truly torn down. |
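The FIFO pairing of pre_tool_call and post_tool_call described in the table can be pictured with a small queue structure. This is a simplified sketch, not the plugin's implementation: `PendingTools` and its method names are hypothetical, and the `key` stands for whatever identifier both hooks can agree on.

```python
import time
from collections import defaultdict, deque


class PendingTools:
    """Pairs tool start and end events in arrival order, one queue per key."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def opened(self, key: str, tool_name: str, args: dict) -> None:
        # Called from the pre_tool_call side: remember the call and its start time.
        self._queues[key].append(
            {"tool_name": tool_name, "args": args, "started": time.monotonic()}
        )

    def closed(self, key: str, result: str):
        # Called from the post_tool_call side: the oldest open call is assumed
        # to be the one that just finished.
        queue = self._queues.get(key)
        if not queue:
            return None  # nothing open for this key; the caller can skip quietly
        entry = queue.popleft()
        entry["result"] = result
        entry["duration_ms"] = (time.monotonic() - entry["started"]) * 1000.0
        return entry
```

The trade-off of first-in, first-out matching is that it relies on tool calls for one key completing in the order they started. With explicit call ids available on both hooks, a dictionary lookup would be the more direct choice.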
The plugin is fail-open: any unexpected error is logged to ~/.hermes/state/litefuse_plugin.log and the callback returns silently so it never blocks Hermes’ main loop. A single Langfuse client is held in module-level state across the lifetime of the Hermes process (gateway daemon, CLI invocation, TUI subprocess), guarded by a thread lock so multiple concurrent gateway sessions share it safely.
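The two ideas in the paragraph above, fail-open callbacks and a lock-guarded shared client, follow a common Python pattern. The sketch below shows that pattern in general terms. The names `fail_open` and `get_client` and the logger name are made up for this example, and `factory` stands in for whatever actually constructs the Langfuse client.

```python
import functools
import logging
import threading

log = logging.getLogger("litefuse_plugin_sketch")  # hypothetical logger name


def fail_open(hook):
    """Wrap a hook so any exception is logged and swallowed."""

    @functools.wraps(hook)
    def wrapper(*args, **kwargs):
        try:
            return hook(*args, **kwargs)
        except Exception:
            # Log the traceback and return quietly so the host loop continues.
            log.exception("hook %s failed", hook.__name__)
            return None

    return wrapper


_client = None
_client_lock = threading.Lock()


def get_client(factory):
    """Return the process-wide client, creating it once under a lock."""
    global _client
    with _client_lock:
        if _client is None:
            _client = factory()
        return _client
```

Holding the lock for the whole check-and-create step means two sessions starting at the same moment cannot both construct a client.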
Trace metadata reference
Trace-level metadata (under metadata.hermes_agent.*):
- turn_number: 1-based, monotonically increasing per session lifetime
- platform: cli, gateway, tui, feishu, etc. (set by Hermes)
- model: the model name at turn start
- is_first_turn, history_messages: turn position in session
Generation observation metadata (api: ..., under metadata.hermes_agent.*):
- provider, base_url, api_mode, response_model: provider config
- api_call_count, api_duration, finish_reason, message_count: per-call stats
- assistant_content_chars, assistant_tool_call_count: output shape
- step_index: the (#N) step counter
Tool observation metadata (under metadata.hermes_agent.*):
- tool_name, task_id, tool_call_id
- duration_ms, is_error
- step_index
Troubleshooting
No traces appear in Litefuse. Tail ~/.hermes/state/litefuse_plugin.log:
tail -20 ~/.hermes/state/litefuse_plugin.log
Expected: at startup "litefuse plugin registered (9 hooks)" then per-turn "Litefuse client ready" and "turn complete session=... turn=N api_calls=N tool_calls=N". If the file is empty, the plugin isn’t loading; confirm it is enabled and restart the gateway:
hermes plugins list # confirm: litefuse → enabled
hermes plugins enable litefuse # if not enabled
hermes gateway restart # if gateway is running
Plugin log says Litefuse credentials not set. The plugin couldn’t find LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY in os.environ. Add them to ~/.hermes/.env:
cat >> ~/.hermes/.env <<'EOF'
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_HOST=https://litefuse.cloud
EOF
Then re-run any hermes chat -q "..." or restart the gateway.
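To confirm the keys actually made it into the file, a quick check along these lines may help. It is a minimal sketch with a deliberately simple parser (KEY=value lines, comments and blanks skipped). `parse_env_text` and `missing_keys` are hypothetical helpers, and Hermes' own .env loader may treat quoting or export prefixes differently.

```python
REQUIRED = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(text: str) -> list:
    """Return the required credential names that are absent or empty."""
    values = parse_env_text(text)
    return [name for name in REQUIRED if not values.get(name)]


# Example use (path taken from this guide):
#   from pathlib import Path
#   print(missing_keys((Path.home() / ".hermes" / ".env").read_text()))
```

An empty list means both required keys are present. It does not verify that the values are valid for your Litefuse project.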
Plugin log says langfuse SDK not importable. The Langfuse SDK isn’t installed in Hermes’ venv:
~/.hermes/hermes-agent/venv/bin/pip install --upgrade 'langfuse>=4,<5'
Trace name / session / input missing during a long turn. This was a real bug, fixed by emitting a tiny user message observation (typed as event, structured as a span so Langfuse propagates the AS_ROOT + TRACE_* attributes) and force-flushing it at turn start. If you see this, you’re probably running an older copy of __init__.py; re-fetch it from https://litefuse.ai/integrations/hermes-agent/__init__.py.
TUI sessions don’t produce traces. Hermes v0.12 doesn’t load plugins in the TUI process (tui_gateway/entry.py). Either:
- Upgrade to a Hermes version with PR #24237 merged, OR
- Apply the 3-line patch locally to ~/.hermes/hermes-agent/tui_gateway/entry.py (it gets clobbered on hermes update, so apply it again after each update).
CLI (hermes chat -q) and Gateway (Feishu, Slack, Discord, Telegram, etc.) work without the patch.
hermes -z "..." (oneshot mode) doesn’t produce traces either. Same root cause — oneshot’s entry point doesn’t call discover_plugins. Use hermes chat -q "..." instead (full CLI path that does load plugins), or apply the same 3-line treatment to hermes_cli/oneshot.py. We’ll file a companion PR for this.
Limitations
- No assistant reasoning capture. Hermes plugin-hook callbacks don’t expose the model’s reasoning / chain-of-thought text directly; post_api_request only gives us assistant_content_chars (the count). The thinking text exists in Hermes’ conversation history, but harvesting it would require subscribing to per-message events that aren’t currently available.
- No image-block summary. If a turn involves image inputs, the plugin records them as JSON-flattened text in the trace input; it doesn’t surface media types or counts as separate metadata.
Resources
- Hermes Agent hooks user guide
- Hermes Agent plugin reference
- Langfuse Python SDK v4
- Litefuse Cloud
- Plugin source: plugin.yaml · __init__.py
- Upstream Hermes Agent PR for TUI plugin discovery: NousResearch/hermes-agent#24237