Cryptographically signed audit trails for AI agent tool calls. Every tool invocation is HMAC-SHA256 signed. Full arguments and results logged. Live dashboard...
---
name: AgentPathfinder
description: "Cryptographically signed audit trails for AI agent tool calls. Every tool invocation is HMAC-SHA256 signed. Full arguments and results logged. Live dashboard shows which command failed, what the error was, whether agent claimed success falsely. The provenance layer for agent execution."
---
# AgentPathfinder
> **[Company Brain Core OS](https://certainlogic.ai/brain) — Free, local, deterministic knowledge base for your agent. Start here before adding audit trails.**
**Your AI agent said "Done." Prove it.** Cryptographically signed tool-level audit trails. See exactly which command failed, what the error was, and whether your agent lied about success.
---
## Part of the CertainLogic Stack
This skill works **standalone**. Install it, sign your tool calls, get proof of execution.
**Works even better with [Company Brain Core OS](https://certainlogic.ai/brain)** (`clawhub install company-brain-os`):
- Brain stores your audit patterns and flags suspicious signatures
- **CertainLogic Smart Router** — routes audit queries to the right tier
- **Skill Vetter Plus** — verifies audit integrity before external review
**All three are independent.** Use Pathfinder alone when you need verifiable execution. Add Brain when you want smart correlation of audit events.
---
## What This Is
The **provenance layer for AI agent execution.** Every tool call is cryptographically signed with HMAC-SHA256. Every failure is recorded. Every lie is detectable.
AgentPathfinder doesn't make your agent honest. It makes dishonesty **proveable.**
## What You Get
| Feature | How It Works | Truth Status |
|---------|-------------|--------------|
| ✅ Signed task tracking | Green = agent signed claim of completion | NOT verified — recorded only |
| ❌ Signed failure tracking | Red = agent signed claim of failure | NOT verified — recorded only |
| 🔒 Cryptographic signing | HMAC-SHA256 on every claimed event | Signature is real |
| 📋 Tamper-evident audit | Unauthorized edits break HMAC | Tampering detected if no key |
| 🔍 Tool-level audit | Every TOOL_INVOKED and TOOL_RESULT logged | Shows what actually ran |
| 📊 Live dashboard | Auto-refreshing HTML with tool call tree | Real-time HMAC verification |
| 🕒 Hanging call detection | Detects TOOL_INVOKED with no TOOL_RESULT | Catch stuck tools |
| 🧾 Fraud alerts | Alerts when agent claims success but tool errored | See false claims |
| 🔄 Crash recovery | Atomic writes + fsync + rename | Works |
| 💬 Chat notifications | SDK callbacks send updates to agent's chat | Real-time status |
**Pro (coming soon):** Multi-agent tracking, audit exports, webhooks.
**Enterprise:** On-prem, SSO/SAML, hosted vault.
## What This Is NOT
| ❌ Does NOT | Why |
|-------------|-----|
| Verify tasks are actually complete | It records claims. You must verify the work independently. |
| Prevent agents from lying | It signs what the agent says. False claims get signed too. |
| Replace human verification | It helps you track claims. You still need to check the actual work. |
## The Problem
Your AI agent says **"Done"** — but you have no idea what actually happened at the tool level.
- **Silent failures:** The agent claims success while `docker push` returned `exit_code=1`
- **Missing steps:** The agent skips `pytest` entirely and claims all tests passed
- **Wrong commands:** The agent runs `echo "done"` instead of the actual build
- **Crash recovery:** The agent restarts, has no memory of which tool calls were in-flight
- **Multi-agent confusion:** Two agents claim the same step — who actually did it?
You only find out when your customer does. 😤
## Installation
```bash
clawhub install certainlogicai.agentpathfinder
```
## Quick Start
```bash
# Create a signed task
python3 -m agentpathfinder create deploy "test,build,push,restart"
# → Task created: deploy-a7f3d2e1
# Start the audit chain for a step
# (automatically done when agent runs tools)
# View the live dashboard
cd ~/.openclaw/skills/certainlogicai.agentpathfinder
python3 scripts/dashboard_v130.py watch --task deploy-a7f3d2e1
# → Serves dashboard at http://localhost:8080
# → Auto-refreshes every 2 seconds
# → Shows tool call tree with args, results, HMAC signatures
# → Fraud alerts: hanging calls, false claims, missing results
# View audit trail
python3 -m agentpathfinder audit deploy-a7f3d2e1
# → Lists all signed tool events
# Export audit trail
python3 scripts/dashboard_v130.py export --task deploy-a7f3d2e1
# → Signed JSONL with every event + HMAC verification
```
## SDK Usage
```python
from agentpathfinder import TaskEngine
from agentpathfinder.tool_audit import ToolAuditChain, AuditedToolExecutor
# Initialize tool audit
task = TaskEngine(data_dir="~/.agentpathfinder")
step_list = [
{"name": "test"},
{"name": "build"},
{"name": "push"},
{"name": "restart"},
]
task_id = task.create_task("deploy", step_list)
# Start audit for a step
audit = task.get_tool_audit(task_id, step_number=1)
# Manually log a tool call
tool_id = audit.log_tool_call("exec", {"command": "pytest tests/ -v"})
# ... run the command ...
audit.log_tool_result(tool_id, result={"passed": True, "count": 17}, exit_code=0)
# → Logs TOOL_INVOKED with args + HMAC
# → Logs TOOL_RESULT with exit_code + output + HMAC
# Or use the executor wrapper (auto-logs real commands)
executor = AuditedToolExecutor(audit)
result = executor.exec("echo hello") # Auto-logged
# Detect fraud
audit.detect_hanging_calls() # Tools invoked but no result
audit.get_tool_summary() # What was called, what failed
```
## CLI Reference
```bash
python3 -m agentpathfinder create <name> <steps> # Create task (comma-separated steps)
python3 -m agentpathfinder status <task_id> # View task status
python3 -m agentpathfinder audit <task_id> # Verify HMAC signatures
python3 -m agentpathfinder reset-step <task_id> <n> # Reset step (signed action)
```
## Dashboard
```bash
# Live watch mode with built-in HTTP server
python3 scripts/dashboard_v130.py watch --task <task_id>
# → http://localhost:8080
# Generate static report
python3 scripts/dashboard_v130.py generate --task <task_id>
# Export as JSON
python3 scripts/dashboard_v130.py export --task <task_id>
```
**Dashboard shows:**
- Tool call tree (args + results visible)
- HMAC signature for every event
- Integrity verification panel (tampered/corrupted count)
- Fraud alerts (hanging calls, false claims)
- Tool summary by category
- Multi-task support
- Dark mode with CertainLogic brand colors
## Safety
**Read SAFETY.md before using in production.**
Key limitations:
- Agent with filesystem access can read vault shards — **not a sandbox**
- False completions get signed — system records claims, not truth
- No independent observer — agent controls both execution AND logging
## License
MIT License © CertainLogic
don't have the plugin yet? install it then click "run inline in claude" again.