Zouroboros Bench

Benchmark harness for AI memory systems. Evaluates LongMemEval, LoCoMo, and ConvoMem datasets against any memory backend via the zouroboros-memory CLI. Inclu...

installs

stars

karma

SkillRank score ↗

4.3/ 10

evaluated by implexa, claude-haiku-4-5 · 2026-06-16

zouroboros-bench runs memory system evaluations against longmemeval, locomo, and convomem datasets via cli commands, with optional mimir judge for drift detection. installation and basic invocation documented, but operational guidance is sparse.

structure

5.0

trigger phrases

3.0

procedure

4.0

edge cases

2.0

documentation

5.0

strengths

SKILL.md

---
name: zouroboros-bench
description: "Benchmark harness for AI memory systems. Evaluates LongMemEval, LoCoMo, and ConvoMem datasets against any memory backend via the zouroboros-memory CLI. Includes Mimir judge for catching architectural drift."
version: "1.0.0"
compatibility: "OpenClaw, Claude Code, Codex CLI, any Node.js 22+ environment"
metadata:
  author: marlandoj.zo.computer
  openclaw:
    emoji: "📊"
    requires:
      bins: [node]
      env: [OPENAI_API_KEY]
    primaryEnv: OPENAI_API_KEY
    install:
      - id: node-zouroboros-bench
        kind: node
        package: "zouroboros-bench"
        bins: [zouroboros-bench, zouroboros-bench-report]
        label: "Install Zouroboros Bench (npm)"
    homepage: https://github.com/AlaricHQ/zouroboros-openclaw
---

## Usage

Install: `npm install zouroboros-bench zouroboros-memory`

### Run all benchmarks
```bash
npx zouroboros-bench --limit 50
```

### Run specific benchmark
```bash
npx zouroboros-bench --benchmarks longmemeval --limit 100 --judge
```

### Generate report
```bash
npx zouroboros-bench-report --runs ./data/runs/
```

## Environment Variables
- `ZOUROBOROS_MEMORY_CLI` — Path to memory CLI binary (default: `zouroboros-memory`)
- `ZOUROBOROS_MEMORY_DB` — SQLite DB path for benchmarks
- `OPENAI_API_KEY` — Required for GPT-4o judge
- `OLLAMA_URL` — Ollama URL for local LLM (default: `http://localhost:11434`)

don't have the plugin yet? install it then click "run inline in claude" again.

Zouroboros Bench

SKILL.md

related skills