Use SwarmVault when the user needs a local-first knowledge vault that writes durable markdown, graph, search, dashboard, review, chat-session, context-pack,...
---
name: swarmvault
description: "Use SwarmVault when the user needs a local-first knowledge vault that writes durable markdown, graph, search, dashboard, review, chat-session, context-pack, task-ledger, static AI export, retrieval, and MCP artifacts to disk from books, notes, transcripts, exports, datasets, slide decks, files, URLs, code, and recurring source workflows."
version: "3.15.0"
metadata: '{"openclaw":{"requires":{"anyBins":["swarmvault","vault"]},"install":[{"id":"node","kind":"node","package":"@swarmvaultai/cli","bins":["swarmvault","vault"],"label":"Install SwarmVault CLI (npm)"}],"emoji":"๐๏ธ","homepage":"https://www.swarmvault.ai/docs"}}'
---
# SwarmVault
Use this skill when the user wants a local-first knowledge vault built on the [LLM Wiki](https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f) pattern โ three layers (raw sources, wiki, schema) where the LLM maintains a durable wiki between you and raw sources. Also use it when the project already contains `swarmvault.config.json` or `swarmvault.schema.md`.
For onboarding, examples, command references, or troubleshooting, read the bundled `README.md`, `examples/`, `references/`, and `TROUBLESHOOTING.md` before improvising workflow advice.
## Quick checks
- Work from the vault root.
- Use `swarmvault next` when you need a read-only orientation command before deciding whether to initialize, ingest, compile, query, review, or refresh.
- If the vault does not exist yet, run `swarmvault init`.
- Use `swarmvault demo --no-serve` when the user wants the fastest zero-config walkthrough before pointing SwarmVault at their own sources.
- Use `swarmvault quickstart <file-or-directory-or-github-url>` as the beginner-friendly first-run path when the user wants init + ingest + compile + graph viewer in one command.
- Use `swarmvault scan <file-or-directory-or-github-url> --no-serve`, `swarmvault scan <file-or-directory-or-github-url> --no-viz`, or `swarmvault clone <file-or-directory-or-github-url> --no-viz` when the user wants the fastest scratch pass over a local file, local repo, public GitHub repo, or docs tree without manually stepping through init + ingest + compile first; for GitHub URLs add `--branch`, `--ref`, or `--checkout-dir` when the user needs a pinned checkout. Use `scan --mcp` or `clone --mcp` when the next step should be an MCP stdio server. Use `swarmvault graph share --post` for copyable text, `swarmvault graph share --svg [path]` for a visual card, or `swarmvault graph share --bundle [dir]` for a portable folder with markdown, post text, SVG, HTML preview, and JSON metadata.
- Use `swarmvault context build "<goal>" --target <path-or-node> --budget <tokens>` when the next agent, review, or handoff needs a bounded evidence pack instead of a broad vault search.
- Use `swarmvault chat "question"` when a multi-turn conversation should survive handoff; resume with `swarmvault chat --resume <id> "follow-up"` and inspect saved transcripts under `wiki/outputs/chat-sessions/`.
- Use `swarmvault export ai --out <dir>` when another agent, crawler, or static workflow needs `llms.txt`, full text, JSON-LD graph data, a manifest, and per-page siblings without starting `graph serve`.
- Use `swarmvault task start "<goal>" --target <path-or-node>` when agent work should leave a durable task ledger with decisions, linked context packs, changed paths, outcomes, and follow-ups. The older `memory` command remains a compatibility alias.
- Use `swarmvault doctor` before broad troubleshooting or agent handoff; add `--repair` when the retrieval index can be safely rebuilt. In `swarmvault graph serve`, the workbench shows prioritized next actions, every doctor check with details, copyable suggested commands, and safe direct repair where available.
- Read `swarmvault.schema.md` before compile or query work. It is the vault's operating contract.
- If `wiki/graph/report.md` exists, use it before broad repo search.
- If `SWARMVAULT_OUT` is set, resolve generated artifacts from that output root: `raw/`, `wiki/`, `state/`, `agent/`, and `inbox/` live there while `swarmvault.config.json` and `swarmvault.schema.md` stay in the project root.
## Core loop
1. Run `swarmvault next` when the current vault state is unclear; it is read-only and returns paths, checks, and recommended commands.
2. Initialize a vault with `swarmvault init` when needed.
3. Update `swarmvault.schema.md` before a serious compile. Use it for naming rules, categories, grounding, freshness expectations, and exclusions.
4. Use `swarmvault source add <input>` when the input is a recurring local file, local directory, public GitHub repo root, or docs hub that should stay registered. For public GitHub repos, use `--branch`, `--ref`, or `--checkout-dir` when a branch, tag, commit, or reusable checkout matters.
5. Ingest one-off inputs with `swarmvault ingest <path-or-url>`, or ingest a whole repo tree with `swarmvault ingest <directory>`. Audio and video files use `tasks.audioProvider` when configured; local video needs `ffmpeg`, public video URLs use `swarmvault ingest --video <url>` / `swarmvault add --video <url>` with `yt-dlp`, and supported YouTube URLs go through direct transcript capture instead of generic URL ingest.
6. Use `swarmvault ingest --guide`, `swarmvault source add --guide`, `swarmvault source reload --guide`, `swarmvault source guide <id>`, or `swarmvault source session <id>` when the human should integrate one source at a time before canonical pages change. Set `profile.guidedIngestDefault: true` in `swarmvault.config.json` to make guided mode the default; use `--no-guide` to override. Profiles using `guidedSessionMode: "canonical_review"` stage approval-queued canonical edits; `insights_only` profiles keep exploratory synthesis in `wiki/insights/`. Use `--review` only for the lighter review-only path.
7. Use `swarmvault inbox import` for capture-style batches, then `swarmvault watch --lint --repo` when the workflow should stay automated. Add `--code-only` when the refresh should stay AST-only and defer non-code semantic re-analysis to a later `compile`. On tracked repos, code-only changes take that faster compile path automatically. Install `swarmvault hook install` when git checkouts and commits should trigger the same repo-aware code-only refresh automatically.
8. Compile with `swarmvault compile`, use `compile --max-tokens <n>` when the generated wiki must stay inside a bounded context budget, or use `compile --approve` when changes should go through the local review queue first.
9. Resolve staged work with `swarmvault review list|show|accept|reject` and `swarmvault candidate list|promote|archive`.
10. Ask questions with `swarmvault query "<question>"`. It saves durable answers into `wiki/outputs/` by default; add `--no-save` only for ephemeral checks. When an embedding provider is configured, query can merge semantic page matches into local search; `retrieval.rerank: true` lets the current `queryProvider` rerank the merged top hits before answering.
11. Use `swarmvault chat "question"` for a persisted multi-turn conversation over the compiled wiki, then resume or manage it with `swarmvault chat --resume <id>`, `chat --list`, and `chat --delete <id>`.
12. Build agent handoff bundles with `swarmvault context build "<goal>" --target <path-or-node> --budget <tokens>`. Use `--format markdown|json|llms` for the printed shape, and inspect `swarmvault context list|show|delete` for saved packs.
13. Start a task ledger with `swarmvault task start "<goal>" --target <path-or-node>`, update it with `swarmvault task update <id> --note|--decision|--changed-path|--context-pack`, finish it with `swarmvault task finish <id> --outcome <text>`, and use `swarmvault task resume <id> --format markdown|json|llms` for the next-agent handoff. `query`, `explore`, and `context build` can attach work with `--task <id>`; `--memory <id>` remains a compatibility alias.
14. Run `swarmvault export ai --out <dir>` when the compiled wiki should be handed to another agent or static crawler as `llms.txt`, full text, JSON-LD, manifest metadata, and per-page `.txt`/`.json` siblings.
15. Run `swarmvault doctor [--repair]` when the vault needs one health summary across graph, retrieval, review queues, watch state, migrations, managed sources, and task state before deeper troubleshooting.
16. Use `swarmvault explore "<question>" --steps <n>` for save-first multi-step research loops, or `--format report|slides|chart|image` when the artifact should be presentation-oriented.
17. Run `swarmvault lint` whenever the schema changed, artifacts look stale, or compile/query results drift. Set `profile.deepLintDefault: true` in `swarmvault.config.json` when the advisory deep-lint pass should be the default, and use `--no-deep` when you need a structural-only run. Add `--web` only when deep lint is enabled and a `webSearch.tasks.deepLintProvider` adapter is configured; web evidence is scoped to deep lint and does not change compile or query behavior.
18. Use `swarmvault mcp` when another agent or tool should browse, search, query, build context packs, manage tasks, and inspect vault or retrieval health from the vault through MCP.
19. Use `swarmvault graph share --post` when the user needs a quick copyable summary, `swarmvault graph share --svg [path]` when they need a 1200x630 visual card, `swarmvault graph share --bundle [dir]` when they need a portable share kit for posting, linking, or screenshotting, `swarmvault graph blast <target>` when they want reverse-import impact analysis, `swarmvault graph status [path]` or `swarmvault check-update [path]` when they need a read-only stale check before deciding between `graph update` and `compile`, `swarmvault graph stats` when they need lightweight counts and relation mix, `swarmvault graph validate [graph] --strict` when a graph artifact should be checked before export, merge, push, or publish workflows, `swarmvault graph update [path] --force` or `swarmvault update [path] --force` only when a large node/edge shrink is expected, `swarmvault watch [path] --once --code-only` when one repo root should be refreshed without writing watch config, `swarmvault graph query "<seed>" --context calls --evidence extracted` when traversal should focus on relation groups, evidence classes, node types, or languages, `swarmvault graph tree [--output <html>]` or `swarmvault tree [--output <html>]` when they need an interactive source/module/symbol tree with a node inspector, `swarmvault graph merge <graph...> --out <path>` or `swarmvault merge-graphs <graph...> --out <path>` when they need to combine SwarmVault or node-link graph JSON, `swarmvault graph cluster [--resolution <n>]` or `swarmvault cluster-only [vault]` when they need communities and graph report artifacts recomputed without re-ingest, `swarmvault graph serve` when the live workspace, health workbench, Memory dashboard, or bookmarklet clipper will help, `swarmvault diff` when they need a graph-level change summary against the last committed baseline, or `swarmvault graph export --html <output>` / `graph export --report <output>` when richer sharing will help. The live workbench exposes prioritized next actions, explicit capture modes, title/tag capture fields, context-pack/task token budgets, and action receipts; the bookmarklet sends page titles and selected text into the same capture path. `graph export` also supports `--html-standalone`, `--json`, `--obsidian`, `--canvas`, and `--neo4j` for lighter, Obsidian-native, or Neo4j-ready sharing.
## Working rules
- Prefer changing the schema before re-running compile when organization or grounding is wrong.
- Treat `wiki/` and `state/` as first-class outputs. Inspect them instead of trusting a single chat answer.
- Use saved chat transcripts and static AI exports as durable handoff artifacts when the user asks for continuity across sessions or tools.
- Prefer `wiki/graph/report.md`, `state/graph.json`, and saved wiki pages over ad hoc broad search when they already exist.
- Use `swarmvault graph status [path]` or `swarmvault check-update [path]` before refreshing a tracked repo when you need to know whether a code-only `graph update`/`update` is enough or a full `compile` is required.
- Use `swarmvault graph validate [graph] --strict` before sharing, merging, pushing, or publishing graph artifacts when reference integrity matters.
- Use `source add` for recurring files, directories, public GitHub repo roots, and docs hubs. Use `ingest` and `add` for deliberate one-off inputs.
- When the vault lives in a git repo, `ingest|compile|query --commit` can commit `wiki/` and `state/` changes immediately after the run.
- The default heuristic provider is a valid local/offline starting point. Add a model provider only when the user wants richer synthesis quality or optional capabilities such as embeddings, vision, image generation, or audio transcription. The recommended fully-local setup is Ollama + Gemma: `ollama pull gemma4` then set `providers.llm` to `{ type: "ollama", model: "gemma4" }` and point `tasks.compileProvider`, `tasks.queryProvider`, and `tasks.lintProvider` at it.
- Audio and video ingest need `tasks.audioProvider` to resolve to a provider that exposes `audio` capability. For a fully local setup, run `swarmvault provider setup --local-whisper --apply` โ installs the `local-whisper` provider, downloads a whisper.cpp ggml model into `~/.swarmvault/models/`, and points `tasks.audioProvider` at it. Local video also needs `ffmpeg`; public video URL ingest with `--video` needs `yt-dlp`. YouTube transcript ingest does not need a provider. Set `graph.communityResolution` when the user wants to pin community clustering instead of using the adaptive default and oversized/low-cohesion split pass, or run `swarmvault graph cluster --resolution <n>` for a one-off recompute.
- If an OpenAI-compatible backend cannot satisfy structured generation, reduce its declared capabilities instead of forcing every task through it.
- Keep raw sources immutable. Put corrections in schema, new sources, or saved outputs rather than manually rewriting generated provenance.
## Files and artifacts
- `swarmvault.schema.md`: vault-specific compile and query rules.
- `SWARMVAULT_OUT`: optional output root for generated artifact directories. When set, `raw/`, `wiki/`, `state/`, `agent/`, and `inbox/` are resolved under that directory.
- `raw/sources/` and `raw/assets/`: canonical source storage.
- `wiki/`: generated pages plus saved outputs.
- `wiki/outputs/source-briefs/`: saved onboarding briefs for managed sources.
- `wiki/outputs/source-sessions/`: resumable guided-session anchors plus question/answer history for one-source-at-a-time integration.
- `wiki/outputs/source-reviews/`: staged source-scoped review pages.
- `wiki/outputs/source-guides/`: staged source-integration guides for one-source-at-a-time workflows.
- `wiki/outputs/chat-sessions/`: persisted markdown transcripts for `swarmvault chat`.
- `wiki/dashboards/`: recent sources, reading log, timeline, source sessions, source guides, research map, contradiction, and open-question dashboards.
- `wiki/graph/share-card.md`, `wiki/graph/share-card.svg`, and `wiki/graph/share-kit/`: post-ready text, visual graph summaries, and a portable HTML-preview share bundle generated on compile.
- `wiki/exports/ai/`: default static AI handoff export with `llms.txt`, `llms-full.txt`, `graph.jsonld`, `manifest.json`, `ai-readme.md`, and optional per-page siblings.
- `wiki/context/`: markdown context-pack companions for agent kickoff, PR review, and handoff.
- `wiki/memory/`: task ledger index and markdown task pages.
- `wiki/code/`: module pages for ingested JavaScript, JSX, TypeScript (including `.mts`/`.cts`), TSX, Bash/shell script (with shebang-based detection for extensionless scripts), Python, Go, Rust, Java, Kotlin, Scala, Dart, Lua, Zig, C#, C, C++ (including `.c`/`.cc`/`.cpp`/`.cxx` and `.h`/`.hh`/`.hpp`/`.hxx`), PHP, Ruby, PowerShell (`.ps1`/`.psm1`/`.psd1`), Elixir (`.ex`/`.exs`), OCaml (`.ml`/`.mli`), Objective-C (`.m`/`.mm`), ReScript (`.res`/`.resi`), Solidity (`.sol`), Vue single-file components (`.vue`), Svelte single-file components (`.svelte`), HTML (`.html`/`.htm`), CSS, Julia (`.jl`), Verilog/SystemVerilog (`.v`/`.vh`/`.sv`/`.svh`), R (`.r`/`.R`), and SQL (`.sql`) sources. Julia and Verilog/SystemVerilog use packaged WASM grammars; JS/TS capture static and dynamic imports; SQL adds table/view symbols plus read/write/join/reference graph edges; R emits an explicit diagnostic until a safe packaged parser exists.
- `state/extracts/`: extracted markdown and JSON sidecars for PDF, the full Word family (`.docx`/`.docm`/`.dotx`/`.dotm`), RTF (`.rtf`), OpenDocument (ODT/ODP/ODS), EPUB, CSV/TSV, the full Excel family (`.xlsx`/`.xlsm`/`.xlsb`/`.xls`/`.xltx`/`.xltm`), the full PowerPoint family (`.pptx`/`.pptm`/`.potx`/`.potm`), Jupyter notebooks (`.ipynb`), BibTeX (`.bib`), Org-mode (`.org`), AsciiDoc (`.adoc`/`.asciidoc`), transcripts, Slack exports, email, calendar, audio transcripts, video transcripts, YouTube transcript captures, and image sources (`.png`/`.jpg`/`.jpeg`/`.gif`/`.webp`/`.bmp`/`.tif`/`.tiff`/`.svg`/`.ico`/`.heic`/`.heif`/`.avif`/`.jxl`), plus structured previews for config/data files (JSON/JSONC/JSON5/TOML/YAML/XML/INI/ENV/PROPERTIES/CFG/CONF) and content-sniffed text ingest for developer manifests (`package.json`, `Cargo.toml`, `go.mod`, `LICENSE`, `.gitignore`, `Dockerfile`, `Makefile`, and similar plaintext files).
- `state/code-index.json`: repo-aware code aliases and local import resolution data.
- `wiki/projects/`: project rollups over canonical pages.
- `wiki/candidates/`: staged concept and entity pages awaiting promotion.
- `state/graph.json`: compiled graph.
- `state/context-packs/`: saved JSON context-pack artifacts with citations, token-budget accounting, included items, and omitted items.
- `state/chat-sessions/`: saved structured chat state for resumable wiki conversations.
- `state/memory/tasks/`: saved JSON task ledger records with decisions, changed paths, outcomes, and follow-ups.
- `state/retrieval/`: local retrieval index, SQLite FTS shard, and manifest.
- `state/sources.json` and `state/sources/<id>/`: managed-source registry entries plus working sync state.
- `state/approvals/`: staged review bundles from `compile --approve`.
- `state/sessions/`: canonical session artifacts for compile, query, explore, lint, watch, review, and candidate actions.
- `state/jobs.ndjson`: watch-mode run log.
## Agent integration
- `swarmvault install --agent codex|claude|cursor|goose|pi|gemini|opencode|aider|copilot|trae|claw|droid|kiro|hermes|antigravity|vscode|amp|augment|adal|bob|cline|codebuddy|command-code|continue|cortex|crush|deepagents|firebender|iflow|junie|kilo-code|kimi|kode|mcpjam|mistral-vibe|mux|neovate|openclaw|openhands|pochi|qoder|qwen-code|replit|roo-code|trae-cn|warp|windsurf|zencoder` installs agent-specific rules into the current project. Agents in the extended roster receive a project-level skill bundle at the tool's conventional skills directory.
- `swarmvault init`, `quickstart`, `scan`, and `clone` leave project-local agent rule files alone by default. Use `install --agent` explicitly, or set `agents` in `swarmvault.config.json` and pass `--install-agent-rules` to initialization or scan commands when configured installs are intentional.
- `swarmvault install --agent codex|claude|opencode|gemini|copilot --hook` installs graph-first hook or plugin support for the agents that expose project hook APIs.
- `swarmvault install --agent aider` installs `CONVENTIONS.md` and wires `.aider.conf.yml` to read it when that config is valid YAML.
- `swarmvault install --agent antigravity` writes `.agents/rules/swarmvault.md` and `.agents/workflows/swarmvault.md`; reinstall removes older fully managed `.agent/` files.
- `swarmvault mcp` exposes tools and resources for page search, page reads, source listing, graph stats, graph clustering refresh, community lookup, hyperedges, query, context-pack build/read/list, task start/update/finish/list/read/resume, compatibility memory tasks, vault doctor, retrieval status/rebuild/doctor, ingest, compile, and lint.
## Defaults to preserve
- Keep raw source material immutable under `raw/`.
- Save useful answers unless the user explicitly wants ephemeral output.
- Prefer reviewable flows such as `compile --approve`, `review`, and `candidate` when a change should not activate silently.
- Treat provider setup as part of serious vault operation. If only `heuristic` is configured, say so clearly.
- When a vault uses the `profile` block in `swarmvault.config.json`, respect it as the deterministic behavior layer. `swarmvault.schema.md` still defines the human intent layer.
don't have the plugin yet? install it then click "run inline in claude" again.