Yum NoteBook — a local-first NotebookLM alternative for AI agents, built for real source capture. Ingest a web URL, YouTube video, or screenshot, then genera...
---
name: yumnb
description: "Yum NoteBook — a local-first NotebookLM alternative for AI agents, built for real source capture. Ingest a web URL, YouTube video, or screenshot, then generate (1) AI summary, (2) dual-host talk-show MP3 (edge-tts), (3) slide deck (image/table/bullet/flowchart) and optionally upload artifacts to OneDrive/Google Drive/S3/etc. and post a notification to Slack/Discord/Teams. Language and cloud destination are user-configurable; defaults to English with male+female English hosts. Use when: yumnb, study notes, summarize this link, turn this video into notes, help me understand this article."
argument-hint: "URL | YouTube link | path to screenshot (and optional --title / --language)"
---
# Yum NoteBook (yumnb) — Source → Summary + Talk-show + Slides
> **A local-first NotebookLM alternative for AI agents, built for real source capture.**
## What It Does
Given a source (URL / YouTube / screenshot / raw text), creates **one folder per
request** under `<output_dir>/<YYYYMMDD-HHMM-slug>/` containing:
1. **`source/`** — raw material (downloaded HTML, transcript, screenshot copy …)
2. **`summary.md`** — AI summary (one-liner / key points / facts / takeaways)
3. **`talkshow.txt` + `talkshow.mp3`** — dual-host script + MP3 (edge-tts)
4. **`deck.pptx`** — slide deck (bullets, tables, flow, images, summary)
5. **`links.json`** — record of what was generated and any share links
This “one request = one folder” model is intentional: many such folders can
accumulate into a local knowledge base that can be read, searched, narrated,
and presented later.
If a webhook is configured, a notification is posted to Slack / Discord /
Teams Workflow. If `deliver.provider` is configured, yumnb can also push the
finished outputs directly to an IM/chat surface through OpenClaw / Hermes
(Telegram / Discord / Teams / Slack / etc.).
You can position yumnb as a local-first, polite alternative to NotebookLM:
it keeps notebooks and generated artifacts as ordinary local files by default,
then only uploads or delivers them if you explicitly configure that.
## Two Ways to Run
### A. Fully-automatic (built-in AI provider)
Configure `ai.provider` in `config.yaml` (openai / anthropic / gemini /
ollama / cli) and run:
```bash
python -m yumnb auto "<URL or path>" [--title "short name"]
```
This runs ingest → AI summary → AI slide-plan → TTS → PPT → publish in one
shot.
### B. Step-by-step (agent-driven — you bring your own LLM)
If you're driving this from an agent CLI (GitHub Copilot CLI, Claude Code,
Cursor, Aider, …), set `ai.provider: none` and call the subcommands
individually. The agent reads the source, writes `summary.md` and
`deck.json` itself, then asks yumnb to render TTS / PPT / publish.
```bash
# 1) Pull raw material
python -m yumnb ingest "<URL>" [--title "..."]
# → prints the note folder path
# 2) (Agent writes <folder>/summary.md following the schema in README)
# 3) Render dual-host MP3 from a talkshow script the agent wrote
python -m yumnb tts "<folder>/talkshow.txt" --output "<folder>/talkshow.mp3"
# 4) Render PPT from a deck.json the agent wrote
python -m yumnb ppt "<folder>/deck.json" --output "<folder>/deck.pptx"
# 5) Finalize + optional webhook / IM delivery
python -m yumnb publish "<folder>"
```
## Schemas the Agent Writes
### `summary.md`
```markdown
# <title>
> **Source**: <url or file>
> **Type**: youtube|url|image|text
> **Length**: <duration or word count>
## 🎯 One-line summary
## 📌 Key points (3-5)
## 🔑 Facts / data
## 💡 Takeaways
## 🤔 Open questions
```
### `deck.json` (rendered to `deck.pptx`)
```json
{
"title": "Deck title",
"subtitle": "Source / date",
"slides": [
{"type": "title", "title": "...", "subtitle": "..."},
{"type": "bullets", "title": "...", "bullets": ["...", "..."]},
{"type": "table", "title": "...", "headers": ["A","B"], "rows": [["1","2"]]},
{"type": "flow", "title": "...", "steps": ["Step 1","Step 2","Step 3"]},
{"type": "image", "title": "...", "image_path": "/abs/path.png", "caption": "..."},
{"type": "two_column", "title": "...", "left": "bullet text", "image_path": "..."},
{"type": "summary", "title": "...", "text": "..."}
]
}
```
Recommended: **5–12 slides** — title, 1-2 overview, 3-6 main (bullets/table/
flow/image), 1 summary. Reuse images from `source/` (YouTube thumbnail,
HTML hero image, original screenshot).
### `talkshow.txt`
Lines tagged with `[<SpeakerName>]` where `<SpeakerName>` matches a voice
configured in `config.yaml` → `tts.voices`. Example:
```
[HostA] Welcome to the show — today we're chewing on…
[HostB] And by chewing I mean ruthlessly mocking, right?
[HostA] Pretty much.
```
## Prerequisites
- Python 3.9+
- Preferred first-run: `./scripts/bootstrap.sh`
- Or manual: `pip install -r requirements.txt`
- Plus the AI SDK matching your provider (only one): `openai` / `anthropic` /
`google-generativeai` / `ollama` — or none if you use `provider: cli` /
`none`.
## Notes
- `edge-tts` uses Microsoft's free online voices. No API key required.
- The intro/outro jingle is generated procedurally in pure Python — no
external assets bundled.
- YouTube ingest order is: yt-dlp manual subtitles → yt-dlp auto subtitles →
`youtube-transcript-api` fallback → description-only fallback.
- This skill carries no platform/tenant/organization-specific defaults.
All endpoints and credentials come from `config.yaml` or environment
variables (`YUMNB_*`, `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, etc.).
- Direct IM delivery is channel-agnostic: configure `deliver.provider:
openclaw` (or `hermes`) plus `deliver.openclaw.channel` + `target` to send
the finished note to Telegram, Discord, Teams, Slack, and other supported
surfaces via the local bridge.
## Language
`config.yaml` → `language` (default `en`). Influences both AI prompt
language and the default TTS voice pair. Override per-run with
`--language en|zh|ja|es|fr|de|...`.
Built-in default voice pairs (male + female, edge-tts):
| `language` | HostA (male) | HostB (female) |
| ---------- | ------------------------- | ------------------------- |
| `en` | `en-US-AndrewNeural` | `en-US-AvaNeural` |
| `zh` | `zh-CN-YunyangNeural` 云飞 | `zh-CN-XiaoxiaoNeural` 小晓 |
| `ja` | `ja-JP-KeitaNeural` | `ja-JP-NanamiNeural` |
| `es` | `es-ES-AlvaroNeural` | `es-ES-ElviraNeural` |
| `fr` | `fr-FR-HenriNeural` | `fr-FR-DeniseNeural` |
| `de` | `de-DE-ConradNeural` | `de-DE-KatjaNeural` |
Override any pair (or add new languages) under `tts.language_voices`.
Setting `tts.voices` directly always wins.
## Cloud upload (OneDrive / Google Drive / S3 / Dropbox / …)
`config.yaml` → `upload.provider: rclone` makes `publish` upload the
generated mp3 / pptx / summary to your configured cloud and inline the
shareable URLs in `links.json` and the notification payload — so users
get one-click mp3 + ppt links instead of local `file://` URIs.
```yaml
upload:
provider: rclone
rclone:
remote: "onedrive:yumnb" # or gdrive:yumnb, s3:bucket/yumnb, etc.
share: true
```
Set it up once with `rclone config` (see https://rclone.org). yumnb
delegates everything to rclone so the same skill works with every
backend rclone supports.
don't have the plugin yet? install it then click "run inline in claude" again.