---
name: minutes-taker
display_name: AI Meeting Minutes
description: Convert meeting transcripts into structured minutes with decision tracking, task assignment, three-tier summaries, and cross-meeting correlation.
author: harry
version: 1.1.0
license: MIT
type: skill
tags:
- meeting-minutes
- productivity
- documentation
- collaboration
model: deepseek
created_at: 2026-06-14
---
# AI Meeting Minutes (minutes-taker)
Convert meeting transcripts or other text into structured, actionable minutes. The skill extracts decisions, action items, risks, and key dates, generates three-tier summaries (one-liner → three-paragraph → full), and tracks decisions and todos across meetings.
**Core strengths**: Decision chain tracking across meetings, todo lifecycle management (extract → assign → track → remind), and three-level summaries for different consumption contexts.
**Audio support**: ASR (speech-to-text) is available through pluggable backends. See [ASR Backends](#asr-backends) below.
## Quick Start
```bash
clawhub run minutes-taker --input-type text \
--content "Zhang: Today we discuss the payment module. Li: I suggest using Stripe. Wang: Agreed. Zhang: OK, Li to deliver the proposal by Friday."
```
This produces structured minutes with decisions, todos, and a three-level summary.
## Features
| Feature | Description |
|---------|------------|
| **Input Modes** | Plain text, chat logs, structured forms. Audio via ASR backend (see below) |
| **ASR (Speech-to-Text)** | Pluggable: whisper (offline) or SpeechRecognition (Google API). No diarization by default |
| **6 Extractors** | Decisions, Todos, Dates, Risks, Ideas, Data Points |
| **3-Level Summaries** | L1 one-liner, L2 three-paragraph, L3 full minutes |
| **Decision Chain** | Track decisions across meetings with evolution history |
| **Todo Lifecycle** | Auto-extract, assign, deadline parse, priority infer, track across meetings |
| **Cross-meeting Links** | Detect recurring topics, link related decisions and todos |
| **Multiple Outputs** | Markdown, text, HTML, feishu, notion (via extensions) |
## ASR Backends
The `asr.py` module auto-detects available backends at runtime:
| Backend | Quality | Offline | Requirements |
|---------|---------|---------|-------------|
| **whisper** (openai-whisper) | ⭐⭐⭐ High | ✅ Yes | `pip install openai-whisper` |
| **speech_recognition** (Google API) | ⭐⭐ Medium | ❌ No | `SpeechRecognition` (pre-installed; `pip install SpeechRecognition` if missing) |
Transcription is attempted in priority order: whisper → speech_recognition.
Speaker diarization is NOT performed; the system labels all text as a single speaker.
For multi-speaker audio, provide a participants list or use post-processing.
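
The priority-order detection described above can be sketched as follows. This is a minimal illustration, not the actual contents of `asr.py`: the names `BACKEND_MODULES` and `pick_backend` are hypothetical, and the sketch only checks whether each package is importable.

```python
from importlib.util import find_spec
from typing import Callable, Optional

# Priority order from the table above: (backend label, Python import name).
# `openai-whisper` is imported as `whisper`; `SpeechRecognition` as `speech_recognition`.
BACKEND_MODULES = [
    ("whisper", "whisper"),
    ("speech_recognition", "speech_recognition"),
]


def _is_installed(module_name: str) -> bool:
    """Return True if the module can be located, without importing it."""
    return find_spec(module_name) is not None


def pick_backend(is_installed: Callable[[str], bool] = _is_installed) -> Optional[str]:
    """Return the first available backend in priority order, or None if none is found."""
    for backend, module_name in BACKEND_MODULES:
        if is_installed(module_name):
            return backend
    return None
```

Passing the availability check as a parameter keeps the selection logic testable without installing either package; a `None` result corresponds to the "no backend available" case.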
## Input Format
```json
{
"input": {
"type": "text",
"content": "Zhang: Today we discuss...",
"format": "chat_log"
},
"meeting_context": {
"title": "Q2 Product Roadmap Review",
"date": "2026-06-14",
"participants": [
{"name": "Zhang San", "role": "PM"},
{"name": "Li Si", "role": "Frontend Dev"}
],
"agenda": ["Payment module", "Growth tools"]
},
"options": {
"summary_level": "full",
"extract_todos": true,
"extract_decisions": true,
"output_format": "markdown"
}
}
```
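
Before submitting a payload in this shape, a quick structural check can catch obvious mistakes. The sketch below is an assumption about which fields matter (`input.type`, `input.content`, and a `name` for each participant); the skill itself may enforce different or additional rules, and `validate_payload` is a hypothetical helper.

```python
from typing import Any


def validate_payload(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the payload looks usable."""
    source = payload.get("input")
    if not isinstance(source, dict):
        return ["missing 'input' object"]

    problems: list[str] = []
    if not source.get("type"):
        problems.append("input.type is required")

    content = source.get("content")
    if not isinstance(content, str) or not content.strip():
        problems.append("input.content must be a non-empty string")

    # meeting_context is treated as optional here (an assumption).
    participants = payload.get("meeting_context", {}).get("participants", [])
    for index, person in enumerate(participants):
        if not isinstance(person, dict) or not person.get("name"):
            problems.append(f"participants[{index}] needs a 'name'")
    return problems
```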
## Sample Prompts
### Prompt 1: Text-based Meeting Minutes (Quick Start)
```bash
clawhub run minutes-taker --input-type text \
--content "Zhang: Today we discuss the payment module. Li: I suggest using Stripe. Wang: Agreed. Zhang: OK, Li to deliver the proposal by Friday." \
--title "Payment Module Discussion" \
--participants "Zhang San (PM), Li Si (FE), Wang Wu (BE)"
```
**Expected output**: Structured minutes with:
- 📌 Summary: Payment module discussion → Stripe chosen, Li to deliver
- ✅ Todos: Li Si - deliver Stripe proposal by Friday (🔴 High)
- 📊 Decisions: Use Stripe SDK (Li proposed, unanimous)
- L1: "Decided to use Stripe for payment module; Li to submit proposal by Friday"
- L2/L3: Full expanded minutes
### Prompt 2: Audio File Processing
```bash
clawhub run minutes-taker --input ./meeting-2026-06-14.m4a \
--title "Q2 Product Roadmap Review" \
--participants @team.json
```
**Expected output**: Transcribed text with structured minutes (decisions, todos, risks, summary).
Requires an ASR backend (see [ASR Backends](#asr-backends)). If no backend is available, the skill returns an error that explains how to install one.
### Prompt 3: Todo Tracking
```bash
clawhub run minutes-taker todos --since last_meeting
```
**Expected output**:
```
📋 Previous Meeting Todo Tracking (2026-06-07)
✅ Complete (3/5):
✅ Li Si · Payment frontend tech proposal (PR #2341 submitted)
⏳ In Progress (1/5):
⏳ Zhao Liu · Growth tool prototype (60%, due 06/16)
❌ Overdue (1/5):
❌ Zhang San · Competitive analysis report (3 days overdue)
```
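
The three buckets in this report (complete, in progress, overdue) can be derived from a completion flag and a due date. The following is a rough sketch under that assumption; the `Todo` fields and function names are hypothetical and do not reflect the real `todo_tracker.py` data model.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Todo:
    owner: str
    title: str
    due: Optional[date] = None
    done: bool = False


def status_of(todo: Todo, today: date) -> str:
    """Classify a todo as 'complete', 'overdue', or 'in_progress'."""
    if todo.done:
        return "complete"
    if todo.due is not None and todo.due < today:
        return "overdue"
    return "in_progress"


def group_by_status(todos: list[Todo], today: date) -> dict[str, list[Todo]]:
    """Bucket todos by status, preserving input order inside each bucket."""
    groups: dict[str, list[Todo]] = {"complete": [], "in_progress": [], "overdue": []}
    for todo in todos:
        groups[status_of(todo, today)].append(todo)
    return groups
```

A todo with no due date is never counted as overdue in this sketch; whether the skill treats such items differently is not stated here.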
### Prompt 4: Decision Chain
```bash
clawhub run minutes-taker decisions --topic "Payment Module"
```
**Expected output**:
```
📜 Decision Chain: Payment Module Refactor
06/01 Weekly: "We need to refactor payment module" (Zhang)
06/07 Tech Review: Chose Stripe SDK (Wang proposed, 3:2 passed)
06/14 Roadmap Review: Launch date set to July 20, added stress test → [This meeting]
```
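
A decision chain like this one is essentially a per-topic, date-ordered view over stored decisions. The sketch below assumes each stored decision carries a topic label and a meeting date, and it matches topics by exact case-insensitive comparison; the skill's actual matching in `decision_chain.py` may be fuzzier. `DecisionEvent` and `decision_chain` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DecisionEvent:
    meeting_date: date
    meeting: str
    topic: str
    text: str


def decision_chain(events: list[DecisionEvent], topic: str) -> list[DecisionEvent]:
    """Return the events for one topic (case-insensitive match), oldest first."""
    wanted = topic.casefold()
    matches = [event for event in events if event.topic.casefold() == wanted]
    return sorted(matches, key=lambda event: event.meeting_date)
```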
### Prompt 5: Three-Level Summary
```bash
clawhub run minutes-taker --input-type text --content "$(cat transcript.txt)" \
--summary-level three_para --output-format text
```
**Expected output**: Concise three-paragraph summary ready for WeChat/email.
## First-Success Path
**Goal**: Structured minutes from a four-turn dialogue within 30 seconds.
```
Step 1: clawhub install minutes-taker
Step 2: clawhub run minutes-taker --input-type text \
--content "Zhang: Today we discuss payment. Li: I suggest Stripe. Wang: OK. Zhang: Li, make proposal by Friday."
Step 3: Internal pipeline:
a. input.py parses text, identifies speakers
b. segmenter.py (single topic, no split needed)
c. extractor.py extracts: 1 decision, 1 todo, 0 risks
d. summarizer.py generates L1/L2/L3 summaries
e. formatter.py renders Markdown
Step 4: User sees structured minutes with decision + todo
Step 5: Next step: try audio → requires ASR backend (see ASR Backends section)
```
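
Step 3a, identifying speakers in inline `Name: utterance` text, can be approximated with a regular expression. This is only a sketch of the idea, not the parser in `input.py`: it assumes a speaker label is a single capitalized word followed by a colon and a space, appearing at the start of the text or right after sentence-ending punctuation. `SPEAKER_LABEL` and `split_turns` are hypothetical names.

```python
import re

# A label is a capitalized word plus ": ", at the start of the text
# or directly after ". ", "! ", or "? " (an assumption about the transcript style).
SPEAKER_LABEL = re.compile(r"(?:^|(?<=[.!?] ))([A-Z][A-Za-z]*): ")


def split_turns(text: str) -> list[tuple[str, str]]:
    """Split inline chat text into (speaker, utterance) pairs in order."""
    matches = list(SPEAKER_LABEL.finditer(text))
    turns = []
    for index, match in enumerate(matches):
        # An utterance runs from the end of its label to the start of the next label.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(text)
        turns.append((match.group(1), text[match.end():end].strip()))
    return turns
```

Applied to the Step 2 content, this yields four turns, and the final one keeps "Li, make proposal by Friday." intact because "Li," is not followed by a colon.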
## Architecture
```
minutes-taker/
├── SKILL.md
├── scripts/
│ ├── asr.py # ASR audio transcription (pluggable backends)
│ ├── input.py # Input parsing (audio/text/chat)
│ ├── segmenter.py # Topic segmentation
│ ├── extractor.py # Decision/todo/risk/idea/data extraction
│ ├── summarizer.py # Three-level summary generation
│ ├── decision_chain.py # Cross-meeting decision tracking
│ ├── todo_tracker.py # Todo lifecycle management
│ ├── formatter.py # Minutes formatting
│ └── storage.py # Local minutes storage & retrieval
└── references/
└── examples.json # Sample inputs/outputs
```
## Pipeline
```
Input (text/audio/chat)
│
▼
input.py ──► Parsed Input (content + speakers)
│
▼
segmenter.py ──► Topic Segments
│
▼
extractor.py ──► Decisions, Todos, Risks, Ideas, Dates, Data
│
▼
summarizer.py ──► L1, L2, L3 Summaries
│
▼
formatter.py ──► Markdown / Text / HTML
│
▼
storage.py ──► Saved to ~/.openclaw/data/minutes-taker/
```
## Error Handling
| Code | Scenario | Action |
|------|----------|--------|
| E001 | Audio file not found | Error + path check |
| E002 | Unsupported audio format | List supported formats + convert via ffmpeg |
| E003 | ASR processing failure | Error + offer manual text input; list available backends |
| E004 | LLM timeout | Fall back to basic template |
| E005 | LLM format error | Retry 1x, then return raw text |
| E006 | Export API unavailable | Save locally + retry hint |
| E007 | History data unavailable | Skip cross-refs, generate current |
| E008 | Empty participants list | Auto-detect from content |
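
The E005 policy (retry once, then return the raw text) could look roughly like the sketch below. It assumes the model is asked to reply in JSON, which this document does not confirm; `parse_with_retry` and the `call_llm` callable are hypothetical.

```python
import json
from typing import Any, Callable


def parse_with_retry(call_llm: Callable[[], str], retries: int = 1) -> dict[str, Any]:
    """Call the model and parse its reply as JSON.

    On a format error, retry up to `retries` times, then fall back to
    returning the last raw reply unparsed.
    """
    raw = ""
    for _ in range(retries + 1):
        raw = call_llm()
        try:
            return {"ok": True, "data": json.loads(raw)}
        except json.JSONDecodeError:
            continue
    return {"ok": False, "raw": raw}
```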
## Security
- **ASR backends vary**: whisper is fully offline (no network upload); Google Speech API sends audio to Google servers
- **Privacy notice**: When using speech_recognition backend, audio data is transmitted to Google for transcription
- **Local storage**: Minutes stored at `~/.openclaw/data/minutes-taker/` with configurable directory
- **File permissions**: Minutes files default to 600 (user-only read/write)
- **Sensitive detection**: Marks potential sensitive content (salary, HR topics) with ⚠️
- **LLM context splitting**: Sends meeting text by topic segments, not the full transcript at once
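
As an illustration of the file-permissions point, a file can be created with mode 600 on POSIX systems as sketched below. This is one possible approach rather than the code in `storage.py`, and `write_private` is a hypothetical helper.

```python
import os
from pathlib import Path


def write_private(path: Path, text: str) -> None:
    """Create or overwrite `path` so that only the owner can read or write it."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The mode argument applies only when the file is newly created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    # Tighten permissions on a file that already existed with a looser mode.
    os.chmod(path, 0o600)
```

Opening with an explicit mode avoids a window in which a freshly created file is readable by other users before `chmod` runs.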
## Dependencies
- **Python 3.10+**
- **ffmpeg** (for audio conversion)
- **Optional**: `openai-whisper` for local offline ASR
- **Pre-installed**: `SpeechRecognition` for Google Speech API ASR