---
name: minutes-taker
display_name: AI Meeting Minutes
description: Convert meeting transcripts into structured minutes with decision tracking, task assignment, three-tier summaries, and cross-meeting correlation.
author: harry
version: 1.1.0
license: MIT
type: skill
tags:
  - meeting-minutes
  - productivity
  - documentation
  - collaboration
model: deepseek
created_at: 2026-06-14
---

# AI Meeting Minutes (minutes-taker)

Convert meeting transcripts or plain text into structured, actionable minutes. The skill extracts decisions, action items, risks, and key dates, generates three-tier summaries (one-liner → three-paragraph → full), and tracks decisions and todos across meetings.

**Core strengths**: Decision chain tracking across meetings, todo lifecycle management (extract → assign → track → remind), and three-level summaries for different consumption contexts.

**Audio support**: ASR (speech-to-text) is available via plugin backends. See [ASR Backends](#asr-backends) below.

## Quick Start

```bash
clawhub run minutes-taker --input-type text \
  --content "Zhang: Today we discuss the payment module. Li: I suggest using Stripe. Wang: Agreed. Zhang: OK, Li to deliver the proposal by Friday."
```

This produces structured minutes with decisions, todos, and a three-level summary.

## Features

| Feature | Description |
|---------|------------|
| **Input Modes** | Plain text, chat logs, structured forms. Audio via ASR backend (see below) |
| **ASR (Speech-to-Text)** | Pluggable: whisper (offline) or SpeechRecognition (Google API). No speaker diarization |
| **6 Extractors** | Decisions, Todos, Dates, Risks, Ideas, Data Points |
| **3-Level Summaries** | L1 one-liner, L2 three-paragraph, L3 full minutes |
| **Decision Chain** | Track decisions across meetings with evolution history |
| **Todo Lifecycle** | Auto-extract, assign, deadline parse, priority infer, track across meetings |
| **Cross-meeting Links** | Detect recurring topics, link related decisions and todos |
| **Multiple Outputs** | Markdown, text, HTML; Feishu and Notion via extensions |

## ASR Backends

The `asr.py` module auto-detects available backends at runtime:

| Backend | Quality | Offline | Requirements |
|---------|---------|---------|-------------|
| **whisper** (openai-whisper) | ⭐⭐⭐ High | ✅ Yes | `pip install openai-whisper` |
| **speech_recognition** (Google API) | ⭐⭐ Medium | ❌ No | `pip install SpeechRecognition` (pre-installed) |

Transcription is attempted in priority order: whisper → speech_recognition.
Speaker diarization is NOT performed; transcribed audio is labeled as a single speaker.
For multi-speaker audio, provide a participants list or attribute speakers in post-processing.
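
The priority-order detection described above can be sketched as follows. This is a minimal illustration, not the actual contents of `asr.py`: the function names `pick_backend` and `_is_installed` are hypothetical, and the sketch assumes a backend counts as available when its import name (`whisper` or `speech_recognition`) can be resolved.

```python
from importlib.util import find_spec
from typing import Callable, Optional, Sequence

# Priority order from this section: whisper first, then speech_recognition.
# These are the import names of the packages, not the pip package names.
BACKEND_PRIORITY = ("whisper", "speech_recognition")


def _is_installed(module_name: str) -> bool:
    """Return True if the module can be found in this environment."""
    return find_spec(module_name) is not None


def pick_backend(
    priority: Sequence[str] = BACKEND_PRIORITY,
    probe: Callable[[str], bool] = _is_installed,
) -> Optional[str]:
    """Return the first available backend name, or None if none is found.

    `probe` is injectable (hypothetical design choice) so the selection
    logic can be exercised without installing either package.
    """
    for name in priority:
        if probe(name):
            return name
    return None
```

A `None` result corresponds to the "no backend available" case, where the skill is expected to report an error rather than attempt transcription.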

## Input Format

```json
{
  "input": {
    "type": "text",
    "content": "Zhang: Today we discuss...",
    "format": "chat_log"
  },
  "meeting_context": {
    "title": "Q2 Product Roadmap Review",
    "date": "2026-06-14",
    "participants": [
      {"name": "Zhang San", "role": "PM"},
      {"name": "Li Si", "role": "Frontend Dev"}
    ],
    "agenda": ["Payment module", "Growth tools"]
  },
  "options": {
    "summary_level": "full",
    "extract_todos": true,
    "extract_decisions": true,
    "output_format": "markdown"
  }
}
```
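
For callers that assemble this payload programmatically, a small helper along these lines may be convenient. It is only a sketch: `build_request` is a hypothetical name, the field names mirror the JSON above, and no validation beyond a non-empty transcript is assumed.

```python
import json
from typing import Optional


def build_request(
    content: str,
    title: str,
    date: str,
    participants: list[dict],
    agenda: Optional[list[str]] = None,
    summary_level: str = "full",
    output_format: str = "markdown",
) -> dict:
    """Assemble a request dict shaped like the JSON example above."""
    if not content.strip():
        raise ValueError("content must not be empty")
    return {
        "input": {"type": "text", "content": content, "format": "chat_log"},
        "meeting_context": {
            "title": title,
            "date": date,
            "participants": participants,
            "agenda": agenda or [],
        },
        "options": {
            "summary_level": summary_level,
            "extract_todos": True,
            "extract_decisions": True,
            "output_format": output_format,
        },
    }


request = build_request(
    content="Zhang: Today we discuss...",
    title="Q2 Product Roadmap Review",
    date="2026-06-14",
    participants=[{"name": "Zhang San", "role": "PM"}],
    agenda=["Payment module", "Growth tools"],
)
payload = json.dumps(request, ensure_ascii=False, indent=2)
```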

## Sample Prompts

### Prompt 1: Text-based Meeting Minutes (Quick Start)
```bash
clawhub run minutes-taker --input-type text \
  --content "Zhang: Today we discuss the payment module. Li: I suggest using Stripe. Wang: Agreed. Zhang: OK, Li to deliver the proposal by Friday." \
  --title "Payment Module Discussion" \
  --participants "Zhang San (PM), Li Si (FE), Wang Wu (BE)"
```
**Expected output**: Structured minutes with:
- 📌 Summary: Payment module discussion → Stripe chosen, Li to deliver
- ✅ Todos: Li Si - deliver Stripe proposal by Friday (🔴 High)
- 📊 Decisions: Use Stripe SDK (Li proposed, unanimous)
- L1: "Decided to use Stripe for payment module; Li to submit proposal by Friday"
- L2/L3: Full expanded minutes

### Prompt 2: Audio File Processing
```bash
clawhub run minutes-taker --input ./meeting-2026-06-14.m4a \
  --title "Q2 Product Roadmap Review" \
  --participants @team.json
```
**Expected output**: Transcribed text with structured minutes (decisions, todos, risks, summary).
Requires an ASR backend (see [ASR Backends](#asr-backends)). If no backend is available, the skill returns an error with installation guidance.

### Prompt 3: Todo Tracking
```bash
clawhub run minutes-taker todos --since last_meeting
```
**Expected output**:
```
📋 Previous Meeting Todo Tracking (2026-06-07)
✅ Complete (3/5):
  ✅ Li Si · Payment frontend tech proposal (PR #2341 submitted)
⏳ In Progress (1/5):
  ⏳ Zhao Liu · Growth tool prototype (60%, due 06/16)
❌ Overdue (1/5):
  ❌ Zhang San · Competitive analysis report (3 days overdue)
```
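
The three buckets in this report could be derived roughly as shown below. The `Todo` record and the status rule (done, else overdue if past the due date, else in progress) are assumptions made for illustration; the real `todo_tracker.py` may store and classify todos differently.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Todo:
    # Hypothetical record shape, chosen to match the report above.
    owner: str
    task: str
    due: date
    done: bool = False


def status_of(todo: Todo, today: date) -> str:
    """Classify a todo as complete, overdue, or in_progress."""
    if todo.done:
        return "complete"
    return "overdue" if todo.due < today else "in_progress"


def days_overdue(todo: Todo, today: date) -> int:
    """Days past the due date (0 if not yet due or already done)."""
    if todo.done:
        return 0
    return max((today - todo.due).days, 0)


def group_by_status(todos: list[Todo], today: date) -> dict[str, list[Todo]]:
    """Bucket todos by status, preserving input order within each bucket."""
    groups: dict[str, list[Todo]] = {"complete": [], "in_progress": [], "overdue": []}
    for todo in todos:
        groups[status_of(todo, today)].append(todo)
    return groups
```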

### Prompt 4: Decision Chain
```bash
clawhub run minutes-taker decisions --topic "Payment Module"
```
**Expected output**:
```
📜 Decision Chain: Payment Module Refactor
06/01 Weekly: "We need to refactor payment module" (Zhang)
06/07 Tech Review: Chose Stripe SDK (Wang proposed, 3:2 passed)
06/14 Roadmap Review: Launch date set to July 20, added stress test → [This meeting]
```
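
One plausible way to model a decision chain is a per-topic list of entries sorted by meeting date. The sketch below assumes exactly that; `DecisionEntry` and `build_chains` are hypothetical names, and matching topics by case-insensitive exact name is a simplification of whatever linking `decision_chain.py` actually performs.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DecisionEntry:
    # Hypothetical record shape for one decision made in one meeting.
    topic: str
    meeting: str
    when: date
    text: str


def build_chains(entries: list[DecisionEntry]) -> dict[str, list[DecisionEntry]]:
    """Group entries by normalized topic and order each chain chronologically."""
    chains: dict[str, list[DecisionEntry]] = defaultdict(list)
    for entry in entries:
        chains[entry.topic.strip().casefold()].append(entry)
    for chain in chains.values():
        chain.sort(key=lambda e: e.when)
    return dict(chains)
```

Keeping the chain as an ordered list makes the "evolution history" view a straightforward iteration from the earliest entry to the most recent one.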

### Prompt 5: Three-Level Summary
```bash
clawhub run minutes-taker --input-type text --content "$(cat transcript.txt)" \
  --summary-level three_para --output-format text
```
**Expected output**: Concise three-paragraph summary ready for WeChat/email.

## First-Success Path

**Goal**: Structured minutes from a four-turn dialogue within 30 seconds.

```
Step 1: clawhub install minutes-taker
Step 2: clawhub run minutes-taker --input-type text \
  --content "Zhang: Today we discuss payment. Li: I suggest Stripe. Wang: OK. Zhang: Li, make proposal by Friday."
Step 3: Internal pipeline:
  a. input.py parses text, identifies speakers
  b. segmenter.py (single topic, no split needed)
  c. extractor.py extracts: 1 decision, 1 todo, 0 risks
  d. summarizer.py generates L1/L2/L3 summaries
  e. formatter.py renders Markdown
Step 4: User sees structured minutes with decision + todo
Step 5: Next step: try audio → requires ASR backend (see ASR Backends section)
```

## Architecture

```
minutes-taker/
├── SKILL.md
├── scripts/
│   ├── asr.py             # ASR audio transcription (pluggable backends)
│   ├── input.py           # Input parsing (audio/text/chat)
│   ├── segmenter.py       # Topic segmentation
│   ├── extractor.py       # Decision/todo/risk/idea/data extraction
│   ├── summarizer.py      # Three-level summary generation
│   ├── decision_chain.py  # Cross-meeting decision tracking
│   ├── todo_tracker.py    # Todo lifecycle management
│   ├── formatter.py       # Minutes formatting
│   └── storage.py         # Local minutes storage & retrieval
└── references/
    └── examples.json       # Sample inputs/outputs
```

## Pipeline

```
Input (text/audio/chat)
    │
    ▼
input.py ──► Parsed Input (content + speakers)
    │
    ▼
segmenter.py ──► Topic Segments
    │
    ▼
extractor.py ──► Decisions, Todos, Risks, Ideas, Dates, Data
    │
    ▼
summarizer.py ──► L1, L2, L3 Summaries
    │
    ▼
formatter.py ──► Markdown / Text / HTML
    │
    ▼
storage.py ──► Saved to ~/.openclaw/data/minutes-taker/
```
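
The stage-by-stage flow above can be sketched as a chain of functions that each take and return a shared state dict. Everything here is illustrative: the stage bodies are toy placeholders (the `" by "` check in particular is a crude stand-in for real todo extraction), and the actual scripts may pass data between stages in a different form.

```python
from typing import Callable

Stage = Callable[[dict], dict]


def parse_input(state: dict) -> dict:
    """Split 'Speaker: text' lines into (speaker, text) turns."""
    turns = []
    for line in state["raw"].splitlines():
        speaker, sep, text = line.partition(":")
        if sep and text.strip():
            turns.append((speaker.strip(), text.strip()))
    return {**state, "turns": turns}


def segment(state: dict) -> dict:
    """Placeholder: treat the whole meeting as a single topic segment."""
    return {**state, "segments": [state["turns"]]}


def extract(state: dict) -> dict:
    """Placeholder: flag turns that mention a 'by <deadline>' phrase as todos."""
    todos = [
        {"speaker": speaker, "text": text}
        for segment_turns in state["segments"]
        for speaker, text in segment_turns
        if " by " in f" {text.lower()} "
    ]
    return {**state, "todos": todos}


def run_pipeline(raw: str, stages: list[Stage]) -> dict:
    """Feed the raw transcript through each stage in order."""
    state: dict = {"raw": raw}
    for stage in stages:
        state = stage(state)
    return state


result = run_pipeline(
    "Zhang: Today we discuss payment.\nLi: I suggest Stripe.\nZhang: Li, make proposal by Friday.",
    [parse_input, segment, extract],
)
```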

## Error Handling

| Code | Scenario | Action |
|------|----------|--------|
| E001 | Audio file not found | Error + path check |
| E002 | Unsupported audio format | List supported formats + convert via ffmpeg |
| E003 | ASR processing failure | Error + offer manual text input; list available backends |
| E004 | LLM timeout | Fall back to basic template |
| E005 | LLM format error | Retry 1x, then return raw text |
| E006 | Export API unavailable | Save locally + retry hint |
| E007 | History data unavailable | Skip cross-refs, generate current |
| E008 | Empty participants list | Auto-detect from content |

## Security

- **ASR backends vary**: whisper is fully offline (no network upload); Google Speech API sends audio to Google servers
- **Privacy notice**: When using speech_recognition backend, audio data is transmitted to Google for transcription
- **Local storage**: Minutes stored at `~/.openclaw/data/minutes-taker/` with configurable directory
- **File permissions**: Minutes files default to mode `600` (owner-only read/write)
- **Sensitive detection**: Marks potential sensitive content (salary, HR topics) with ⚠️
- **LLM context splitting**: Sends meeting text by topic segments, not the full transcript at once

## Dependencies

- **Python 3.10+**
- **ffmpeg** (for audio conversion)
- **Optional**: `openai-whisper` for local offline ASR
- **Pre-installed**: `SpeechRecognition` for Google Speech API ASR