Name: whisper-transcription
Availability: InStock
Author: guia-matthieu

Transcribe audio and video files to text using OpenAI Whisper. Use when: converting podcasts to blog posts; creating video subtitles; extracting quotes from…

SKILL.md

Whisper Transcription

Transcribe any audio or video to text using OpenAI's Whisper model - the same technology powering ChatGPT voice features.

When to Use This Skill

Podcast repurposing - Convert episodes to blog posts, show notes, social snippets

Video subtitles - Generate SRT/VTT files for YouTube, social media

Interview extraction - Pull quotes and insights from recorded calls

Content audit - Make audio/video libraries searchable

Translation - Transcribe and translate foreign language content

What Claude Does vs What You Decide

Claude Does
You Decide

Structures production workflow
Final creative direction

Suggests technical approaches
Equipment and tool choices

Creates templates and checklists
Quality standards

Identifies best practices
Brand/voice decisions

Generates script outlines
Final script approval

Dependencies

pip install openai-whisper torch ffmpeg-python click
# Also requires ffmpeg installed on system
# macOS: brew install ffmpeg
# Ubuntu: sudo apt install ffmpeg

Commands

Transcribe Single File

python scripts/main.py transcribe audio.mp3 --model medium --output transcript.txt
python scripts/main.py transcribe video.mp4 --format srt --output subtitles.srt

Batch Transcription

python scripts/main.py batch ./recordings/ --format txt --output ./transcripts/

Transcribe + Translate

python scripts/main.py translate foreign-audio.mp3 --to en

Extract Timestamps

python scripts/main.py timestamps podcast.mp3 --format json

Examples

Example 1: Podcast to Blog Post

# Transcribe 1-hour podcast
python scripts/main.py transcribe episode-42.mp3 --model medium

# Output: episode-42.txt (full transcript with timestamps)
# Processing time: ~5 min for 1 hour audio on M1 Mac

Example 2: YouTube Subtitles

# Generate SRT for video upload
python scripts/main.py transcribe marketing-video.mp4 --format srt

# Output: marketing-video.srt
# Upload directly to YouTube/Vimeo

Example 3: Batch Process Interview Library

# Transcribe all recordings in folder
python scripts/main.py batch ./customer-interviews/ --model small --format txt

# Output: ./customer-interviews/*.txt (one per audio file)

Model Selection Guide

Model
Speed
Accuracy
VRAM
Best For

tiny
Fastest
~70%
1GB
Quick drafts, short clips

base
Fast
~80%
1GB
Social media clips

small
Medium
~85%
2GB
Podcasts, interviews

medium
Slow
~90%
5GB
Professional transcripts

large
Slowest
~95%
10GB
Critical accuracy needs

Recommendation: Start with small for most marketing content. Use medium for client deliverables.

Output Formats

Format
Extension
Use Case

txt
.txt
Blog posts, analysis

srt
.srt
Video subtitles (YouTube)

vtt
.vtt
Web video subtitles

json
.json
Programmatic access

tsv
.tsv
Spreadsheet analysis

Performance Tips

GPU acceleration - 10x faster with CUDA GPU

Audio extraction - Script auto-extracts audio from video

Chunking - Long files auto-split for memory efficiency

Language detection - Automatic, or specify with --language

Skill Boundaries

What This Skill Does Well

Structuring audio production workflows

Providing technical guidance

Creating quality checklists

Suggesting creative approaches

What This Skill Cannot Do

Replace audio engineering expertise

Make subjective creative decisions

Access or edit audio files directly

Guarantee commercial success

Related Skills

video-processing - Extract audio from video

youtube-downloader - Download videos to transcribe

content-repurposer - Transform transcripts to content

podcast-production - Create podcasts

Skill Metadata

Mode: cyborg

category: automation
subcategory: audio-processing
dependencies: [openai-whisper, torch, ffmpeg-python]
difficulty: beginner
time_saved: 10+ hours/week

whisper-transcription

SKILL.md

related skills