Transcribe audio and video files to text using OpenAI Whisper. Use when: converting podcasts to blog posts; creating video subtitles; extracting quotes from…
# Whisper Transcription

Transcribe any audio or video to text using OpenAI's Whisper model - the same technology powering ChatGPT voice features.

## When to Use This Skill

- **Podcast repurposing** - Convert episodes to blog posts, show notes, social snippets
- **Video subtitles** - Generate SRT/VTT files for YouTube, social media
- **Interview extraction** - Pull quotes and insights from recorded calls
- **Content audit** - Make audio/video libraries searchable
- **Translation** - Transcribe and translate foreign language content

## What Claude Does vs What You Decide

| Claude Does | You Decide |
|---|---|
| Structures production workflow | Final creative direction |
| Suggests technical approaches | Equipment and tool choices |
| Creates templates and checklists | Quality standards |
| Identifies best practices | Brand/voice decisions |
| Generates script outlines | Final script approval |

## Dependencies

    pip install openai-whisper torch ffmpeg-python click

    # Also requires ffmpeg installed on system
    # macOS: brew install ffmpeg
    # Ubuntu: sudo apt install ffmpeg

## Commands

### Transcribe Single File

    python scripts/main.py transcribe audio.mp3 --model medium --output transcript.txt
    python scripts/main.py transcribe video.mp4 --format srt --output subtitles.srt

### Batch Transcription

    python scripts/main.py batch ./recordings/ --format txt --output ./transcripts/

### Transcribe + Translate

    python scripts/main.py translate foreign-audio.mp3 --to en

### Extract Timestamps

    python scripts/main.py timestamps podcast.mp3 --format json

## Examples

### Example 1: Podcast to Blog Post

    # Transcribe 1-hour podcast
    python scripts/main.py transcribe episode-42.mp3 --model medium

    # Output: episode-42.txt (full transcript with timestamps)
    # Processing time: ~5 min for 1 hour audio on M1 Mac

### Example 2: YouTube Subtitles

    # Generate SRT for video upload
    python scripts/main.py transcribe marketing-video.mp4 --format srt

    # Output: marketing-video.srt
    # Upload directly to YouTube/Vimeo

### Example 3: Batch Process Interview Library

    # Transcribe all recordings in folder
    python scripts/main.py batch ./customer-interviews/ --model small --format txt

    # Output: ./customer-interviews/*.txt (one per audio file)

## Model Selection Guide

| Model | Speed | Accuracy | VRAM | Best For |
|---|---|---|---|---|
| tiny | Fastest | ~70% | 1GB | Quick drafts, short clips |
| base | Fast | ~80% | 1GB | Social media clips |
| small | Medium | ~85% | 2GB | Podcasts, interviews |
| medium | Slow | ~90% | 5GB | Professional transcripts |
| large | Slowest | ~95% | 10GB | Critical accuracy needs |

**Recommendation:** Start with `small` for most marketing content. Use `medium` for client deliverables.

## Output Formats

| Format | Extension | Use Case |
|---|---|---|
| txt | .txt | Blog posts, analysis |
| srt | .srt | Video subtitles (YouTube) |
| vtt | .vtt | Web video subtitles |
| json | .json | Programmatic access |
| tsv | .tsv | Spreadsheet analysis |

## Performance Tips

- **GPU acceleration** - 10x faster with CUDA GPU
- **Audio extraction** - Script auto-extracts audio from video
- **Chunking** - Long files auto-split for memory efficiency
- **Language detection** - Automatic, or specify with `--language`

## Skill Boundaries

### What This Skill Does Well

- Structuring audio production workflows
- Providing technical guidance
- Creating quality checklists
- Suggesting creative approaches

### What This Skill Cannot Do

- Replace audio engineering expertise
- Make subjective creative decisions
- Access or edit audio files directly
- Guarantee commercial success

## Related Skills

- video-processing - Extract audio from video
- youtube-downloader - Download videos to transcribe
- content-repurposer - Transform transcripts to content
- podcast-production - Create podcasts

## Skill Metadata

- Mode: cyborg
- category: automation
- subcategory: audio-processing
- dependencies: [openai-whisper, torch, ffmpeg-python]
- difficulty: beginner
- time_saved: 10+ hours/week
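The `--format srt` output described above can be approximated directly with the `openai-whisper` Python package. The following is a minimal sketch, not the contents of `scripts/main.py` (which are not shown here). It assumes the dictionary returned by `model.transcribe()`, whose `"segments"` entries carry `"start"`, `"end"`, and `"text"`. The helper names `srt_timestamp`, `segments_to_srt`, and `transcribe_to_srt` are hypothetical and chosen only for illustration.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Turn Whisper-style segments into SRT text.

    Each segment is a mapping with "start" and "end" (seconds) and "text".
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        text = seg["text"].strip()
        # One SRT cue: index line, time range line, text line.
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


def transcribe_to_srt(path: str, model_name: str = "small") -> str:
    """Transcribe a media file and return SRT text (hypothetical helper)."""
    # Imported lazily so the formatting helpers work without whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return segments_to_srt(result["segments"])
```

Under these assumptions, `transcribe_to_srt("marketing-video.mp4")` would return subtitle text ready to write to a `.srt` file. The formatting helpers can be checked on their own with hand-written segments, without loading a model.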