Item: hyperframes-media
Rating: 5.1
Author: Implexa

hyperframes-media

Audio and media assets for HyperFrames compositions, produced by one shared audio engine (`scripts/audio.mjs`) — multi-provider TTS (HeyGen / ElevenLabs /…

SkillRank score: 5.1 / 10

evaluated by implexa, claude-haiku-4-5 · 2026-06-07

hyperframes-media provides three local CLI commands for media preprocessing: text-to-speech with 54 multilingual voices, audio transcription with word-level timestamps, and video background removal for transparent overlays. No API keys are required.

structure: 6.0
trigger phrases: 7.0
procedure: 4.0
edge cases: 3.0
documentation: 5.0

strengths

SKILL.md

Generate speech, transcribe audio with timestamps, and remove video backgrounds for transparent overlays.

Three CLI commands (tts, transcribe, remove-background) that each download and cache their own model on first run; no API keys required

Text-to-speech supports 54 multilingual voices (American, British, Spanish, French, Hindi, Italian, Japanese, Portuguese, Mandarin) with speed control; auto-detects language from voice prefix

Transcription produces word-level timestamps in normalized JSON; supports multiple input formats (audio, video, SRT/VTT, OpenAI responses) with configurable Whisper model sizes and explicit language selection to prevent silent translation errors

Background removal outputs VP9 WebM with alpha channel (or ProRes/PNG) for transparent overlays; optional --background-output flag creates a hole-cut inverse layer for compositing text or graphics between subject and background

HyperFrames Media

Create the audio and media assets a composition needs — voiceover (TTS), background music + sound effects, transcription, captions, background removal — then consume and animate that data in HTML. For placing assets into compositions, see hyperframes-core.

The audio engine — one source for TTS · BGM · SFX

Workflows do NOT hand-roll audio or vendor a copy. There is one engine — scripts/audio.mjs — that takes a neutral audio_request.json and writes audio_meta.json (plus assets under assets/voice|bgm|sfx):

# <MEDIA_DIR> = this skill's directory
node <MEDIA_DIR>/scripts/audio.mjs --request ./audio_request.json --hyperframes . --out ./audio_meta.json
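
For workflows that drive the engine from a Node script rather than a shell, the invocation above could be wrapped roughly as follows. This is a minimal sketch, not part of the skill: `buildAudioArgs` and `runAudioEngine` are hypothetical helper names, only the three flags shown above are used, and it assumes the engine signals failure with a non-zero exit status.

```javascript
const { spawnSync } = require("node:child_process");
const path = require("node:path");

// Hypothetical helper: builds the argument list for the documented
// `node <MEDIA_DIR>/scripts/audio.mjs --request ... --hyperframes ... --out ...` call.
function buildAudioArgs(mediaDir, { request, hyperframes = ".", out }) {
  return [
    path.join(mediaDir, "scripts", "audio.mjs"),
    "--request", request,
    "--hyperframes", hyperframes,
    "--out", out,
  ];
}

// Hypothetical helper: runs the engine with the current Node binary.
// Assumption: a non-zero exit status means the run failed.
function runAudioEngine(mediaDir, options) {
  const result = spawnSync(process.execPath, buildAudioArgs(mediaDir, options), {
    stdio: "inherit",
  });
  if (result.error) throw result.error;
  if (result.status !== 0) {
    throw new Error(`audio.mjs exited with status ${result.status}`);
  }
  return options.out; // path of the audio_meta.json the engine was asked to write
}
```

A caller would pass the skill's directory as `mediaDir`, for example `runAudioEngine(mediaDir, { request: "./audio_request.json", out: "./audio_meta.json" })`. Keeping the argument construction in its own function makes it easy to check without actually running the engine.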

All three capabilities degrade on ONE switch: whether a HeyGen credential is present (resolved from $HEYGEN_API_KEY / $HYPERFRAMES_API_KEY / ~/.heygen, not the CLI).
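
The credential check described here could be sketched as below. This is an illustration under stated assumptions, not the engine's actual code: `findHeygenCredentialSource` is a hypothetical name, the lookup order simply follows the order in which the three sources are listed, and the mere existence of `~/.heygen` is treated as "credential present" because the source does not say what that path contains.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: returns a label for the first credential source found,
// or null when none is present. The order is an assumption, not documented.
function findHeygenCredentialSource(env = process.env, home = os.homedir()) {
  for (const name of ["HEYGEN_API_KEY", "HYPERFRAMES_API_KEY"]) {
    const value = env[name];
    // Treat empty or whitespace-only values as "not set".
    if (typeof value === "string" && value.trim() !== "") return `$${name}`;
  }
  // Assumption: any file or directory at ~/.heygen counts as a credential.
  if (fs.existsSync(path.join(home, ".heygen"))) return "~/.heygen";
  return null;
}
```

A workflow could call `findHeygenCredentialSource()` before building its request to learn which branch the engine is likely to take. Passing `env` and `home` as parameters keeps the check testable without touching the real environment.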
