MiniMax Multi-Modal Toolkit

Generate voice, music, video, and image content via MiniMax APIs. This is the unified entry point for MiniMax multimodal use cases (audio, music, video, and image). It includes voice cloning and voice design for custom voices, image generation with character reference, and FFmpeg-based media tools for audio/video format conversion, concatenation, trimming, and extraction.

Setup & Configuration

Prerequisites

    brew install ffmpeg jq               # macOS
    sudo apt install ffmpeg jq           # Linux (Debian/Ubuntu)
    bash scripts/check_environment.sh    # verify environment

No Python or pip required: all scripts are pure bash using curl, ffmpeg, jq, and xxd.

Note: ffmpeg is required for TTS voice bubble conversion (.mp3 → .opus). Without it, TTS audio is sent as a file attachment instead of a native voice bubble.

API Configuration
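Before configuring API access, it may be worth confirming that the tools named under Prerequisites are actually on PATH. The snippet below is a minimal sketch of such a check, not the contents of scripts/check_environment.sh; the function name missing_tools is hypothetical and used only for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the toolkit): print each named tool
# that cannot be found on PATH, one per line.
missing_tools() {
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# The four tools the Prerequisites section says the scripts rely on.
missing="$(missing_tools curl ffmpeg jq xxd)"
if [ -n "$missing" ]; then
  printf 'Missing tools:\n%s\n' "$missing" >&2
else
  echo "All prerequisite tools found."
fi
```

If ffmpeg shows up in the missing list, the note above applies: TTS audio would be sent as a file attachment rather than a native voice bubble.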