How do I turn a video (or topic) + a headshot into 3-5 on-brand, high-CTR YouTube thumbnails the way a top agency would: study the channel's winning style, pick the sharpest hook, generate a cinematic background, cut out the real person, and composite crisp bold text + badges as true layers
the agent that answers this
Raw image to viral thumbnail
Turn a video (or topic) + a headshot into 3-5 on-brand, high-CTR YouTube thumbnails the way a top agency would: study the channel's winning style, pick the sharpest hook, generate a cinematic background, cut out the real person, and composite crisp bold text + badges as true layers.
- Cost: Free, on your own plan
- Runs: On demand or scheduled
- Built from: 11 steps, 2 verified skills
- Runs in: Claude or Codex, as you
the steps
11 steps · 2 from verified skills
- Step 1 · tool
Preflight: ensure tooling is ready. Free CLIs (auto-install if missing): Remotion (via npx), ffmpeg, and rembg via `uvx --from "rembg[cpu,cli]" rembg ...` (the u2net model downloads on first run; the [cli] extra alone fails, so use [cpu,cli]). Paid: a Runway ML API key (dev.runwayml.com). Notes: the gen4_image promptText must be condensed to about 500 characters or the request returns HTTP 400; the Impact font resolves in Chromium on macOS for the text layer. If a paid key is missing, stop and tell the user what to get and what the cheaper alternative is.
Your model fills this step
- No integration? npm i remotion
- No integration? uvx --from "rembg[cpu,cli]" rembg
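The preflight in step 1 can be sketched as a small check that reports which free tools are missing and whether the paid key is set. This is a minimal sketch, not the agent's actual implementation: the function name `preflight` is hypothetical, and reading the Runway key from the `RUNWAYML_API_SECRET` environment variable is an assumption.

```python
import os
import shutil
from typing import Callable, Mapping, Optional

# Executables step 1 expects on PATH: npx runs Remotion, uvx runs rembg.
REQUIRED_TOOLS = ("npx", "ffmpeg", "uvx")

# Assumption: the Runway ML API key is provided through this variable.
RUNWAY_KEY_VAR = "RUNWAYML_API_SECRET"


def preflight(
    which: Callable[[str], Optional[str]] = shutil.which,
    env: Mapping[str, str] = os.environ,
) -> dict:
    """Report missing free tools and whether the paid key is present."""
    missing = [tool for tool in REQUIRED_TOOLS if which(tool) is None]
    has_key = bool(env.get(RUNWAY_KEY_VAR))
    return {
        "missing_tools": missing,
        "runway_key": has_key,
        "ok": (not missing) and has_key,
    }
```

When `ok` is false because only the key is missing, the step's own rule applies: stop and tell the user what to get, rather than continuing with a half-configured pipeline.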
- Step 2 · decision
Confirm the inputs: the source video (or topic) this thumbnail is for, the subject headshot to composite, and 2-3 of the channel's best existing thumbnails to match the house style.
Decision step
- Step 3
Study the channel's existing thumbnails + this video's transcript/topic. Extract the repeatable viral pattern (bold split/contrast layout, ONE huge number-or-word hook, the face cut-out, SOLD/check/cross badges, arrows, white+yellow heavy Impact text) AND the single sharpest curiosity-gap hook for THIS video.
youtube-video-analyst — an installable skill for AI agents, published by shipshitdev/library.
Forensic deconstruction of YouTube videos to extract viral formulas, hooks, and retention mechanics. Analyzes video transcripts across 11 systematic sections: hook architecture, structural blueprint, retention mechanics, emotional…
Full skill: youtube-video-analyst
- Step 4 · decision
Generate 3-5 distinct thumbnail concepts (each = layout + a 3-6 word hook + accent color), scored for click-through: curiosity gap, contrast/legibility at small size, emotional face, on-brand. Pick the top concepts to produce.
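One way to make the scoring in step 4 concrete is a weighted sum over the four named criteria, with the 3-6 word hook rule applied as a filter. This is only a sketch: the weights, the 0-10 rating scale, and the names `Concept`, `ctr_score`, and `pick_top` are hypothetical, since the step names the criteria but not how they are combined.

```python
from dataclasses import dataclass

# Hypothetical weights: the step lists the criteria but not their importance.
WEIGHTS = {"curiosity": 0.35, "legibility": 0.30, "face": 0.20, "on_brand": 0.15}


@dataclass
class Concept:
    layout: str
    hook: str
    accent: str
    scores: dict  # criterion name -> rating from 0 to 10


def hook_ok(hook: str) -> bool:
    """Step 4 asks for a 3-6 word hook."""
    return 3 <= len(hook.split()) <= 6


def ctr_score(concept: Concept) -> float:
    """Weighted sum of the four click-through criteria."""
    return sum(weight * concept.scores[name] for name, weight in WEIGHTS.items())


def pick_top(concepts: list, n: int = 3) -> list:
    """Drop concepts whose hook breaks the word limit, then keep the n best."""
    valid = [c for c in concepts if hook_ok(c.hook)]
    return sorted(valid, key=ctr_score, reverse=True)[:n]
```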
Decision step
- Step 5 · decision
PROMPT AGENT: for each chosen concept, write (a) the background image prompt — photoreal cinematic scene, NO text, NO faces, brand motif (e.g. compass), a clean/darker band reserved for the face composite, condensed to ~500 chars; and (b) the exact text + badge/arrow spec (strings, fonts, colors, positions, shadows).
Decision step
- Step 6 · tool
Generate each background image via the Runway image API (gen4_image, 1920x1080, no text/no face) and download locally.
Your model fills this step
- No integration? Nano Banana / gemini_2.5_flash via Runway
- No integration? gen4_image_turbo with a brand reference image
- No integration? hand-built gradient/stock background
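Before the background request in step 6 is sent, the prompt has to fit the roughly 500-character budget noted in the preflight step. The helper below is a minimal sketch of that condensing pass; the name `condense_prompt` and the exact 500-character cutoff are assumptions (the source only says "~500 chars").

```python
import re

MAX_PROMPT_CHARS = 500  # approximate budget; the source says "~500 chars"


def condense_prompt(text: str, limit: int = MAX_PROMPT_CHARS) -> str:
    """Collapse whitespace, then cut at the last word boundary within the limit."""
    flat = re.sub(r"\s+", " ", text).strip()
    if len(flat) <= limit:
        return flat
    cut = flat[:limit]
    # If the cut landed mid-word, drop the partial word.
    if flat[limit] != " " and " " in cut:
        cut = cut[: cut.rfind(" ")]
    return cut.rstrip(" ,;")
```

Because the cut drops whatever comes last, it is safer to put the hard constraints (no text, no faces, the reserved band for the face) at the start of the prompt so they survive truncation.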
- Step 7 · tool
Cut the subject headshot to a clean transparent PNG (rembg) and cache it for reuse on future thumbnails.
hyperframes-media: asset preprocessing for HyperFrames compositions, covering text-to-speech narration (Kokoro), audio/video transcription (Whisper), and background removal for transparent overlays (u2net). Use when…
Full skill: Hyperframes Media
- No integration? rembg via uvx --from "rembg[cpu,cli]"
- No integration? remove.bg / Photoroom
- No integration? ffmpeg chromakey on a solid backdrop
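The "cache it for reuse" part of step 7 can be sketched by keying the cut-out on a hash of the headshot, so rembg only runs when the source image changes. This is an illustrative sketch: the helper names and the cache file naming are hypothetical; the command itself follows the `uvx --from "rembg[cpu,cli]" rembg` form given in step 1, using rembg's `i` (single image) subcommand.

```python
import hashlib
from pathlib import Path
from typing import Optional, Tuple


def cutout_path(headshot: Path, cache_dir: Path) -> Path:
    """Cache key: file stem plus a short content hash of the headshot."""
    digest = hashlib.sha256(headshot.read_bytes()).hexdigest()[:16]
    return cache_dir / f"{headshot.stem}-{digest}.png"


def rembg_command(headshot: Path, output: Path) -> list:
    """Argument list for subprocess.run; no shell quoting needed."""
    return ["uvx", "--from", "rembg[cpu,cli]", "rembg", "i", str(headshot), str(output)]


def plan_cutout(headshot: Path, cache_dir: Path) -> Tuple[Path, Optional[list]]:
    """Return the cut-out path and the command to run, or None on a cache hit."""
    out = cutout_path(headshot, cache_dir)
    return out, (None if out.exists() else rembg_command(headshot, out))
```

A caller would pass the returned list to `subprocess.run(cmd, check=True)` only when it is not `None`.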
- Step 8 · tool
FACE TOUCH-UP — always produce BOTH treatments and carry both forward: (a) FREE ffmpeg color-grade on the cut-out (contrast + warmth + saturation + sharpen, preserve alpha) — identity-perfect, $0; (b) Nano Banana (gemini_2.5_flash) image-to-image relight using the headshot as a reference image — studio relight + warm rim/edge light + clean skin retouch, KEEP the exact same face/identity/outfit — then rembg the relit result. The graded face is the safe default; the relit face often reads better/more 'pro' on YouTube.
Your model fills this step
- No integration? if Runway is unavailable or the relit likeness drifts, ship the free-graded face only
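The free grade in step 8(a) can be sketched as an ffmpeg filter graph that splits off the alpha channel, grades only the color, and merges the alpha back, so transparency is untouched no matter which color filters are used. The function name `grade_command` and all numeric values are illustrative assumptions, not tuned settings from the source.

```python
def grade_command(
    src: str,
    dst: str,
    contrast: float = 1.08,
    saturation: float = 1.15,
    warmth: float = 0.04,
    sharpen: float = 0.6,
) -> list:
    """Build an ffmpeg command that color-grades a cut-out and keeps its alpha."""
    color = (
        # Warmth: push midtones slightly toward red and away from blue.
        f"colorbalance=rm={warmth}:bm={-warmth},"
        f"eq=contrast={contrast}:saturation={saturation},"
        # unsharp takes luma matrix width, height, then amount.
        f"unsharp=5:5:{sharpen}"
    )
    graph = (
        "[0:v]format=rgba,split[rgb][a];"
        "[a]alphaextract[alpha];"
        f"[rgb]{color}[graded];"
        "[graded][alpha]alphamerge[out]"
    )
    return [
        "ffmpeg", "-y", "-i", src,
        "-filter_complex", graph,
        "-map", "[out]", "-frames:v", "1", dst,
    ]
```

For example, `grade_command("face_cutout.png", "face_graded.png")` returns a list that can be passed to `subprocess.run`; both file names here are placeholders.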
- Step 9 · tool
Composite in Remotion (1280x720 still): background + face cut-out + bold Impact text (white + yellow, heavy drop shadow) + badges/arrows as crisp layers. Render BOTH a graded-face and a relit-face version of each chosen concept.
Your model fills this step
- No integration? Pillow/Canvas or HTML-to-screenshot if Remotion is unavailable
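One detail worth making explicit for the composite: step 6 generates backgrounds at 1920x1080, while step 9 renders a 1280x720 still, so any text or badge position specified against the background has to be scaled by 2/3. The sketch below shows that arithmetic and the back-to-front layer order; the `Layer` type, the layer names, and the coordinates are all hypothetical.

```python
from dataclasses import dataclass, replace

SRC = (1920, 1080)  # background size from step 6
DST = (1280, 720)   # still size from step 9


@dataclass(frozen=True)
class Layer:
    name: str
    z: int  # stacking order, lowest is drawn first
    x: int
    y: int
    width: int
    height: int


def to_canvas(layer: Layer, src=SRC, dst=DST) -> Layer:
    """Scale a box specified in background pixels to still pixels."""
    sx, sy = dst[0] / src[0], dst[1] / src[1]
    return replace(
        layer,
        x=round(layer.x * sx),
        y=round(layer.y * sy),
        width=round(layer.width * sx),
        height=round(layer.height * sy),
    )


def stack(layers: list) -> list:
    """Scaled layers in draw order: background, face, badges, then text."""
    return sorted((to_canvas(layer) for layer in layers), key=lambda layer: layer.z)
```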
- Step 10 · decision
Quality gate: critique each thumbnail as a domain expert, a skeptic, and the end user — legible at small size, on-brand, real likeness preserved (especially the relit face), text crisp. Fix clear problems in place.
Decision step
- Step 11 · decision
Present the variants side by side — each hook in BOTH the free-graded and the AI-relit face treatment — for the creator to pick and tweak (hook/colors/face pose/badges) before publishing. No auto-publish.
Decision step
common questions
How do I turn a video (or topic) + a headshot into 3-5 on-brand, high-CTR YouTube thumbnails the way a top agency would: study the channel's winning style, pick the sharpest hook, generate a cinematic background, cut out the real person, and composite crisp bold text + badges as true layers?
Use the Raw image to viral thumbnail agent. It produces 3-5 finished 1280x720 thumbnails (background + real face cut-out + razor-sharp Impact text + badges/arrows), each with a distinct hook, ready to pick and publish.
Is the Raw image to viral thumbnail agent free?
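Yes. It runs on the Claude or Codex subscription you already pay for, so there is no extra AI bill and no per-run charge. The one paid dependency named in the steps is the Runway ML API key used for background generation and the relit face; the listed fallbacks (a hand-built gradient or stock background, and the free-graded face only) avoid it. You can build and run unlimited agents on the free plan.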
How often does the Raw image to viral thumbnail agent run?
You choose: run it on demand, or put it on a schedule (hourly, daily, weekly). Once scheduled it runs unattended, as you, on your own machine.
What does the Raw image to viral thumbnail agent need to run?
Install Implexa into your Claude or Codex, then connect Claude for Chrome so it can gather its own data and deliver hands-free. Implexa never touches your accounts or credentials.
Does the Raw image to viral thumbnail agent use my data? Is it private?
It runs as you, on your own machine, on your real data. The model runs inside your own Claude or Codex, so Implexa never sees your data, accounts, or credentials. Your agent's memory is yours and travels with you across Claude, Codex, and whatever comes next.
How do I build the Raw image to viral thumbnail agent?
Install Implexa into your Claude or Codex, then say "build the Raw image to viral thumbnail agent" and approve the schedule. Implexa assembles the 11 steps (2 from verified skills) and it runs on its own. About 5 minutes to your first real run.
Can I change what the Raw image to viral thumbnail agent does?
Yes. Tell it what to change in plain language and it revises its steps; the next scheduled run uses the change, with no re-scheduling. Every change is versioned, and a run can even propose its own improvements.
changelog
- v2 · Jun 14 · manual
added 1 step; rebound step 1 (skills.sh/firebase-basics -> gap), step 9 (decision -> gap)
- v1 · Jun 14 · generated
auto-generated from "Turn a video (or topic) + a headshot into 3-5 on-brand, high-CTR YouTube thumbna"
Agents are alive: every change is a version, and a run can propose improvements that get reviewed and applied.
related agents
- daily · creator
How do I run a multi-persona debate over the day's research every day to pick the best reel angle, then produce the reel brief and the IG caption, ready for your avatar render
Daily IG reel: debate and brief
- daily · creator
How do I produce my daily on-brand vertical IG reel end to end — research the day's angle, write the avatar script, generate the avatar video, render it with captions, and prep the caption + guide link, held for my approval before posting
Daily IG reel: produce and render
- daily · creator
How do I grow my Instagram
Daily IG reel research bundle
- creator
How do I turn a raw talking-head video (e.g. a HeyGen avatar clip) into a polished, on-brand vertical reel with karaoke captions and rebuilt UI b-roll, then post it to Instagram
Raw talking-head video to on-brand reel