---
name: ai-image-to-code
description: >
  Use when (1) user provides a UI screenshot or image and asks to convert it into HTML, CSS, or component code.
  (2) user says "turn this into code", "rebuild this UI", "code this design", or "generate HTML from screenshot".
  (3) user pastes an image and says "write the React component for this".
license: MIT
metadata:
  version: "1.0"
  category: design
  author: wangjipeng
sources:
- https://github.com/MiniMax-AI/skills
---
# AI Image to Code
Use when (1) user provides a UI screenshot or image and asks to convert it into HTML, CSS, or component code. (2) user says "turn this into code", "rebuild this UI", "code this design", or "generate HTML from screenshot". (3) user pastes an image and says "write the React component for this".
## Core Position
This skill solves the specific problem of: *a visual UI mockup needs to become actual runnable frontend code — not just a description, but a working implementation.*
This skill IS NOT:
- An image generation tool: it converts existing images to code and does not create images
- A design tool: it interprets and codes an existing design and does not create the design
- A backend integration tool: it outputs HTML/CSS/JS and does not produce server code
This skill IS activated ONLY when both of these are present: an image (screenshot or mockup) and code generation intent.
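The activation rule can be sketched as a small predicate. This is only an illustration: `should_activate` and `TRIGGER_PHRASES` are hypothetical names, the phrases are copied from the skill description, and plain substring matching is an assumption standing in for whatever intent detection the host actually uses.

```python
# Phrases copied from the skill description; substring matching is an assumption.
TRIGGER_PHRASES = (
    "turn this into code",
    "rebuild this ui",
    "code this design",
    "generate html from screenshot",
    "write the react component for this",
)


def should_activate(has_image: bool, message: str) -> bool:
    """Return True only when an image is attached AND the text signals code intent."""
    text = message.lower()
    has_code_intent = any(phrase in text for phrase in TRIGGER_PHRASES)
    return has_image and has_code_intent
```

Either condition alone is not enough: an image with no code request, or a code request with no image, leaves the skill inactive.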
## Modes
### `/ai-image-to-code`
**Default mode.** Converts a UI image into a complete HTML/CSS implementation.
When to use: User provides a screenshot and wants a working HTML page that resembles it.
### `/ai-image-to-code/react`
Outputs a React functional component using Tailwind CSS.
When to use: User explicitly asks for React or a component, not a plain HTML page.
### `/ai-image-to-code/describe`
Provides a detailed text description of the layout without writing code.
When to use: User only wants to understand the layout before committing to code generation.
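One way to picture the three modes is as a lookup from command string to mode. The sketch below is hypothetical: `Mode`, `COMMANDS`, and `resolve_mode` are illustrative names, and falling back to the default HTML/CSS mode for unrecognized input is an assumption rather than documented behavior.

```python
from enum import Enum


class Mode(Enum):
    HTML = "html"          # default: complete HTML/CSS page
    REACT = "react"        # React functional component with Tailwind CSS
    DESCRIBE = "describe"  # text description only, no code


# Command strings as listed in the Modes section.
COMMANDS = {
    "/ai-image-to-code": Mode.HTML,
    "/ai-image-to-code/react": Mode.REACT,
    "/ai-image-to-code/describe": Mode.DESCRIBE,
}


def resolve_mode(command: str) -> Mode:
    """Map a command to a mode; unknown commands fall back to HTML (assumption)."""
    return COMMANDS.get(command.strip().rstrip("/"), Mode.HTML)
```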
## Execution Steps
### Step 1 — Analyze the Image
1. Receive the image (pasted, file attachment, or URL)
2. Use a vision model to inspect the image and extract:
   - Layout structure (header, sidebar, main content, footer)
   - Color palette (primary, secondary, background, text, accent)
   - Typography (headings, body, labels; size and weight hierarchy)
   - Spacing system (padding, margins, gaps)
   - Component types (buttons, inputs, cards, lists, navigation)
   - Visual hierarchy (what stands out, what recedes)
3. If the image is complex (>10 distinct UI sections), focus on the main content area
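The result of this analysis could be held in a small record like the one sketched here. Everything in it is illustrative: `VisualInventory`, `sections_in_scope`, and the field names are hypothetical, and the caller is assumed to know which sections belong to the main content area. Only the threshold of 10 sections comes from the step above.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List

MAX_SECTIONS = 10  # complexity threshold from Step 1


@dataclass
class VisualInventory:
    sections: List[str] = field(default_factory=list)          # e.g. "header", "sidebar"
    colors: Dict[str, str] = field(default_factory=dict)       # role -> hex value
    typography: Dict[str, str] = field(default_factory=dict)   # role -> size/weight note
    spacing: Dict[str, str] = field(default_factory=dict)      # name -> value
    components: List[str] = field(default_factory=list)        # e.g. "button", "card"

    def is_complex(self) -> bool:
        """True when the image has more than MAX_SECTIONS distinct sections."""
        return len(self.sections) > MAX_SECTIONS


def sections_in_scope(inventory: VisualInventory, main_sections: Iterable[str]) -> List[str]:
    """Keep every section for simple images; only main-content sections for complex ones."""
    if not inventory.is_complex():
        return list(inventory.sections)
    wanted = set(main_sections)
    return [name for name in inventory.sections if name in wanted]
```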
### Step 2 — Plan the Code Structure
| Image Content | Recommended Output |
|---|---|
| Landing page | Single HTML with embedded CSS |
| Dashboard | HTML + CSS grid layout |
| Mobile app screen | Mobile-first responsive HTML |
| Form / login page | Semantic HTML form with proper inputs |
| Card / list UI | Component-based HTML with classes |
| Chart / data visualization | SVG or canvas-based rendering |
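The table above reads naturally as a lookup. In this sketch the keys and values are copied from the table, while `OUTPUT_PLAN`, `plan_output`, and the fallback to a single HTML file for unlisted content types are assumptions made for illustration.

```python
# Keys are lowercased table rows; values are the recommended outputs.
OUTPUT_PLAN = {
    "landing page": "Single HTML with embedded CSS",
    "dashboard": "HTML + CSS grid layout",
    "mobile app screen": "Mobile-first responsive HTML",
    "form / login page": "Semantic HTML form with proper inputs",
    "card / list ui": "Component-based HTML with classes",
    "chart / data visualization": "SVG or canvas-based rendering",
}

# Assumption: content types not in the table get the default HTML/CSS output.
DEFAULT_PLAN = "Single HTML with embedded CSS"


def plan_output(image_content: str) -> str:
    """Return the recommended output for a content type, case-insensitively."""
    return OUTPUT_PLAN.get(image_content.strip().lower(), DEFAULT_PLAN)
```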
### Step 3 — Generate Code
**HTML/CSS output** (default):
```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>UI</title>
  <style>
    /* Extracted colors, typography, spacing from image */
  </style>
</head>
<body>
  <!-- Structure matching the image layout -->
</body>
</html>
```
**React + Tailwind** (react mode):
```jsx
export function UICard() {
  return (
    <div className="p-6 bg-white rounded-xl shadow-sm">
      {/* Component matching image */}
    </div>
  );
}
```
### Step 4 — Validate
- Key layout sections (header, main, sidebar) are present
- Colors are within roughly ±10% of the values sampled from the original image (a visual judgment, not a measured guarantee)
- No invented content — placeholder text is generic ("Card title", not specific brand names)
- HTML is valid (proper tag nesting, no unclosed tags)
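The color check in this list is described as subjective, so any numeric version is an interpretation. The sketch below assumes "±10%" means each RGB channel may differ by at most 10% of the full 0 to 255 range; `hex_to_rgb` and `within_tolerance` are hypothetical helper names.

```python
from typing import Tuple


def hex_to_rgb(value: str) -> Tuple[int, int, int]:
    """Parse '#rgb' or '#rrggbb' into an (r, g, b) tuple of 0-255 integers."""
    value = value.lstrip("#")
    if len(value) == 3:
        value = "".join(ch * 2 for ch in value)  # expand shorthand, e.g. 'fff' -> 'ffffff'
    if len(value) != 6:
        raise ValueError("expected a 3- or 6-digit hex color")
    return (int(value[0:2], 16), int(value[2:4], 16), int(value[4:6], 16))


def within_tolerance(sampled: str, generated: str, tolerance: float = 0.10) -> bool:
    """Assumed reading of the rule: every channel differs by <= tolerance * 255."""
    limit = 255 * tolerance
    pairs = zip(hex_to_rgb(sampled), hex_to_rgb(generated))
    return all(abs(a - b) <= limit for a, b in pairs)
```

A per-channel RGB comparison is simple but does not track perceived difference well; it is meant as a rough sanity check alongside the visual comparison.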
## Mandatory Rules
### Do not
- Do not invent brand names, specific product names, or proprietary text not visible in the image
- Do not claim the output is pixel-perfect — it is an interpretation
- Do not generate backend code, API calls, or JavaScript application logic (for example, state management)
- Do not reproduce copyrighted UI elements (logos, icons) — use generic equivalents
### Do
- Use placeholder text that fits the context (e.g., "Search..." for a search bar)
- Preserve the visual hierarchy (primary > secondary > tertiary)
- Use realistic placeholder data for images (e.g., via placeholder.com or picsum)
- State explicitly: "This is an approximation; fine-tune colors and spacing as needed"
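For placeholder images, a URL can be built from the dimensions estimated in the screenshot. This sketch assumes the Lorem Picsum URL pattern `https://picsum.photos/<width>/<height>`; `placeholder_image_url` is a hypothetical helper, not something the skill defines.

```python
def placeholder_image_url(width: int, height: int) -> str:
    """Build a Lorem Picsum URL for a random image of the given pixel size."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return "https://picsum.photos/{}/{}".format(width, height)
```

The returned string can be dropped into an `<img src="...">` attribute in the generated markup.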
## Quality Bar
**A good output:**
- All major layout regions are present and positioned correctly
- Color palette is recognizably derived from the image
- Typography hierarchy matches (heading size > body size)
- Code is valid, runnable HTML/CSS without external dependencies beyond a CDN
**A bad output:**
- Layout is scrambled or missing major sections
- Output includes broken or unclosed HTML tags
- Fabricated text content not appropriate to the UI context
- Output requires non-free dependencies or local asset files
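The "broken or unclosed HTML tags" criterion can be approximated with the standard library's `html.parser`. This is a strict sketch, not a full validator: `NestingChecker` and `find_nesting_errors` are hypothetical names, and it will also flag tags whose end tag HTML allows you to omit (such as `<p>` or `<li>`).

```python
from html.parser import HTMLParser
from typing import List

# Elements that never take a closing tag.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class NestingChecker(HTMLParser):
    """Tracks open tags on a stack and records end tags that do not match."""

    def __init__(self) -> None:
        super().__init__()
        self.stack: List[str] = []
        self.errors: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_ELEMENTS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append("unexpected </{}>".format(tag))


def find_nesting_errors(html: str) -> List[str]:
    """Return mismatched end tags followed by any tags left open."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.errors + ["unclosed <{}>".format(tag) for tag in checker.stack]
```

An empty list means every non-void tag was closed in the order it was opened.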
## Good vs. Bad Examples
| Scenario | Bad Output | Good Output |
|---|---|---|
| E-commerce product card | Generic lorem ipsum text | Contextually appropriate text such as "Price: $49.99" and "Add to Cart" |
| Dark mode UI | Ignores dark theme | Uses dark background, light text, correct contrast |
| Mobile screenshot | Desktop-only output | `max-width: 375px` container, mobile-first |
| Complex dashboard | One undifferentiated div | Grid layout with sidebar, header, main panels |
## References
- `references/` — Color extraction heuristics, layout structure patterns, Tailwind class mapping guide
convert a UI screenshot, wireframe, or design mockup into runnable frontend code (HTML, CSS, React, or framework-agnostic components). use this when a user provides a visual image and asks to "turn this into code", "rebuild this UI", "code this design", "generate HTML from this screenshot", or "write the React component for this". the output is actual working code that approximates the visual layout and styling, not a description or design artifact.
optional external inputs:
output: confirmed image file or URL; metadata (dimensions, device type)
use vision model to inspect image and extract:
if image is complex (>10 distinct UI sections or multiple nested layouts):
flag any ambiguities:
output: detailed visual inventory (structured list of sections, colors as hex/rgb, typography scale, spacing notes, component list)
based on image content and user intent, choose output format:
| image content | recommended output | rationale |
|---|---|---|
| landing page hero, marketing site | single HTML file with embedded CSS | self-contained, minimal dependencies |
| dashboard, admin panel, data-heavy UI | HTML + CSS with grid/flexbox layout | card-based, reusable component structure |
| mobile app screen, small viewport | mobile-first responsive HTML, max-width 375-425px | constraint-driven design |
| form, login, checkout | semantic HTML form with proper inputs and labels | accessibility, progressive enhancement |
| card, list item, component preview | component-based JSX or HTML with CSS classes | reusable, composable |
| chart, graph, data viz | SVG or canvas rendering (if complex) | vector-based, scalable |
if user requested React mode, use JSX + Tailwind CSS utility classes. if no preference, default to HTML + CSS.
output: chosen format; justification
step 4a: HTML/CSS default mode
use descriptive class names (e.g., `.card`, `.header-nav`, `.cta-button`, not `.box1`, `.div2`)
example scaffold:
```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>UI Component</title>
  <style>
    /* extracted colors, typography, spacing */
    :root {
      --color-primary: #...; /* from image */
      --color-bg: #...;
      --font-size-heading: ...;
    }
    /* layout and component styles */
  </style>
</head>
<body>
  <!-- major sections with semantic tags -->
</body>
</html>
```
step 4b: React + Tailwind mode
export a named function component (e.g., `export function ProductCard() { ... }`)
example scaffold:
```jsx
export function UIComponent() {
  return (
    <div className="min-h-screen bg-gray-50 p-6">
      <header className="bg-white rounded-lg shadow-sm p-4">
        {/* header content */}
      </header>
      <main className="mt-6 grid grid-cols-3 gap-4">
        {/* main content */}
      </main>
    </div>
  );
}
```
output: runnable code snippet (HTML file or JSX component); code is valid, no unclosed tags, proper nesting
check HTML validity:
check visual alignment:
check content appropriateness:
check code quality:
if validation fails on any point, regenerate or note limitation to user.
output: validation checklist completed; any deviations flagged
output: user-ready code with caveats stated
if user provides image but no code format preference:
if image is ambiguous (colors hard to read, low contrast, poor quality):
if image contains copyrighted branding, logos, or proprietary UI:
if user requests backend code, API integration, or state management:
if image is a mobile screenshot and user doesn't specify responsive intent:
if placeholder.com or image CDN is unreachable or blocked:
if user requests pixel-perfect fidelity:
if image contains complex interactions (animations, modals, drag-drop):
format: