Generate visually unified image-based PPT/PPTX decks from articles, reports, papers, notes, or outlines.

---
name: codex-ppt
description: Generate visually unified image-based PPT/PPTX decks from articles, reports, papers, notes, or outlines.
version: 0.5.3
metadata:
  openclaw:
    requires:
      bins:
        - python3
    primaryEnv: OPENAI_API_KEY
    envVars:
      - name: OPENAI_API_KEY
        required: false
        description: API key for CLI fallback.
      - name: OPENAI_BASE_URL
        required: false
        description: API base URL.
      - name: CODEX_PPT_IMAGE_MODEL
        required: false
        description: Image model, defaults to gpt-image-2.
      - name: CODEX_PPT_HOME
        required: false
        description: Runtime home override.
    homepage: https://github.com/ningzimu/codex-ppt-skill
---

# Codex PPT

## Overview

This skill creates image-based PowerPoint decks from source material. Each slide is a complete 16:9 generated image. Final images are assembled into `.pptx` with `scripts/assemble_ppt.py`.

Use this when the user wants a visually unified presentation and accepts full-slide image pages. Do not use it when every textbox, chart, or shape must remain separately editable.

Prefer the built-in image generation/editing tool. Use `scripts/image_gen.py` only when the built-in backend is unavailable, lacks a required capability, or the user explicitly asks for API/CLI mode.
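
The preference order above can be sketched as a small decision function. This is only an illustration: `BackendCheck`, its field names, and the `"builtin"`/`"cli"` labels are hypothetical and are not part of the bundled scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendCheck:
    """Hypothetical record of what the agent found when probing backends."""
    builtin_callable: bool      # a built-in image tool can be invoked
    builtin_covers_needs: bool  # it supports every capability the deck needs
    user_requested_cli: bool    # the user explicitly asked for API/CLI mode


def choose_backend(check: BackendCheck) -> str:
    """Return "builtin" or "cli" (standing in for scripts/image_gen.py)."""
    if check.user_requested_cli:
        return "cli"
    if check.builtin_callable and check.builtin_covers_needs:
        return "builtin"
    # Built-in tool is unavailable or lacks a required capability.
    return "cli"
```

The result is a proposal only; the workflow still requires the user to confirm the backend before any slide image is generated.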

## Hard Constraints

- Read the relevant `Reference Map` files before each phase. This file is the orchestration contract; detailed rules live in `docs/` and worker prompts in `prompts/`.
- Respect approval gates. Do not create final `deck_spec.json`, `speech.md`, prompt jobs, slide images, or `.pptx` before the approvals in `docs/workflow-gates-and-progress.md`.
- After the user approves the sample slide and authorizes full-deck generation, every remaining slide image job must be dispatched to a slide subagent whenever subagents are available.
- The main agent owns orchestration, prompt jobs, state recording, QA, speaker notes, and assembly. Do not silently replace available slide subagents with sequential production.
- Every final `origin_image/slide_XX.png` must be generated by the selected image backend: either the built-in image generation/editing tool or `scripts/image_gen.py`.
- Local drawing, Pillow, SVG, HTML/CSS/canvas screenshots, python-pptx/PptxGenJS layouts, and manual overlays are failure modes, not fallbacks.
- The selected image backend must stay fixed after backend confirmation. Do not let subagents switch backend for convenience.
- After sample approval, record how the approved sample was generated and pass that exact method to every slide subagent.
- Slide dispatch and result state must be recorded with the bundled scripts. Chat messages alone do not make a slide dispatched or complete.
- If a required subagent, image backend, or required-image path is unavailable, stop and report a blocker with the slide id and evidence. Do not create a lower-quality replacement.

## Visible Progress

For non-trivial decks, keep a user-visible checklist with one active step. Canonical completion evidence is in `docs/workflow-gates-and-progress.md`.

Default visible steps:

1. Prepare source, outline, style, and backend decisions.
2. Generate and approve one sample slide.
3. Prepare slide jobs and slide state.
4. Dispatch slide subagents.
5. Record generated slide results.
6. QA, repair, notes, and PPT assembly.

Do not mark a step complete from chat alone; use real files or script-recorded state.
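
A minimal sketch of the checklist with exactly one active step is shown below. The `[x]`, `[>]`, and `[ ]` markers and the `render_checklist` helper are illustrative assumptions; the caller is expected to derive `done_count` from real files or script-recorded state, not from chat.

```python
STEPS = [
    "Prepare source, outline, style, and backend decisions.",
    "Generate and approve one sample slide.",
    "Prepare slide jobs and slide state.",
    "Dispatch slide subagents.",
    "Record generated slide results.",
    "QA, repair, notes, and PPT assembly.",
]


def render_checklist(steps, done_count):
    """Mark the first `done_count` steps complete and the next one active."""
    if not 0 <= done_count <= len(steps):
        raise ValueError("done_count is out of range")
    lines = []
    for index, step in enumerate(steps):
        if index < done_count:
            mark = "[x]"  # completed, backed by evidence
        elif index == done_count:
            mark = "[>]"  # the single active step
        else:
            mark = "[ ]"  # not started
        lines.append(f"{mark} {index + 1}. {step}")
    return "\n".join(lines)
```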

## Default Workflow

1. Understand the source content.
- Identify topic, audience, goal, page count, style/brand constraints, and sections to include or exclude.
- If no page count is specified, choose a practical count. Typical decks are 8-12 slides.

2. Plan the deck outline.
- Before writing or updating `outline.md`, read `docs/workflow-gates-and-progress.md` and `docs/outline-style-and-sample.md`.
- Draft slide roles and required source images. Ask for confirmation, and do not proceed to style, backend, sample, or downstream artifacts until the outline is approved.

3. Confirm a unified visual style.
- Before offering style options or using files from `references/`, read `docs/outline-style-and-sample.md`.
- Offer 2-3 concrete style directions, recommend one, wait for confirmation, then keep one visual identity while varying layouts by page role.

4. Confirm the image backend.
- Before generating any slide image, read `docs/backend-selection.md`.
- Check whether a built-in image tool is callable, state what you checked, name the backend, explain fallback status, and wait for confirmation.
- If CLI/API fallback is selected, read `docs/cli-api-fallback.md`. Read `docs/image-model-configuration.md` only after config errors or explicit API-setting requests.

5. Generate one sample slide for approval.
- Before generating or approving the sample slide, read `docs/outline-style-and-sample.md`.
- Generate exactly one representative sample after outline, style, and backend are confirmed. Do not generate the full deck until approved.
- After approval, record `sample_generation_method` in `deck_spec.json` so jobs and subagents inherit the same path.

6. Create the project directory.
- Before initializing folders or assembling files, read `docs/project-assembly-and-reporting.md`.
- If no destination is specified, use the current working directory or the source file directory.

7. Prepare user-supplied assets.
- Before using paper figures, charts, screenshots, logos, or other required assets, read `docs/user-supplied-assets.md`.
- Treat required assets as strict inputs and confirm slide-to-asset mapping before generation.

8. Generate all slide images.
- Before full-deck image generation, read `docs/slide-generation-and-subagents.md`.
- Create per-slide jobs with `scripts/prepare_slide_prompts.py` or saved `prompts/slide_XX.json` files.
- Every final image must come from the selected backend and be recorded with bundled state scripts.

9. Dispatch slide subagents.
- Before dispatching or replacing slide workers, read `docs/slide-generation-and-subagents.md` and `prompts/slide-worker.md`.
- Use one subagent per remaining slide job whenever possible. If required subagents cannot be spawned, stop and report a blocker unless the user changes the workflow.

10. Quality check and repair.
- Before QA or assembly, read `docs/project-assembly-and-reporting.md`.
- Inspect every slide before assembly: text, outline match, truncation, style, unwanted page numbers, overlaps, and required assets.
- Regenerate severe failures with a tighter prompt. Use backend editing for localized issues when available.
- For CLI/API fallback edit commands, read `docs/cli-api-fallback.md`. Replace the final slide only after validating the edited output.

11. Write speaker notes and assemble the PPT.
- Before writing `speech.md` or running assembly, read `docs/project-assembly-and-reporting.md`.
- Make sure `outline.md` reflects the final confirmed deck outline. Use `speech.md` headings that map to `Slide N`.
- Before assembly, ensure `slide_jobs.json` shows generated slides as `recorded` and approved samples as `accepted`. If any slide is `pending`, `dispatched`, or `blocked`, stop.

12. Report the result.
- Use the final report checklist in `docs/project-assembly-and-reporting.md`.
- Include paths, slide count, backend used, recorded-result status, and any limitations or blockers.

13. Save reusable styles when requested.
- If asked to save the current deck style or a supplied image/PDF/PPT/PPTX style, read `docs/style-library.md`.
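
The pre-assembly check in step 11 can be sketched as follows. The real schema of `slide_jobs.json` is defined by the bundled scripts and is not shown here, so the `slide_id` and `status` field names below are assumptions made for illustration.

```python
# Statuses that allow assembly, per step 11: generated slides are
# `recorded` and the approved sample is `accepted`.
READY_STATUSES = {"recorded", "accepted"}


def unfinished_slides(jobs):
    """Return (slide_id, status) pairs that should stop assembly.

    `jobs` is assumed to be a list of dicts with hypothetical keys
    "slide_id" and "status"; adapt this to the real file layout.
    """
    return [
        (job["slide_id"], job["status"])
        for job in jobs
        if job["status"] not in READY_STATUSES
    ]


if __name__ == "__main__":
    example = [
        {"slide_id": "slide_01", "status": "accepted"},
        {"slide_id": "slide_02", "status": "recorded"},
        {"slide_id": "slide_03", "status": "dispatched"},
    ]
    blockers = unfinished_slides(example)
    if blockers:
        print("Stop before assembly:", blockers)
```

Any status outside the ready set, including `pending`, `dispatched`, and `blocked`, is treated as a reason to stop.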

## Subagent Dispatch

Slide subagents are mandatory after sample approval whenever the runtime can spawn them. The main agent prepares jobs and records state; each worker handles exactly one `prompts/slide_XX.json` job and returns only the selected image path, the backend used, and a QA note.

Use `docs/slide-generation-and-subagents.md` for dispatch, commands, result recording, blockers, and backend provenance. Use `prompts/slide-worker.md` as the handoff template.

Subagents must not edit `outline.md`, `deck_spec.json`, other slide jobs, `origin_image/`, `speech.md`, or the final `.pptx`. The parent records outputs and assembles.
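
As a rough sketch, the parent could validate a worker's return value before recording it. The key names `image_path`, `backend`, and `qa_note` are hypothetical; the actual handoff format is the one defined in `prompts/slide-worker.md`.

```python
# Hypothetical key names for the three values a worker may return.
ALLOWED_KEYS = {"image_path", "backend", "qa_note"}


def validate_worker_result(result, confirmed_backend):
    """Return a list of problems; an empty list means the result looks valid."""
    problems = []
    extra = set(result) - ALLOWED_KEYS
    missing = ALLOWED_KEYS - set(result)
    if extra:
        problems.append(f"unexpected fields: {sorted(extra)}")
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    # The backend is fixed after confirmation; workers must not switch it.
    if "backend" in result and result["backend"] != confirmed_backend:
        problems.append(
            f"backend mismatch: got {result['backend']!r}, "
            f"expected {confirmed_backend!r}"
        )
    return problems
```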

## Acceptance Criteria

- Output is a valid `.pptx`.
- Each expected final slide image exists under `origin_image/slide_XX.png`.
- Every final slide image was generated by the confirmed backend and recorded through `record_slide_result.py`, except for an approved sample that the run state marks as `accepted`.
- `outline.md` reflects the approved deck outline.
- `speech.md` exists when speaker notes are expected, and assembly writes those notes into the PPT.
- `slide_jobs.json` and `slide_run_state.json` reflect the final state.
- Required source images are visibly represented, or a blocker is reported.
- If blocked, the final response identifies the phase, slide id, evidence path, and the reason the work is unfinished; do not call the deck complete.
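
The image-existence criterion above can be checked with a short script. This sketch assumes that `XX` in `slide_XX.png` is a two-digit, zero-padded, 1-based slide number; `missing_slide_images` is an illustrative helper, not one of the bundled scripts.

```python
from pathlib import Path


def missing_slide_images(project_dir, slide_count):
    """List expected final slide images that are absent from origin_image/.

    Assumes file names of the form slide_01.png, slide_02.png, and so on.
    """
    origin = Path(project_dir) / "origin_image"
    expected = [
        origin / f"slide_{number:02d}.png"
        for number in range(1, slide_count + 1)
    ]
    return [path.name for path in expected if not path.is_file()]
```

An empty list satisfies only this one criterion; provenance and recorded state still need to be confirmed from `slide_jobs.json` and `slide_run_state.json`.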

## Reference Map

- `docs/workflow-gates-and-progress.md`: approval gates, progress, completion evidence.
- `docs/backend-selection.md`: backend decision rules and confirmation text.
- `docs/outline-style-and-sample.md`: outline, style, sample rules, prompt examples.
- `docs/user-supplied-assets.md`: strict handling for required source assets.
- `docs/slide-generation-and-subagents.md`: jobs, dispatch, result recording, blockers, provenance.
- `docs/cli-api-fallback.md`: fallback runtime, generation/edit commands, image limits, troubleshooting.
- `docs/image-model-configuration.md`: API key, base URL, model, `.env`; read only when config is needed.
- `docs/project-assembly-and-reporting.md`: project directory, notes, assembly, final report, prompting principles.
- `prompts/slide-worker.md`: slide subagent handoff template.
- `references/*.md`: visual style references.

## Documentation and Updates

For source, docs, install, config, and examples, see [ningzimu/codex-ppt-skill](https://github.com/ningzimu/codex-ppt-skill).
