Run python examples in auto mode with logging, rerun helpers, and background control.
---
name: examples-auto-run
description: Run python examples in auto mode with logging, rerun helpers, and background control.
---
# examples-auto-run
## What it does
- Runs `uv run examples/run_examples.py` with:
- Optional dependency extras enabled by default:
`litellm`, `any-llm`, `sqlalchemy`, `redis`, `blaxel`, `modal`, `runloop`, and `temporal`.
- `EXAMPLES_INTERACTIVE_MODE=auto` (auto-input/auto-approve).
- Per-example logs under `.tmp/examples-start-logs/`.
- Main summary log path passed via `--main-log` (also under `.tmp/examples-start-logs/`).
- Generates a rerun list of failures at `.tmp/examples-rerun.txt` when `--write-rerun` is set.
- Provides start/stop/status/logs/tail/collect/rerun helpers via `run.sh`.
- Background option keeps the process running with a pidfile; `stop` cleans it up.
## Usage
```bash
# Start (auto mode; interactive included by default)
.agents/skills/examples-auto-run/scripts/run.sh start [extra args to run_examples.py]
# Examples:
.agents/skills/examples-auto-run/scripts/run.sh start --filter basic
.agents/skills/examples-auto-run/scripts/run.sh start --include-server --include-audio
# Check status
.agents/skills/examples-auto-run/scripts/run.sh status
# Stop running job
.agents/skills/examples-auto-run/scripts/run.sh stop
# List logs
.agents/skills/examples-auto-run/scripts/run.sh logs
# Tail latest log (or specify one)
.agents/skills/examples-auto-run/scripts/run.sh tail
.agents/skills/examples-auto-run/scripts/run.sh tail main_20260113-123000.log
# Collect rerun list from a main log (defaults to latest main_*.log)
.agents/skills/examples-auto-run/scripts/run.sh collect
# Rerun only failed entries from rerun file (auto mode)
.agents/skills/examples-auto-run/scripts/run.sh rerun
```
## Codex execution environment
- When Codex invokes `start` or `rerun`, run the command outside the Codex sandbox by default
(`sandbox_permissions=require_escalated`). Many examples start nested sandboxes, browsers, npm
helpers, or cloud/local service processes; running from inside the Codex sandbox can produce
environment-only failures such as `sandbox-exec: sandbox_apply: Operation not permitted`,
Playwright cache permission errors, or npm cache permission errors.
- Use sandboxed execution only when the user explicitly asks for it or when running a narrow dry-run
/ log inspection command that does not execute examples.
## Defaults (overridable via env)
- `EXAMPLES_INTERACTIVE_MODE=auto`
- `EXAMPLES_INCLUDE_INTERACTIVE=1`
- `EXAMPLES_INCLUDE_SERVER=0`
- `EXAMPLES_INCLUDE_AUDIO=0`
- `EXAMPLES_INCLUDE_EXTERNAL=0`
- `EXAMPLES_UV_EXTRAS="litellm any-llm sqlalchemy redis blaxel modal runloop temporal"` (set to an empty string to disable extras)
- Auto-approvals in auto mode: `APPLY_PATCH_AUTO_APPROVE=1`, `SHELL_AUTO_APPROVE=1`, `AUTO_APPROVE_MCP=1`
## Log locations
- Main logs: `.tmp/examples-start-logs/main_*.log`
- Per-example logs (from `run_examples.py`): `.tmp/examples-start-logs/<module_path>.log`
- Rerun list: `.tmp/examples-rerun.txt`
- Stdout logs: `.tmp/examples-start-logs/stdout_*.log`
## Notes
- The runner delegates to `uv run --extra ... examples/run_examples.py`, which already writes per-example logs and supports `--collect`, `--rerun-file`, and `--print-auto-skip`.
- `start` uses `--write-rerun` so failures are captured automatically.
- If `.tmp/examples-rerun.txt` exists and is non-empty, invoking the skill with no args runs `rerun` by default.
## Behavioral validation (Codex/LLM responsibility)
The runner does not perform any automated behavioral validation. After every foreground `start` or `rerun`, **Codex must manually validate** all exit-0 entries:
1. Read the example source (and comments) to infer intended flow, tools used, and expected key outputs.
2. Open the matching per-example log under `.tmp/examples-start-logs/`.
3. Confirm the intended actions/results occurred; flag omissions or divergences.
4. Do this for **all passed examples**, not just a sample.
5. Report immediately after the run with concise citations to the exact log lines that justify the validation.
don't have the plugin yet? install it then click "run inline in claude" again.