back
loading skill details...
Comprehensive Python/FastAPI backend code review with optional parallel agents
---
description: Comprehensive Python/FastAPI backend code review with optional parallel agents
name: review-python
disable-model-invocation: true
---
# Backend Code Review
## Hard gates (sequence)
Advance only when each **pass condition** is objectively satisfied (prevents linter-owned false positives and ungrounded findings):
| Gate | Pass condition |
|------|----------------|
| **G1 — Diff scope** | Step 1 command has been run; the changed `.py` paths are enumerated in writing (list may be empty — if empty, state that explicitly and do not invent Python findings). |
| **G2 — Linters before manual style/type** | For `ruff` and `mypy`: either no project config exists for that tool, **or** it was run on the changed files and you captured pass/fail (exit code or clear tool output). **Do not** add manual style or type findings for rules those tools already enforce when configured. |
| **G3 — Protocol and base skills** | The [review-verification-protocol](../review-verification-protocol/SKILL.md), [python-code-review](../python-code-review/SKILL.md), and [fastapi-code-review](../fastapi-code-review/SKILL.md) skills are loaded before Step 6 substantive review. |
| **G4 — Evidence per issue** | Step 7 checks are satisfied for each reported issue before it appears in the final list (re-read source, search references for “unused”, confirm framework handling for “missing”, verify syntax against current docs). |
| **G5 — Output contract** | Findings use sequential numbering, every issue has `FILE:LINE`, and the **Verdict** follows Step 8 (Critical/Major only block; Minor/Informational do not). |
## Arguments
- `--parallel`: If the agent supports subagents, run a specialized subagent per technology area (otherwise reviews run sequentially with identical output)
- Path: Target directory (default: current working directory)
## Step 1: Identify Changed Files
**Pass (G1):** Capture the command output (or equivalent) as your authoritative changed-`.py` set before Steps 2–3.
```bash
git diff --name-only $(git merge-base HEAD main)..HEAD | grep -E '\.py$'
```
## Step 2: Verify Linter Status
**CRITICAL**: Run project linters BEFORE flagging any style or type issues. **Pass (G2):** You may only proceed to Step 3 after each configured linter has been run on the changed files or you have recorded why it was skipped (missing config).
```bash
# Check if ruff config exists and run it
if [ -f "pyproject.toml" ] || [ -f "ruff.toml" ]; then
ruff check <changed_files>
fi
# Check if mypy config exists and run it
if [ -f "pyproject.toml" ] || [ -f "mypy.ini" ]; then
mypy <changed_files>
fi
```
**Rules:**
- If a linter passes for a specific rule (e.g., line length), DO NOT flag that issue manually
- Linter configuration is authoritative for style rules
- Only flag issues that linters cannot detect (semantic issues, architectural problems)
**Why:** Analysis of 24 review outcomes showed 4 false positives (17%) where reviewers flagged line-length violations that `ruff check` confirmed don't exist. The linter's configuration reflects intentional project decisions.
## Step 3: Detect Technologies
```bash
# Detect Pydantic-AI
grep -r "pydantic_ai\|@agent\.tool\|RunContext" --include="*.py" -l | head -3
# Detect SQLAlchemy
grep -r "from sqlalchemy\|Session\|relationship" --include="*.py" -l | head -3
# Detect Postgres-specific
grep -r "psycopg\|asyncpg\|JSONB\|GIN" --include="*.py" -l | head -3
# Check for test files
git diff --name-only $(git merge-base HEAD main)..HEAD | grep -E 'test.*\.py$'
```
## Step 4: Load Verification Protocol
Load the [review-verification-protocol](../review-verification-protocol/SKILL.md) skill and keep its checklist in mind throughout the review.
## Step 5: Load Skills
Load each applicable skill (read its `SKILL.md`) before reviewing its domain.
**Always load:**
- [python-code-review](../python-code-review/SKILL.md)
- [fastapi-code-review](../fastapi-code-review/SKILL.md)
**Conditionally load based on detection:**
| Condition | Skill |
|-----------|-------|
| Test files changed | [pytest-code-review](../pytest-code-review/SKILL.md) |
| Pydantic-AI detected | [pydantic-ai-common-pitfalls](../../../beagle-ai/skills/pydantic-ai-common-pitfalls/SKILL.md) |
| SQLAlchemy detected | [sqlalchemy-code-review](../sqlalchemy-code-review/SKILL.md) |
| Postgres detected | [postgres-code-review](../postgres-code-review/SKILL.md) |
## Step 6: Review
**If the agent supports subagents**, dispatch one per technology area in parallel; **otherwise** run the same areas sequentially, producing identical output.
**Sequential (default, or when subagents are unavailable):**
1. Load applicable skills
2. Review Python quality issues first
3. Review FastAPI patterns
4. Review detected technology areas
5. Consolidate findings
**Parallel (--parallel flag, when the agent supports subagents):**
1. Detect all technologies upfront
2. Dispatch one subagent per technology area
3. Each subagent loads its skill and reviews its domain
4. Wait for all subagents
5. Consolidate findings
### Before Flagging Optimization or Pattern Issues
1. **Check project conventions** (e.g. AGENTS.md or CLAUDE.md) for documented intentional patterns
2. **Check code comments** around the flagged area for "intentional", "optimization", or "NOTE:"
3. **Trace the code path** before claiming missing coverage or inconsistent handling
4. **Consider framework idioms** - what looks wrong generically may be correct for the framework
**Why:** Analysis showed rejections where reviewers flagged "inconsistent error handling" that was intentional optimization, and "missing test coverage" for code paths that don't exist.
## Step 7: Verify Findings
**Pass (G4):** No issue ships until all bullets below are true for that issue.
Before reporting any issue:
1. Re-read the actual code (not just diff context)
2. For "unused" claims - did you search all references?
3. For "missing" claims - did you check framework/parent handling?
4. For syntax issues - did you verify against current version docs?
5. Remove any findings that are style preferences, not actual issues
## Step 8: Review Convergence
**Pass (G5):** Final markdown matches the Output Format template; verdict line reflects only Critical/Major blockers per scope rules below.
### Single-Pass Completeness
You MUST report ALL issues across ALL categories (style, logic, types, tests, security, performance) in a single review pass. Do not hold back issues for later rounds.
Before submitting findings, ask yourself:
- "If all my recommended fixes are applied, will I find NEW issues in the fixed code?"
- "Am I requesting new code (tests, types, modules) that will itself need review?"
If yes to either: include those anticipated downstream issues NOW, in this review, so the author can address everything at once.
### Scope Rules
- Review ONLY the code in the diff and directly related existing code
- Do NOT request new features, test infrastructure, or architectural changes that didn't exist before the diff
- If test coverage is missing, flag it as ONE Minor issue ("Missing test coverage for X, Y, Z") — do NOT specify implementation details like mock libraries, behaviour extraction, or dependency injection patterns that would introduce substantial new code
- Typespecs, documentation, and naming issues are Minor unless they affect public API contracts
- Do NOT request adding new dependencies (e.g. Mox, testing libraries, linter plugins)
### Fix Complexity Budget
Fixes to existing code should be flagged at their real severity regardless of size.
However, requests for **net-new code that didn't exist before the diff** must be classified as Informational:
- Adding a new dependency (e.g. Mox, a linter plugin)
- Creating entirely new modules, files, or test suites
- Extracting new behaviours, protocols, or abstractions
These are improvement suggestions for the author to consider in future work, not review blockers.
### Iteration Policy
If this is a re-review after fixes were applied:
- ONLY verify that previously flagged issues were addressed correctly
- Do NOT introduce new findings unrelated to the previous review's issues
- Accept Minor/Nice-to-Have issues that weren't fixed — do not re-flag them
- The goal of re-review is VERIFICATION, not discovery
## Output Format
```markdown
## Review Summary
[1-2 sentence overview of findings]
## Issues
### Critical (Blocking)
1. [FILE:LINE] ISSUE_TITLE
- Issue: Description of what's wrong
- Why: Why this matters (bug, type safety, security)
- Fix: Specific recommended fix
### Major (Should Fix)
2. [FILE:LINE] ISSUE_TITLE
- Issue: ...
- Why: ...
- Fix: ...
### Minor (Nice to Have)
N. [FILE:LINE] ISSUE_TITLE
- Issue: ...
- Why: ...
- Fix: ...
### Informational (For Awareness)
N. [FILE:LINE] SUGGESTION_TITLE
- Suggestion: ...
- Rationale: ...
## Good Patterns
- [FILE:LINE] Pattern description (preserve this)
## Verdict
Ready: Yes | No | With fixes 1-N (Critical/Major only; Minor items are acceptable)
Rationale: [1-2 sentences]
```
## Post-Fix Verification
After fixes are applied, run:
```bash
ruff check .
mypy .
pytest
```
All checks must pass before approval.
## Rules
- Load skills BEFORE reviewing (not after)
- Number every issue sequentially (1, 2, 3...)
- Include FILE:LINE for each issue
- Separate Issue/Why/Fix clearly
- Categorize by actual severity
- Run verification after fixes
- Report ALL issues in a single pass — do not hold back findings for later iterations
- Re-reviews verify previous fixes ONLY — no new discovery
- Requests for net-new code (new modules, dependencies, test suites) are Informational, not blocking
- The Verdict ignores Minor and Informational items — only Critical and Major block approval
don't have the plugin yet? install it then click "run inline in claude" again.