---
name: review-llm-artifacts
description: Detects common LLM coding agent artifacts across four categories (tests, dead code, abstraction, style) over the project or changed files — using parallel subagents when the agent supports them, otherwise four sequential passes. Scans files changed since main by default; use --all for full-project scan. Triggers on LLM cruft cleanup, agent-generated code review, dead code sweeps, test-quality passes, or when the user asks to scan the whole repo.
disable-model-invocation: true
---

# LLM Artifacts Review

Detect common artifacts left behind by LLM coding agents: over-abstraction, dead code, DRY violations in tests, verbose comments, and defensive overkill.

## Hard gates (sequence)

Advance past a gate only when its **pass condition** is objectively true. This prevents declaring “review complete” before the required outputs (category results, JSON report) actually exist:

| Gate | Pass condition |
|------|----------------|
| **G1 — Scope** | File list is non-empty *or* you exit with exactly the Step 1 message; `scope` is set to `all` or `changed`. |
| **G2 — Four categories** | Tests, dead code, abstraction, and style are each reviewed (four parallel subagent runs when supported, or four sequential passes covering the same categories). **Stop** if any category did not complete; do not write JSON or a summary that implies a full pass. |
| **G3 — JSON before summary** | `.beagle/llm-artifacts-review.json` exists and is valid JSON **before** Step 6 markdown. |
| **G4 — Integrity** | Step 7 checks pass before treating the run as complete. |

## Arguments

Parse `$ARGUMENTS` for flags and optional path:

| Flag | Effect |
|------|--------|
| *(default)* | **Changed-files scope** — only files changed since `git merge-base HEAD main` (PR-style scope) |
| `--all` | Full project scan — all matching source files under the target path |
| `--parallel` | Force parallel execution where subagents are supported (default when 4+ files in scope) |
| Path | Root directory to scan (default: current working directory) |
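
The flag handling in the table above can be sketched as follows. This is a minimal illustration, assuming `$ARGUMENTS` arrives as one whitespace-separated string; `parse_arguments`, `Options`, and the choice to reject unknown flags are hypothetical and not part of the skill.

```python
import shlex
from dataclasses import dataclass


@dataclass
class Options:
    scope: str = "changed"   # "changed" (default) or "all"
    parallel: bool = False   # True only when --parallel is passed explicitly
    target: str = "."        # root directory to scan


def parse_arguments(arguments: str) -> Options:
    """Split the raw argument string and pick out the two flags and the path."""
    opts = Options()
    for token in shlex.split(arguments):
        if token == "--all":
            opts.scope = "all"
        elif token == "--parallel":
            opts.parallel = True
        elif token.startswith("--"):
            # Assumption: unknown flags are an error rather than silently ignored.
            raise ValueError(f"unknown flag: {token}")
        else:
            # Assumption: the last positional argument is the path.
            opts.target = token
    return opts


print(parse_arguments("--all src/api"))
```

Keeping `parallel` as an explicit-flag boolean leaves the "4+ files" default to Step 3, where the file count is known.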

## Step 1: Determine Scope

**A. Changed files only (default):**

Resolve the base ref explicitly and fail loudly if none exists — **do not** wrap the `git merge-base` call in `|| true`, which would silently swallow a missing `main`/`master` ref and report "no files to scan" on repos that only have `origin/main` or use `master`. If no base ref is found, suggest the user pass `--all` instead of silently falling back.

```bash
BASE=$(for ref in main origin/main master origin/master; do
         git rev-parse --verify "$ref" >/dev/null 2>&1 && { echo "$ref"; break; }
       done)
if [ -z "$BASE" ]; then
  echo "error: no main/master ref found (checked main, origin/main, master, origin/master). Pass --all for a full-project scan." >&2
  exit 1
fi
MERGE_BASE=$(git merge-base HEAD "$BASE") || {
  echo "error: git merge-base HEAD $BASE failed." >&2
  exit 1
}
git diff --name-only "$MERGE_BASE..HEAD" | grep -E '\.(py|ts|tsx|js|jsx|go|rs|java|rb|swift|kt)$' || true
```

(The trailing `|| true` on the `grep` is intentional — zero source-file matches is a legitimate empty-scope result, distinct from a failed base-ref resolution.)

**B. Full project (`--all`):**

From `TARGET` (default `.`), list source files and **prune** excluded dependency/build trees so `find` never descends into them. `! -path "*/foo/*"` only filters the output; `find` still walks the tree (minutes of wasted I/O on large `node_modules`, `target`, etc.). Use `-prune` instead:

```bash
find "$TARGET" \
  \( -type d \( \
       -name node_modules -o -name .git -o -name vendor -o -name __pycache__ \
    -o -name .venv        -o -name venv -o -name dist   -o -name build \
    -o -name target       -o -name .next -o -name coverage -o -name .turbo \
  \) -prune \) -o \
  \( -type f \( \
       -name "*.py" -o -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" \
    -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.rb" \
    -o -name "*.swift" -o -name "*.kt" \
  \) -print \)
```

**Large repos:** The `--all` path can produce huge file lists. If file count exceeds **400**, warn and suggest narrowing: pass a subdirectory as `TARGET`, or drop `--all` to fall back to the default changed-files scope. Still proceed unless the user explicitly cancels. (This warning does **not** fire on the default changed-files scope, which is already bounded by the PR diff.)

If no files are found, exit with:

`No files to scan. Check the path, branch, or pass --all for a full-project scan.`

Set `scope` in the report: `"all"` for `--all`, `"changed"` for the default changed-files scope.
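
The three outcomes of Step 1 (empty scope, oversized `--all` scan, normal run) can be sketched as one small function. The names `evaluate_scope`, `NO_FILES_MESSAGE`, and `LARGE_REPO_THRESHOLD` are hypothetical; the message text and the threshold of 400 come from the text above.

```python
from typing import List, Optional, Tuple

NO_FILES_MESSAGE = (
    "No files to scan. Check the path, branch, or pass --all for a full-project scan."
)
LARGE_REPO_THRESHOLD = 400  # from the "Large repos" note


def evaluate_scope(files: List[str], use_all: bool) -> Tuple[str, Optional[str]]:
    """Return (scope, warning); exit with the Step 1 message if the list is empty."""
    if not files:
        raise SystemExit(NO_FILES_MESSAGE)
    scope = "all" if use_all else "changed"
    warning = None
    # The size warning applies only to --all; the changed-files scope is
    # already bounded by the PR diff.
    if use_all and len(files) > LARGE_REPO_THRESHOLD:
        warning = (
            f"{len(files)} files in scope; consider passing a subdirectory "
            "as the target or dropping --all."
        )
    return scope, warning
```

The warning is returned rather than raised because the run still proceeds unless the user cancels.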

## Step 2: Detect Languages

Extract the unique file extensions from the Step 1 file list (held here, one path per line, in `$FILES`):

```bash
echo "$FILES" | sed 's/.*\.//' | sort -u
```

Map extensions to language names for the report:
- `.py` -> Python
- `.ts`, `.tsx` -> TypeScript
- `.js`, `.jsx` -> JavaScript
- `.go` -> Go
- `.rs` -> Rust
- `.java` -> Java
- `.rb` -> Ruby
- `.swift` -> Swift
- `.kt` -> Kotlin
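
The mapping above can be expressed as a lookup table. This is a sketch only: `LANGUAGES` and `detect_languages` are hypothetical names, and sorting the result alphabetically is a choice made here for stable output, not something the skill requires.

```python
import os
from typing import Iterable, List

# Extension-to-language mapping taken from the list above.
LANGUAGES = {
    ".py": "Python",
    ".ts": "TypeScript",
    ".tsx": "TypeScript",
    ".js": "JavaScript",
    ".jsx": "JavaScript",
    ".go": "Go",
    ".rs": "Rust",
    ".java": "Java",
    ".rb": "Ruby",
    ".swift": "Swift",
    ".kt": "Kotlin",
}


def detect_languages(files: Iterable[str]) -> List[str]:
    """Return the de-duplicated language names for a list of paths."""
    found = set()
    for path in files:
        ext = os.path.splitext(path)[1]  # e.g. ".tsx"; "" when there is no extension
        if ext in LANGUAGES:
            found.add(LANGUAGES[ext])
    return sorted(found)


# Paths below are made up for illustration.
print(detect_languages(["src/app.tsx", "src/util.ts", "cmd/main.go", "README.md"]))
# prints ['Go', 'TypeScript']
```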

## Step 3: Review the Four Categories

Cover all four categories below. **If the agent supports subagents** and file count >= 4 (or `--parallel` is set), dispatch one subagent per category in parallel. **Otherwise**, run the four category reviews sequentially yourself, producing the same findings. Either way:

1. Load the [llm-artifacts-detection](../llm-artifacts-detection/SKILL.md) skill
2. Review each category (one per subagent when parallel, one pass at a time when sequential)
3. Collect findings in the structured format below
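
The parallel-versus-sequential decision can be sketched as below. Threads stand in for subagents purely for illustration; `use_parallel`, `run_reviews`, and the `review` callback are hypothetical, and a real agent would dispatch subagents through its own mechanism.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

CATEGORIES = ["tests", "dead_code", "abstraction", "style"]


def use_parallel(file_count: int, supports_subagents: bool, force_parallel: bool) -> bool:
    """Parallel only when subagents exist and (4+ files or --parallel)."""
    return supports_subagents and (file_count >= 4 or force_parallel)


def run_reviews(
    review: Callable[[str], List[str]],
    file_count: int,
    supports_subagents: bool,
    force_parallel: bool = False,
) -> Dict[str, List[str]]:
    """Run `review` once per category and return findings keyed by category.

    Any exception raised by a category review propagates to the caller, so a
    failed category stops the run instead of yielding a partial result.
    """
    if use_parallel(file_count, supports_subagents, force_parallel):
        with ThreadPoolExecutor(max_workers=len(CATEGORIES)) as pool:
            results = list(pool.map(review, CATEGORIES))  # order is preserved
    else:
        results = [review(category) for category in CATEGORIES]
    return dict(zip(CATEGORIES, results))
```

Both branches return the same shape, which is what lets Step 4 treat parallel and sequential runs identically.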

### Category 1: Tests

**Focus:** Testing anti-patterns from LLM generation

- DRY violations (repeated setup code, duplicate assertions)
- Testing library/framework code instead of application logic
- Wrong mock boundaries (mocking too much or too little)
- Overly verbose test names that describe implementation
- Tests that just mirror the implementation

### Category 2: Dead Code

**Focus:** Unused or obsolete code

- Unused imports, variables, functions, classes
- TODO/FIXME comments that should have been resolved
- Backwards compatibility code for removed features
- Orphaned test files for deleted code
- Commented-out code blocks
- Feature flags that are always on/off

### Category 3: Abstraction

**Focus:** Over-engineering patterns

- Unnecessary abstraction layers (interfaces for single implementations)
- Copy-paste drift (similar code that diverged slightly)
- Over-configuration (configurable things that never change)
- Premature generalization
- Factory/Builder patterns for simple object creation
- Deep inheritance hierarchies

### Category 4: Style

**Focus:** Verbose or defensive patterns

- Verbose comments explaining obvious code
- Defensive overkill (null checks on non-nullable values)
- Unnecessary type hints (dynamic languages with obvious types)
- Overly explicit error messages
- Redundant logging
- Documentation added to code that is already self-documenting

## Step 4: Consolidate Findings

**Prerequisite:** **G2** satisfied (all four category reviews finished successfully).

Once all four category reviews have completed (parallel subagents or sequential passes), then:

1. Merge all findings into a single list
2. Assign unique IDs (1, 2, 3...)
3. Group by category for display

**Echo before write (anti-confabulation):** Every finding written to JSON MUST come from a category review's `[FILE:LINE] **ISSUE_TYPE**` output (see Output Format for Each Finding), not from the branch name, directory, or your own inference. After assigning ids, echo the consolidated table (`id | category | file:line | description`) and confirm each row traces to a specific category result. Do not add findings that no category review reported.

**ID lock:** Ids are contiguous `1..N` with no gaps or duplicates. This `1..N` set is the **locked id set** that downstream skills ([verify-llm-artifacts](../verify-llm-artifacts/SKILL.md), [fix-llm-artifacts](../fix-llm-artifacts/SKILL.md)) bind to 1:1. `summary.total` MUST equal `N`, and `summary.by_category` counts MUST sum to `N`. State the id set before writing JSON.
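
The merge, id assignment, and echo table described above can be sketched as follows. `consolidate` and `echo_table` are hypothetical names, the input shape (a dict of finding lists keyed by category) is an assumption, and the sample findings are made up.

```python
from typing import Dict, List

CATEGORY_ORDER = ["tests", "dead_code", "abstraction", "style"]


def consolidate(results: Dict[str, List[dict]]) -> List[dict]:
    """Merge per-category findings and assign contiguous ids 1..N."""
    missing = [c for c in CATEGORY_ORDER if c not in results]
    if missing:
        # G2: stop rather than imply a full pass.
        raise RuntimeError(f"category review did not complete: {missing}")
    findings: List[dict] = []
    for category in CATEGORY_ORDER:
        for raw in results[category]:
            findings.append({**raw, "id": len(findings) + 1, "category": category})
    return findings


def echo_table(findings: List[dict]) -> str:
    """Render the `id | category | file:line | description` echo table."""
    rows = ["id | category | file:line | description"]
    for f in findings:
        rows.append(
            f"{f['id']} | {f['category']} | {f['file']}:{f['line']} | {f['description']}"
        )
    return "\n".join(rows)


# Made-up category results for illustration only.
results = {
    "tests": [{"file": "tests/test_api.py", "line": 12, "description": "Repeated setup code"}],
    "dead_code": [{"file": "src/util.py", "line": 3, "description": "Unused import"}],
    "abstraction": [],
    "style": [],
}
findings = consolidate(results)
print(echo_table(findings))
```

Because ids are derived from list position while merging, gaps and duplicates cannot occur unless the list is edited afterwards.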

## Step 5: Write JSON Report

Create `.beagle` directory if it doesn't exist:

```bash
mkdir -p .beagle
```

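Write findings to `.beagle/llm-artifacts-review.json`. In the template below, `"a" | "b"` lists the allowed values for a field; the written file must contain exactly one of them so that it remains valid JSON: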

```json
{
  "version": "1.0.0",
  "created_at": "2024-01-15T10:30:00Z",
  "git_head": "abc1234",
  "scope": "all" | "changed",
  "target": ".",
  "files_scanned": 42,
  "languages": ["Python", "TypeScript", "Go"],
  "findings": [
    {
      "id": 1,
      "category": "tests" | "dead_code" | "abstraction" | "style",
      "type": "dry_violation" | "unused_import" | "over_abstraction" | "verbose_comment" | "...",
      "file": "src/utils/helper.py",
      "line": 42,
      "description": "Repeated setup code in 5 test functions",
      "suggestion": "Extract to a pytest fixture",
      "risk": "Low" | "Medium" | "High",
      "fix_safety": "Safe" | "Needs review",
      "fix_action": "refactor" | "delete" | "simplify" | "extract"
    }
  ],
  "summary": {
    "total": 15,
    "by_category": {
      "tests": 4,
      "dead_code": 5,
      "abstraction": 3,
      "style": 3
    },
    "by_risk": {
      "High": 2,
      "Medium": 8,
      "Low": 5
    },
    "by_fix_safety": {
      "Safe": 10,
      "Needs review": 5
    }
  }
}
```
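
One way to keep the `summary` block consistent with `findings` is to derive every count from the findings list instead of writing the numbers by hand. The sketch below assumes the field names and allowed values from the template above; `count_by`, `build_summary`, and `write_report` are hypothetical helpers, and the example writes to a temporary directory so it leaves no files behind.

```python
import json
import os
import tempfile
from collections import Counter
from datetime import datetime, timezone
from typing import Iterable, List

CATEGORIES = ["tests", "dead_code", "abstraction", "style"]
RISKS = ["High", "Medium", "Low"]
FIX_SAFETY = ["Safe", "Needs review"]


def count_by(findings: List[dict], key: str, allowed: Iterable[str]) -> dict:
    """Count findings per allowed value of `key`, keeping zero counts."""
    counts = Counter(f[key] for f in findings)
    return {value: counts.get(value, 0) for value in allowed}


def build_summary(findings: List[dict]) -> dict:
    """Derive the summary block from the findings list."""
    return {
        "total": len(findings),
        "by_category": count_by(findings, "category", CATEGORIES),
        "by_risk": count_by(findings, "risk", RISKS),
        "by_fix_safety": count_by(findings, "fix_safety", FIX_SAFETY),
    }


def write_report(report: dict, directory: str = ".beagle") -> str:
    """Write the report as JSON, creating the directory if needed; return the path."""
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, "llm-artifacts-review.json")
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(report, handle, indent=2)
    return path


# A single finding reusing the sample values from the template.
findings = [{
    "id": 1, "category": "tests", "type": "dry_violation",
    "file": "src/utils/helper.py", "line": 42,
    "description": "Repeated setup code in 5 test functions",
    "suggestion": "Extract to a pytest fixture",
    "risk": "Low", "fix_safety": "Safe", "fix_action": "extract",
}]
report = {
    "version": "1.0.0",
    "created_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    "git_head": "abc1234",  # placeholder; a real run would record the current HEAD
    "scope": "changed",
    "target": ".",
    "files_scanned": 1,
    "languages": ["Python"],
    "findings": findings,
    "summary": build_summary(findings),
}
with tempfile.TemporaryDirectory() as tmp:
    path = write_report(report, os.path.join(tmp, ".beagle"))
    with open(path, encoding="utf-8") as handle:
        print(json.load(handle)["summary"]["total"])
```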

## Step 6: Display Summary

**Prerequisite:** **G3** satisfied (JSON on disk and parseable).

```markdown
## LLM Artifacts Review

**Scope:** Changed files since merge-base with main | Entire project under `<path>` (when `--all`)
**Files scanned:** 42
**Languages:** Python, TypeScript, Go

### Findings by Category
...
### Summary Table
...
### Next Steps

- Run the [verify-llm-artifacts](../verify-llm-artifacts/SKILL.md) skill to confirm findings and drop false positives before fixing.
- Run the [fix-llm-artifacts](../fix-llm-artifacts/SKILL.md) skill after verification (or to preview safe-only fixes).
- Review the JSON report at `.beagle/llm-artifacts-review.json`
```

## Step 7: Verification (report integrity)

Before completing, verify the review executed correctly:

1. **JSON validity:** Confirm `.beagle/llm-artifacts-review.json` exists and is parseable
2. **Category coverage:** All 4 category reviews completed without errors (parallel subagents or sequential passes)
3. **Git HEAD captured:** The `git_head` field is non-empty in the report
4. **Staleness check:** If a previous report exists, compare stored `git_head` to current HEAD and warn if different
5. **ID + count integrity:** Finding ids are contiguous `1..N`; `summary.total == N`; `summary.by_category` sums to `N`. A mismatch means a finding was added, dropped, or duplicated — fix before completing.

```bash
python3 -c "import json; json.load(open('.beagle/llm-artifacts-review.json'))" 2>/dev/null && echo "✓ Valid JSON" || echo "✗ Invalid JSON"

python3 - <<'PY'
import json
r = json.load(open('.beagle/llm-artifacts-review.json'))
ids = [x['id'] for x in r['findings']]
n = len(ids)
ok = ids == list(range(1, n + 1)) and r['summary']['total'] == n \
     and sum(r['summary']['by_category'].values()) == n
print("✓ ids 1..N and counts consistent" if ok else f"✗ id/count mismatch: ids={ids} total={r['summary']['total']}")
PY

STORED_HEAD=$(jq -r '.git_head' .beagle/llm-artifacts-review.json 2>/dev/null)
CURRENT_HEAD=$(git rev-parse --short HEAD)
if [ "$STORED_HEAD" != "$CURRENT_HEAD" ]; then
  echo "⚠️ Report was generated on $STORED_HEAD, current HEAD is $CURRENT_HEAD"
fi
```

If any verification fails, report the error and do not proceed.
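
Where `jq` is not installed, the staleness comparison can be sketched in Python instead. This is an illustration under assumptions: `stored_head` and `is_stale` are hypothetical helpers, the current HEAD is passed in as a string rather than read from git, and the prefix comparison (to tolerate abbreviated hashes of different lengths) is a choice made here, not something the skill specifies.

```python
import json
from typing import Optional


def stored_head(report_path: str) -> Optional[str]:
    """Return the report's git_head, or None if it is missing or unreadable."""
    try:
        with open(report_path, encoding="utf-8") as handle:
            return json.load(handle).get("git_head") or None
    except (OSError, ValueError, AttributeError):
        # Missing file, invalid JSON, or a top-level value that is not an object.
        return None


def is_stale(stored: Optional[str], current: str) -> bool:
    """True when the report was generated on a different commit than `current`."""
    if not stored or not current:
        return True  # an empty git_head fails check 3, so treat it as stale
    # Compare only the shared prefix so "abc1234" matches a longer form of the same hash.
    shorter = min(len(stored), len(current))
    return stored[:shorter] != current[:shorter]


print(is_stale("abc1234", "abc1234def5678"))  # prints False
```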

**Finding-level verification** (precision, not JSON syntax) is a **separate** skill: [verify-llm-artifacts](../verify-llm-artifacts/SKILL.md) — run it before mass deletes or `--fix` on risky items.

## Output Format for Each Finding

```text
[FILE:LINE] **ISSUE_TYPE** (Risk, Fix Safety)
- Description
- Suggestion: Specific fix recommendation
```
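
A finding from the JSON report can be rendered in this format with a small helper. The sketch assumes the report's `type` field supplies `ISSUE_TYPE`; `format_finding` is a hypothetical name, and the sample values are the ones used in the Step 5 template.

```python
def format_finding(finding: dict) -> str:
    """Render one finding in the three-line output format."""
    return (
        f"[{finding['file']}:{finding['line']}] **{finding['type']}** "
        f"({finding['risk']}, {finding['fix_safety']})\n"
        f"- {finding['description']}\n"
        f"- Suggestion: {finding['suggestion']}"
    )


example = {
    "file": "src/utils/helper.py",
    "line": 42,
    "type": "dry_violation",
    "risk": "Low",
    "fix_safety": "Safe",
    "description": "Repeated setup code in 5 test functions",
    "suggestion": "Extract to a pytest fixture",
}
print(format_finding(example))
```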

## Rules

- Follow **Hard gates** order; do not skip **G3** (JSON before Step 6).
- **Anti-confabulation:** every finding must trace to a category review's `[FILE:LINE]` output (Step 4 echo); never invent findings from the branch name, directory, or inference. See the [review-verification-protocol](../review-verification-protocol/SKILL.md) skill → Anti-confabulation (gate 0).
- Always load the [llm-artifacts-detection](../llm-artifacts-detection/SKILL.md) skill first
- Use parallel subagents (when the agent supports them) for the four category reviews when >= 4 files; otherwise run them sequentially
- Every finding MUST have file:line reference
- Categorize risk honestly (don't inflate or deflate)
- Mark fix safety as "Safe" only if change is mechanical and reversible
- Create `.beagle` directory if needed
- Write JSON report before displaying summary
- Default scope is **changed files since merge-base with main**; pass `--all` for a full-project scan