Audit and quality-review AI agent skills (SKILL.md files) against established best practices from the Nevo Systems skill-writing framework. Checks trigger qu...

SKILL.md

---
name: skill-quality-audit
description: >
  Audit and quality-review AI agent skills (SKILL.md files) against
  established best practices from the Nevo Systems skill-writing
  framework. Checks trigger quality, structure (rigid vs flexible),
  conciseness, progressive disclosure, and common mistakes.
  Use when reviewing, auditing, or evaluating an existing skill,
  when a skill isn't performing well, when the user says
  "review this skill", "audit my skill", "check this SKILL.md",
  "skill quality check", or "/skill-review".
---

# Skill Review

Audit a SKILL.md file against the following quality dimensions.
Report findings with severity (critical / warning / nit) and
actionable fix suggestions.

## Review Dimensions

Evaluate in order. Stop at critical findings — fix those first.

### 1. Trigger Quality

The `description` field in YAML frontmatter is the ONLY trigger mechanism.
It must answer three questions clearly:

- **What** does this skill do? (1-2 sentences)
- **When** should the agent use it? (contexts, task types)
- **What trigger phrases** activate it? (exact words users might say)

Red flags:

- [ ] Description is vague (<2 sentences or generic like "Helps with X")
- [ ] Trigger phrases are missing or too broad
- [ ] Trigger conditions are in the body, not the frontmatter
- [ ] Skill would trigger when irrelevant (overly broad scope)

### 2. Structure: Rigid vs Flexible

Determine if the skill matches the right structural approach:

| Approach    | Use When                          | Signs of Mismatch              |
|-------------|-----------------------------------|--------------------------------|
| **Rigid**   | Fragile task, consistency critical, skipping steps = failure | Loose guidelines for DB migration |
| **Flexible** | Multiple valid approaches, requires judgment | Overly prescriptive for code review |

Decision rule: **If the agent deviates from instructions, how much damage?**
- High damage → Must be rigid (numbered steps, checklists, exact commands)
- Low damage → Flexible OK (guidelines, heuristics, decision frameworks)

### 3. Conciseness

- [ ] Body is under 500 lines
- [ ] No explanations of things the model already knows (e.g., "write clear English", "use proper indentation")
- [ ] No conceptual documentation — instructions only (not "X is a system that...", but "Run X with these flags")
- [ ] Each paragraph justifies its token cost

### 4. Progressive Disclosure

Check the three-level loading strategy:

- **Level 1** (always in context): Frontmatter name + description — under ~100 words
- **Level 2** (on trigger): SKILL.md body — under 5,000 words, self-contained enough to start work
- **Level 3** (on demand): References/ — loaded only when needed

Red flags:

- [ ] SKILL.md body contains detailed API docs, schemas, or variant-specific content → move to `references/`
- [ ] Multiple variants/frameworks live in body instead of separate reference files
- [ ] No references/ directory exists but body exceeds 500 lines
- [ ] Reference files not explicitly mentioned in SKILL.md body

### 5. Common Mistakes Checklist

- [ ] **Over-explaining the obvious**: Instructions like "write grammatically correct English" or "use proper code formatting". The model already does this. **Remove.**
- [ ] **Trigger info in body**: "When to Use This Skill" section in the body is useless — only frontmatter matters for triggering. **Move to description or delete.**
- [ ] **Documentation instead of instructions**: Explaining what something is instead of telling the agent what to do. Replace "Database migrations are schema changes..." with "Back up the database before migration."
- [ ] **Too many tiny skills**: If two skills always trigger together, merge them. If a skill is under 20 lines after removing fluff, consider if it's too granular.
- [ ] **Context budget ignored**: Skill consumes excessive tokens → use progressive disclosure.
- [ ] **Bad naming**: `skill1`, `my-skill`, `helper`, `utils` → use descriptive, verb-led hyphen-case names.

### 6. Bundled Resources Check

- `scripts/`: Are they tested? Do they solve a repeatedly-rewritten problem?
- `references/`: Are they referenced from SKILL.md with clear "when to read" guidance?
- `assets/`: Are they output-oriented (templates, images), not documentation?
- No extraneous files (README.md, CHANGELOG.md, INSTALLATION_GUIDE.md, etc.)

### 7. Trigger Testing (if applicable)

If a test plan exists or can be inferred:

- List 3 queries that SHOULD trigger this skill
- List 3 queries that SHOULD NOT trigger this skill
- Flag any overlap or ambiguity

## Output Format

```markdown
## Skill Review: [skill-name]

### Summary
[1-2 sentences: overall quality, primary issues]

### Findings

| # | Severity | Dimension | Issue | Fix |
|---|----------|-----------|-------|-----|
| 1 | critical | Trigger | ... | ... |

### Trigger Test
- ✅ Should trigger: ...
- ❌ Should NOT trigger: ...

### Score
- Trigger: X/10
- Structure: X/10
- Conciseness: X/10
- Progressive Disclosure: X/10
- Overall: X/10
```

Skill Quality Audit

SKILL.md

related skills