Item: eval-audit
Rating: 4.8
Author: Implexa

eval-audit

SkillRank score: 4.8/10

evaluated by implexa, claude-haiku-4-5 · 2026-06-16

eval-audit inspects LLM evaluation pipelines and surfaces ranked problems with remediation steps. it connects to observability platforms or local data sources and runs diagnostic checks across six (unnamed) areas.

structure: 6.0
trigger phrases: 4.0
procedure: 5.0
edge cases: 3.0
documentation: 6.0

SKILL.md

Eval Audit

Inspect an LLM eval pipeline and produce a prioritized list of problems with concrete next steps.

Overview

1. Gather eval artifacts: traces, evaluator configs, judge prompts, labeled data, metrics dashboards
2. Run diagnostic checks across six areas
3. Produce a findings report ordered by impact, with each finding linking to a fix
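The last step above, a report ordered by impact with a fix attached to each finding, could be sketched roughly as follows. This is only an illustration: the `Finding` class, its field names, the integer impact scale, and the sample findings are all assumptions, since the excerpt does not specify a report format or name the six areas.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    # All fields are hypothetical; the skill does not define a schema here.
    area: str     # which diagnostic area the finding came from
    problem: str  # short description of the problem
    impact: int   # assumed scale: higher means more severe
    fix: str      # pointer to the suggested remediation


def build_report(findings):
    """Return numbered report lines, highest-impact finding first."""
    ordered = sorted(findings, key=lambda f: f.impact, reverse=True)
    return [
        f"{rank}. [{f.area}] {f.problem} -> fix: {f.fix}"
        for rank, f in enumerate(ordered, start=1)
    ]


if __name__ == "__main__":
    sample = [
        Finding("judge prompts", "no worked examples in prompt", 2, "add labeled examples"),
        Finding("labeled data", "no held-out split", 5, "reserve a test split"),
    ]
    for line in build_report(sample):
        print(line)
```

Sorting on a single numeric field keeps the ordering rule explicit; a real audit might rank on several signals at once.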

Prerequisites

Access to eval artifacts (traces, evaluator configs, judge prompts, labeled data) via an observability MCP server or local files. If none exist, skip to "No Eval Infrastructure."

Connecting to Eval Infrastructure

Check whether the user has an observability MCP server connected (Phoenix, Braintrust, LangSmith, Truesight, or similar). If one is available, use it to pull traces, evaluator definitions, and experiment results. If not, ask for local files: CSVs, JSON trace exports, notebooks, or evaluation scripts.
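For the local-file fallback, a minimal loader might look like the sketch below. It covers only CSV and JSON exports and assumes a JSON export is either a list of records or a single record object; the function name `load_records` is hypothetical, and notebooks and evaluation scripts would need separate handling.

```python
import csv
import json
from pathlib import Path


def load_records(path):
    """Load trace records from a CSV or JSON export into a list of dicts."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        # DictReader uses the header row as keys; all values stay strings.
        with path.open(newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))
    if suffix == ".json":
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # Assumption: a top-level list holds the records; wrap anything else.
        return data if isinstance(data, list) else [data]
    raise ValueError(f"unsupported file type: {suffix}")
```

Normalizing both formats to a list of dicts lets later checks run without caring where the data came from.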

Diagnostic Checks
