>
Eval Audit Inspect an LLM eval pipeline and produce a prioritized list of problems with concrete next steps. Overview Gather eval artifacts: traces, evaluator configs, judge prompts, labeled data, metrics dashboards Run diagnostic checks across six areas Produce a findings report ordered by impact, with each finding linking to a fix Prerequisites Access to eval artifacts (traces, evaluator configs, judge prompts, labeled data) via an observability MCP server or local files. If none exist, skip to "No Eval Infrastructure." Connecting to Eval Infrastructure Check whether the user has an observability MCP server connected (Phoenix, Braintrust, LangSmith, Truesight or similar). If available, use it to pull traces, evaluator definitions, and experiment results. If not, ask for local files: CSVs, JSON trace exports, notebooks, or evaluation scripts. Diagnostic Checks
don't have the plugin yet? install it then click "run inline in claude" again.