Handles LLM-as-judge evaluation workflows on Arize including creating/updating evaluators, running evaluations on spans or experiments, managing tasks,…
Arize Evaluator Skill

SPACE: All --space flags and the ARIZE_SPACE env var accept a space name (e.g., my-workspace) or a base64 space ID (e.g., U3BhY2U6...). Find yours with ax spaces list.

This skill covers designing, creating, and running LLM-as-judge evaluators on Arize. An evaluator defines the judge; a task is how you run it against real data.

Prerequisites

Proceed directly with the task: run the ax command you need. Do NOT check versions, env vars, or profiles upfront.

If an ax command fails, troubleshoot based on the error:

- command not found or version error → see references/ax-setup.md
- 401 Unauthorized / missing API key → run ax profiles show to inspect the current profile. If the profile is missing or the API key is wrong, follow references/ax-profiles.md to create/update it. If the user doesn't have their key, direct them to https://app.arize.com/admin > API Keys
- Space unknown → run ax spaces list to pick by name, or ask the user
- LLM provider call fails (missing OPENAI_API_KEY / ANTHROPIC_API_KEY) → run ax ai-integrations list --space SPACE to check for platform-managed credentials. If none exist, ask the user to provide the key or create an integration via the arize-ai-provider-integration skill

Security: Never read .env files or search the filesystem for credentials. Use ax profiles for Arize credentials and ax ai-integrations for LLM provider keys. If credentials are not available through these channels, ask the user.

CRITICAL: Never fabricate evaluation results. If an evaluation task fails, is cancelled, or produces no scores, report the failure clearly and explain what went wrong. Do NOT perform a "manual evaluation," invent quality scores, estimate percentages, or present any agent-generated analysis as if it came from the Arize evaluation system. Instead suggest: (1) fix the identified issue and retry, (2) try running from the Arize UI, (3) verify integration credentials with ax ai-integrations list, (4) contact support at https://arize.com/support
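
The troubleshooting rules above amount to a lookup from error text to next step. A minimal Python sketch of that lookup follows. The function name next_step and the substrings it matches are hypothetical: the actual wording of ax error output is not specified here, so treat the matching as an assumption and adapt it to the messages you really see. The fallback for unrecognized errors is also an assumption, chosen to match the rule that failures are reported rather than worked around.

```python
# Hypothetical helper: maps the error text of a failed ax command to the
# next step from the troubleshooting list. The matched substrings are
# assumptions, not documented ax output.
def next_step(error_text: str) -> str:
    text = error_text.lower()
    if "command not found" in text or "version" in text:
        return "see references/ax-setup.md"
    # Provider-key errors are checked before the generic API-key case.
    if "openai_api_key" in text or "anthropic_api_key" in text:
        return "run: ax ai-integrations list --space SPACE"
    if "401" in text or "unauthorized" in text or "api key" in text:
        return "run: ax profiles show"
    if "space" in text:
        return "run: ax spaces list"
    # Assumed fallback: never guess, report the failure as-is.
    return "report the failure and ask the user"


if __name__ == "__main__":
    print(next_step("401 Unauthorized"))
```

The helper only returns a suggestion string; it never reads credentials or files, in keeping with the security rule above.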