Local Model Quantization Router

Recommend local LLM model routes and quantization levels using hardware, privacy, task complexity, context size, and budget constraints. Use for Qwen/Ollama/...

installs

stars

karma

SkillRank score ↗

5.1/ 10

evaluated by implexa, claude-haiku-4-5 · 2026-06-09

local-model-quantization-router recommends quantization levels and model families for local llm deployment based on hardware, privacy, and task constraints. outputs routing decisions and fallback strategies without modifying config or downloading models.

structure

7.0

trigger phrases

4.0

procedure

6.0

edge cases

3.0

documentation

5.0

strengths

SKILL.md

---
name: local-model-quantization-router
description: Recommend local LLM model routes and quantization levels using hardware, privacy, task complexity, context size, and budget constraints. Use for Qwen/Ollama/local endpoint cost-performance decisions.
---

# Local Model Quantization Router

Use this skill to choose between local quantized models and cloud fallback before running OpenClaw workloads.

## Workflow

1. Describe available hardware and task requirements.
2. Run `scripts/local_model_quantization_router.py` with CLI flags or JSON input.
3. Review the recommended model family, quantization level, endpoint, fallback, and risk notes.
4. Use the output as routing evidence for local-first or privacy-first deployments.

## Parameters

- `--task TEXT`: Task summary.
- `--complexity {simple,standard,complex,critical}`.
- `--privacy {low,normal,high,regulated}`.
- `--vram-gb FLOAT`: GPU memory available.
- `--ram-gb FLOAT`: System memory available.
- `--context-tokens INT`: Required context window.
- `--hardware PATH`: Optional JSON with `vram_gb`, `ram_gb`, `cpu_only`.
- `--output PATH`: Optional JSON output path.

## Outputs

- `route`: `local-only`, `local-first`, `hybrid`, or `cloud-required`.
- `model`: Recommended model family.
- `quantization`: Suggested quant level.
- `endpoint`: Suggested local endpoint type.
- `fallback`: Safer fallback when quality or context is insufficient.
- `reasons`: Evidence for the decision.

No model is downloaded and no config file is changed.

don't have the plugin yet? install it then click "run inline in claude" again.

Local Model Quantization Router

SKILL.md

related skills