Intelligent model selection middleware for AI agents. Route tasks to the best model, manage context, and cut API costs 40-70%.
---
name: model-routing-middleware
version: 1.0.0
description: "Intelligent model selection middleware for AI agents. Route tasks to the best model, manage context, and cut API costs 40-70%."
license: MIT
tags: [model-routing, llm, cost-optimization, middleware, agent, context-management, escalation]
source: el-rudo-larios/model-routing
trigger: "model routing LLM selection cost optimization middleware agent"
metadata:
openclaw:
emoji: "π§"
---
# Model Routing β Intelligent Model Selection Middleware
Automatically select the best LLM model and think mode based on task type, context size, and response confidence. Cut API costs 40-70% by routing simple tasks to fast models and complex tasks to capable ones.
## How It Works
```
User message β Task Classifier β Model Router β Best Model β Response
β
Low confidence? β Escalate to stronger model
```
## Quick Start
```yaml
# config.yaml
models:
casual_chat:
model: qwen3-14b
think: false
coding:
model: qwen-coder
think: true
reasoning:
model: deepseek-r1
think: true
long_context:
model: glm-5.1
think: false
```
```python
from router import get_router
router = get_router()
result = await router.route("Write a Python web scraper")
# β Routes to qwen-coder with think=True
```
## Features
- Task-type classification (coding, reasoning, chat, summarization)
- Per-model think mode configuration
- Confidence-based escalation (retry with stronger model)
- Context management and compaction at 55% threshold
- Hot-reload configuration (no restart needed)
- 83 tests passing
## Cost Savings
| Task Type | Without Routing | With Routing | Savings |
|-----------|----------------|--------------|---------|
| Casual chat | GPT-4 ($0.03/1K) | Qwen3-14B (local) | ~100% |
| Coding | GPT-4 ($0.03/1K) | Qwen-Coder (local) | ~95% |
| Hard reasoning | GPT-4 ($0.03/1K) | DeepSeek-R1 (local) | ~90% |
## License
MIT
don't have the plugin yet? install it then click "run inline in claude" again.