Run a free 35B AI coding agent on Apple Silicon Macs using local LLMs via llama.cpp or MLX with web search, shell, and file tools.
mac-code: Free Local AI Agent on Apple Silicon

Skill by ara.so, Daily 2026 Skills collection.

Run a 35B reasoning model locally on your Mac for $0/month. mac-code is a CLI AI coding agent (a Claude Code alternative) that routes tasks (web search, shell commands, file edits, chat) through a local LLM. It supports llama.cpp (30 tok/s) and MLX (64K context, persistent KV cache) backends on Apple Silicon.

What It Does

- LLM-as-router: the model classifies every prompt as search, shell, or chat and routes it accordingly
- 35B MoE at 30 tok/s via llama.cpp + IQ2_M quantization (fits in 16 GB RAM)
- 35B full Q4 on 16 GB via custom MoE Expert Sniper (1.54 tok/s, only 1.42 GB RAM used)
- 9B at 64K context via quantized KV cache (q4_0 keys/values)
- MLX backend adds persistent KV cache save/load, context compression, R2 sync
- Tools: DuckDuckGo search, shell execution, file read/write
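The LLM-as-router idea can be illustrated with a minimal sketch. This is not mac-code's actual implementation: the names `ROUTER_PROMPT`, `parse_label`, `route`, and the stub handlers are hypothetical, and the model call is abstracted as a plain callable so that any local backend (llama.cpp or MLX) could be plugged in. The sketch assumes the model answers with a single label and falls back to chat when the answer is unrecognized.

```python
from typing import Callable, Dict

# Hypothetical route labels, taken from the feature list (search, shell, chat).
ROUTES = ("search", "shell", "chat")

# Hypothetical classification prompt; the real wording used by mac-code is unknown.
ROUTER_PROMPT = (
    "Classify the user request as exactly one word: search, shell, or chat.\n"
    "Request: {prompt}\n"
    "Label:"
)


def parse_label(raw: str) -> str:
    """Map raw model output to a known route, defaulting to 'chat'."""
    words = raw.strip().lower().split()
    first = words[0].strip(".,:;!\"'") if words else ""
    return first if first in ROUTES else "chat"


def route(
    prompt: str,
    llm: Callable[[str], str],
    handlers: Dict[str, Callable[[str], str]],
) -> str:
    """Ask the model for a label, then dispatch the prompt to that handler."""
    label = parse_label(llm(ROUTER_PROMPT.format(prompt=prompt)))
    return handlers[label](prompt)


def fake_llm(text: str) -> str:
    """Stand-in for a local model call; inspects only the request part."""
    request = text.split("Request: ", 1)[1].lower()
    if "weather" in request:
        return "search"
    if request.startswith("list files"):
        return "Shell."
    return "not sure"


# Stub handlers; a real agent would run a web search, a shell command, or a chat turn.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "search": lambda p: f"[search] {p}",
    "shell": lambda p: f"[shell] {p}",
    "chat": lambda p: f"[chat] {p}",
}

if __name__ == "__main__":
    for request in ("weather in Berlin", "list files here", "explain MoE"):
        print(route(request, fake_llm, HANDLERS))
```

Keeping the model behind a callable means the routing logic can be exercised without a running model, and defaulting to chat keeps a malformed label from triggering a shell command.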