Install and configure Ollama for local embeddings with GrepAI. Use this skill when setting up private, local embedding generation.
Ollama Setup for GrepAI
This skill covers installing and configuring Ollama as the local embedding provider for GrepAI. Ollama enables 100% private code search where your code never leaves your machine.
When to Use This Skill
Setting up GrepAI with local, private embeddings
Installing Ollama for the first time
Choosing and downloading embedding models
Troubleshooting Ollama connection issues
Why Ollama?
| Benefit | Description |
|---------|-------------|
| Privacy | Code never leaves your machine |
| Free | No API costs |
| Fast | Local processing, no network latency |
| Offline | Works without internet |
Installation
macOS (Homebrew)
# Install Ollama
brew install ollama
# Start the Ollama service
ollama serve
macOS (Direct Download)
Download from ollama.com
Open the .dmg and drag to Applications
Launch Ollama from Applications
Linux
# One-line installer
curl -fsSL https://ollama.com/install.sh | sh
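# Start the server manually only if the installer has not already started it
# (on systemd distros the install script sets up an ollama service)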
ollama serve
Windows
Download installer from ollama.com
Run the installer
Ollama starts automatically in the background
Downloading Embedding Models
GrepAI requires an embedding model to convert code into vectors.
Recommended Model: nomic-embed-text
# Download the recommended model (768 dimensions)
ollama pull nomic-embed-text
Specifications:
Dimensions: 768
Size: ~274 MB
Performance: Excellent for code search
Language: English-optimized
Alternative Models
# Multilingual support (better for non-English code/comments)
ollama pull nomic-embed-text-v2-moe
# Larger, more accurate
ollama pull bge-m3
# Maximum quality
ollama pull mxbai-embed-large
| Model | Dimensions | Size | Best For |
|-------|------------|------|----------|
| nomic-embed-text | 768 | 274 MB | General code search |
| nomic-embed-text-v2-moe | 768 | 500 MB | Multilingual codebases |
| bge-m3 | 1024 | 1.2 GB | Large codebases |
| mxbai-embed-large | 1024 | 670 MB | Maximum accuracy |
Verifying Installation
Check Ollama is Running
# Check if Ollama server is responding
curl http://localhost:11434/api/tags
# Expected output: JSON with available models
List Downloaded Models
ollama list
# Output:
# NAME                       ID           SIZE      MODIFIED
# nomic-embed-text:latest    abc123...    274 MB    2 hours ago
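The listing can also be checked from a script, for example to pull a model only when it is missing. A minimal sketch, assuming the column layout shown above (a header row, then the model name in the first column); has_model is a hypothetical helper, not part of Ollama or GrepAI:

```shell
# Succeed if the model named in $1 appears in `ollama list` output read from stdin.
# Hypothetical helper: assumes a header row and the model name in column 1.
has_model() {
  awk -v want="$1" '
    NR > 1 {
      name = $1
      sub(/:.*/, "", name)                 # drop the ":latest" style tag
      if (name == want || $1 == want) found = 1
    }
    END { exit !found }
  '
}

# Example usage (requires Ollama to be installed):
#   ollama list | has_model nomic-embed-text || ollama pull nomic-embed-text
```

Matching on the name with and without its tag lets both nomic-embed-text and nomic-embed-text:latest be passed as the argument.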
Test Embedding Generation
# Quick test (should return embedding vector)
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "function hello() { return world; }"
}'
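To confirm that the returned vector has the expected length (768 for nomic-embed-text), the numbers in the response can be counted. A minimal sketch, assuming the response is single-line JSON with an embedding array; count_dims is a hypothetical helper that uses only sed, tr, and grep:

```shell
# Count the numbers in the "embedding" array of a JSON response read from stdin.
# Hypothetical helper: assumes the whole response is on one line.
count_dims() {
  sed -n 's/.*"embedding"[[:space:]]*:[[:space:]]*\[\([^]]*\)].*/\1/p' |
    tr ',' '\n' |
    grep -c '[0-9]'
}

# Example usage (requires a running Ollama server):
#   curl -s http://localhost:11434/api/embeddings \
#     -d '{"model": "nomic-embed-text", "prompt": "hello"}' | count_dims
```

If jq is installed, jq '.embedding | length' gives the same count with real JSON parsing.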
Configuring GrepAI for Ollama
After installing Ollama, configure GrepAI to use it:
# .grepai/config.yaml
embedder:
  provider: ollama
  model: nomic-embed-text
  endpoint: http://localhost:11434
This is the default configuration when you run grepai init, so no changes are needed if using nomic-embed-text.
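When scripting around this setup, it can help to read the model and endpoint from the config file instead of hard-coding them. A minimal sketch, assuming the simple key: value layout shown above; config_value is a hypothetical helper that matches lines and is not a real YAML parser:

```shell
# Print the value of the first "key: value" line in a simple YAML file.
# $1 = key name, $2 = path to the file. Hypothetical helper, not a YAML parser.
config_value() {
  awk -v key="$1:" '$1 == key { print $2; exit }' "$2"
}

# Example usage (requires a running Ollama server):
#   endpoint=$(config_value endpoint .grepai/config.yaml)
#   curl -s "$endpoint/api/tags"
```

This returns the first match anywhere in the file, so it only suits configs where each key name appears once.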
Running Ollama
Foreground (Development)
# Run in current terminal (see logs)
ollama serve
Background (macOS/Linux)
# Using nohup
nohup ollama serve &
# Or as a systemd service (Linux)
sudo systemctl enable ollama
sudo systemctl start ollama
Check Status
# Check if running
pgrep -f ollama
# Or test the API
curl -s http://localhost:11434/api/tags | head -1
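The server may take a moment to accept connections after starting, so a script may want to retry the check for a while. A minimal sketch of such a retry loop; wait_until is a hypothetical helper, and the attempt count and delay in the usage example are arbitrary:

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# $1 = maximum attempts, $2 = seconds between attempts, rest = command to run.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  while [ "$attempts" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0                      # command succeeded
    fi
    attempts=$((attempts - 1))
    if [ "$attempts" -gt 0 ]; then
      sleep "$delay"                # wait before the next try
    fi
  done
  return 1                          # never succeeded
}

# Example usage: wait up to about 10 seconds for the API to respond
#   wait_until 10 1 curl -sf http://localhost:11434/api/tags || echo "Ollama is not responding"
```

Passing the probe as a command keeps the loop independent of curl, so the same helper works with pgrep or any other check.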
Resource Considerations
Memory Usage
Embedding models load into RAM:
nomic-embed-text: ~500 MB RAM
bge-m3: ~1.5 GB RAM
mxbai-embed-large: ~1 GB RAM
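These estimates can be turned into a quick check before choosing a model. A minimal sketch that uses only the approximate figures listed above (with ~1.5 GB taken as 1500 MB); model_ram_mb and fits_in_ram are hypothetical helpers, and the free-memory value is supplied by the caller:

```shell
# Print the approximate RAM needed by a model in MB (figures from the list above).
model_ram_mb() {
  case "$1" in
    nomic-embed-text)  echo 500 ;;
    bge-m3)            echo 1500 ;;
    mxbai-embed-large) echo 1000 ;;
    *)                 return 1 ;;   # unknown model: no estimate available
  esac
}

# Succeed if the free memory in MB ($2) covers the estimate for the model ($1).
# Returns 2 when the model is not in the list.
fits_in_ram() {
  need=$(model_ram_mb "$1") || return 2
  [ "$2" -ge "$need" ]
}

# Example usage on Linux, reading available memory from /proc/meminfo:
#   free_mb=$(awk '/MemAvailable/ { print int($2 / 1024) }' /proc/meminfo)
#   fits_in_ram bge-m3 "$free_mb" || echo "Consider a smaller model"
```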
CPU vs GPU
Ollama runs on the CPU unless it detects a supported GPU. For faster embeddings:
macOS: Uses Metal (Apple Silicon) automatically
Linux/Windows: Install CUDA for NVIDIA GPU support
Common Issues
Problem: connection refused to localhost:11434
Solution: Start Ollama:
ollama serve
Problem: Model not found
Solution: Pull the model first:
ollama pull nomic-embed-text
Problem: Slow embedding generation
Solution:
Use a smaller model
Ensure Ollama is using GPU (check ollama ps)
Close other memory-intensive applications
Problem: Out of memory
Solution: Use a smaller model or increase system RAM
Best Practices
Start Ollama before GrepAI: Ensure ollama serve is running
Use the recommended model: nomic-embed-text offers the best balance of size and accuracy
Keep Ollama running: Leave it as a background service
Update periodically: Re-run ollama pull nomic-embed-text to fetch model updates
Output Format
After successful setup:
Ollama Setup Complete
Ollama Version: 0.1.x
Endpoint: http://localhost:11434
Model: nomic-embed-text (768 dimensions)
Status: Running
GrepAI is ready to use with local embeddings.
Your code will never leave your machine.