Detect configuration drift across environments (dev, staging, production). Compare config files, environment variables, feature flags, and secrets across dep...
---
name: config-drift-scanner
description: Detect configuration drift across environments (dev, staging, production). Compare config files, environment variables, feature flags, and secrets across deployments to find dangerous inconsistencies.
---
# Config Drift Scanner
Find configuration differences across your environments before they cause incidents. Compare config files, environment variables, feature flags, database settings, and runtime parameters between dev/staging/production โ catching the "works on my machine" problems that slip past code review.
Use when: "compare configs between environments", "find config drift", "why does staging work but prod doesn't", "audit environment parity", "check env vars across deploys", or after an incident caused by config mismatch.
## Commands
### 1. `scan` โ Detect Drift Across Environments
#### Step 1: Discover Configuration Sources
```bash
# Find config files in the project
find . -maxdepth 4 \( \
-name "*.env" -o -name "*.env.*" -o \
-name "*.config.*" -o -name "config.yaml" -o -name "config.json" -o \
-name "*.toml" -o -name "settings.*" -o \
-name "docker-compose*.yml" -o -name "docker-compose*.yaml" -o \
-name "values*.yaml" -o -name "*.tfvars" \
\) -not -path "*/node_modules/*" -not -path "*/.git/*" 2>/dev/null
# Find environment variable references in code
rg "process\.env\.|os\.environ|os\.getenv|ENV\[|System\.getenv" \
--type-not binary -g '!node_modules' -g '!vendor' --stats 2>&1 | tail -5
```
#### Step 2: Extract Config from Each Environment
For Kubernetes:
```bash
# ConfigMaps
for env in dev staging prod; do
kubectl get configmaps -n "$env" -o json | python3 -c "
import json, sys
cms = json.load(sys.stdin)['items']
for cm in cms:
name = cm['metadata']['name']
for k, v in cm.get('data', {}).items():
print(f'{name}.{k}={v}')
" > "/tmp/config-$env.txt"
done
```
For .env files:
```bash
# Compare .env files across environments
for f in .env.development .env.staging .env.production; do
if [[ -f "$f" ]]; then
echo "=== $f ==="
grep -v '^#' "$f" | grep -v '^$' | sort
fi
done
```
For Terraform:
```bash
# Compare tfvars across environments
for f in terraform/environments/*/terraform.tfvars; do
echo "=== $f ==="
cat "$f"
done
```
#### Step 3: Diff and Classify
Compare each key across environments and classify differences:
**๐ด Dangerous Drift (same key, unexpected difference):**
- Database connection strings pointing to wrong environment
- API keys that should be environment-specific but are shared
- Feature flags enabled in prod but disabled in staging (testing gap)
- Timeout values that differ without documented reason
- Memory/CPU limits that are lower in prod than staging
**๐ก Expected Drift (different by design):**
- Database URLs (each env has its own DB)
- API endpoints (staging.api.com vs api.com)
- Log levels (DEBUG in dev, INFO in prod)
- Replica counts (1 in dev, 3 in prod)
**๐ข Missing in Environment (potential problem):**
- Env var exists in prod but not staging โ can't test that code path
- Config key exists in staging but not prod โ forgot to add on deploy
#### Step 4: Generate Report
```markdown
# Configuration Drift Report
## Environments Compared: dev โ staging โ production
## ๐ด Dangerous Drift (3 found)
1. `DATABASE_POOL_SIZE`: dev=5, staging=10, **prod=5**
โ Prod should be โฅ staging. Likely copy-paste from dev config.
2. `FEATURE_NEW_CHECKOUT`: dev=true, staging=true, **prod=false**
โ Feature tested in staging but not enabled in prod. Intentional? If so, document.
3. `REDIS_TIMEOUT_MS`: dev=5000, **staging=500**, prod=5000
โ Staging has 10ร shorter timeout. Will mask timeout bugs in staging tests.
## ๐ก Expected Drift (8 found)
- DATABASE_URL: different per environment โ
- LOG_LEVEL: debug/info/warn โ
- API_BASE_URL: per-environment โ
## ๐ข Missing Variables (2 found)
- `SENTRY_DSN`: exists in prod, missing in staging โ errors not tracked in staging
- `RATE_LIMIT_RPS`: exists in prod (100), missing in dev/staging โ no rate limit testing
## Recommendations
1. Add SENTRY_DSN to staging for error visibility
2. Fix DATABASE_POOL_SIZE in prod (should be โฅ staging value)
3. Document why FEATURE_NEW_CHECKOUT differs between staging and prod
```
### 2. `watch` โ Continuous Drift Monitoring
Set up a CI check that runs on every config change:
```yaml
# GitHub Actions
name: Config Drift Check
on:
push:
paths:
- '**/*.env*'
- '**/config.*'
- '**/values*.yaml'
- '**/*.tfvars'
jobs:
drift:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Check for drift
run: |
# Compare all env files and flag dangerous differences
diff <(grep -v '^#' .env.staging | sort) <(grep -v '^#' .env.production | sort) || true
```
### 3. `template` โ Generate Config Parity Checklist
For a new environment setup, generate a checklist of all config keys that need to be set, with:
- Required vs optional
- Expected value ranges
- Whether the value should differ or match other environments
- Where to get the value (secrets manager, team lead, auto-generated)
### 4. `secrets-check` โ Verify Secret Rotation
Cross-reference config with secrets management:
- Which secrets are hardcoded vs pulled from vault/secrets manager?
- When were secrets last rotated?
- Are any secrets shared across environments (bad practice)?
- Are test/dev credentials accidentally in prod config?
don't have the plugin yet? install it then click "run inline in claude" again.