back
loading skill details...
>
Infrastructure Monitoring
Table of Contents
Overview
When to Use
Quick Start
Reference Guides
Best Practices
Overview
Implement comprehensive infrastructure monitoring to track system health, performance metrics, and resource utilization with alerting and visualization across your entire stack.
When to Use
Real-time performance monitoring
Capacity planning and trends
Incident detection and alerting
Service health tracking
Resource utilization analysis
Performance troubleshooting
Compliance and audit trails
Historical data analysis
Quick Start
Minimal working example:
# prometheus.yml
global:
scrape_interval: 15s
evaluation_interval: 15s
external_labels:
monitor: "infrastructure-monitor"
environment: "production"
# Alertmanager configuration
alerting:
alertmanagers:
- static_configs:
- targets:
- localhost:9093
# Rule files
rule_files:
- "alerts.yml"
- "rules.yml"
scrape_configs:
# Prometheus itself
- job_name: "prometheus"
static_configs:
- targets: ["localhost:9090"]
// ... (see reference guides for full implementation)
Reference Guides
Detailed implementations in the references/ directory:
Guide
Contents
Prometheus Configuration
Prometheus Configuration
Alert Rules
Alert Rules
Alertmanager Configuration
Alertmanager Configuration
Grafana Dashboard
Grafana Dashboard
Monitoring Deployment
Monitoring Deployment
Best Practices
✅ DO
Follow established patterns and conventions
Write clean, maintainable code
Add appropriate documentation
Test thoroughly before deploying
❌ DON'T
Skip testing or validation
Ignore error handling
Hard-code configuration values
1d:[don't have the plugin yet? install it then click "run inline in claude" again.