Prometheus Monitoring
Table of Contents
Overview
When to Use
Quick Start
Reference Guides
Best Practices
Overview
Set up Prometheus to collect, store, and query time-series metrics from applications and infrastructure.
When to Use
Setting up metrics collection
Creating custom application metrics
Configuring scraping targets
Implementing service discovery
Building monitoring infrastructure
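For the service discovery case, one low-dependency option is file-based discovery, where Prometheus re-reads target lists from files on disk. The fragment below is only a sketch: the job name `file-discovered` and the path `/etc/prometheus/targets/*.json` are hypothetical, and the block is meant to sit under `scrape_configs` in `prometheus.yml`.

```yaml
# Fragment for the scrape_configs section of prometheus.yml.
# Job name and file path are illustrative assumptions.
scrape_configs:
  - job_name: "file-discovered"
    file_sd_configs:
      - files:
          - "/etc/prometheus/targets/*.json"
        # Periodic re-read of the files, in addition to change detection.
        refresh_interval: 5m
```

Each matched file is expected to hold a list of target groups, each with a `targets` array and an optional `labels` map, so targets can be added or removed without restarting Prometheus.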
Quick Start
Minimal working example:
# /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    cluster: production
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]
rule_files:
  - "/etc/prometheus/alert_rules.yml"
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]
  - job_name: "api-service"
    # ... (see reference guides for full implementation)
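The configuration above points `rule_files` at `/etc/prometheus/alert_rules.yml`, but the contents of that file are not shown here. As a minimal sketch of what such a file might contain, the example below defines one alert on the built-in `up` metric. The group name, severity label, and runbook URL are assumptions, not taken from the reference guides.

```yaml
# /etc/prometheus/alert_rules.yml (hypothetical contents)
groups:
  - name: availability
    rules:
      - alert: InstanceDown
        # Prometheus sets `up` to 0 for a target whose scrape failed.
        expr: up == 0
        # Fire only after the condition has held for 5 minutes.
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} of job {{ $labels.job }} is down"
          # Placeholder URL; point this at your own runbook.
          runbook_url: "https://example.com/runbooks/instance-down"
```

Running `promtool check rules /etc/prometheus/alert_rules.yml` before a reload should catch syntax errors in a file like this.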
Reference Guides
Detailed implementations in the references/ directory:
Prometheus Configuration
Node.js Metrics Implementation
Python Prometheus Integration
Alert Rules
Docker Compose Setup
Best Practices
✅ DO
Use consistent metric naming conventions (snake_case names, base units such as seconds and bytes, a _total suffix on counters)
Add comprehensive labels for filtering
Set appropriate scrape intervals (10-60s)
Implement retention policies
Monitor Prometheus itself
Test alert rules before deployment
Document metric meanings
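Retention is set with command-line flags rather than in `prometheus.yml`, so it is easy to overlook. The Docker Compose sketch below shows one way to pass a retention flag. The image tag, the host-side file locations, and the 30-day value are assumptions, and target addresses such as `localhost:9100` would likely need adjusting once Prometheus runs in a container.

```yaml
# docker-compose.yml (sketch; image tag, paths, and retention value are assumptions)
services:
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      # Assumes both config files sit next to this compose file.
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
      - ./alert_rules.yml:/etc/prometheus/alert_rules.yml:ro
      # Named volume so the TSDB survives container restarts.
      - prometheus-data:/prometheus
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      # Delete samples older than 30 days.
      - "--storage.tsdb.retention.time=30d"

volumes:
  prometheus-data:
```

Overriding `command` replaces the image's default arguments, which is why `--config.file` and `--storage.tsdb.path` are restated alongside the retention flag.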
❌ DON'T
Add labels with unbounded cardinality (user IDs, request IDs, raw URLs)
Scrape too frequently (< 10s)
Ignore metric naming conventions
Create alerts without runbooks
Store raw event data in Prometheus
Use counters for values that can decrease (use a gauge instead)
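When a high-cardinality label or metric is already being exported and the instrumentation cannot be fixed right away, `metric_relabel_configs` can filter it at scrape time. The fragment below is a sketch under stated assumptions: the job name `example-service`, the target `localhost:8080`, the metric `http_request_debug_info`, and the label `request_id` are all hypothetical.

```yaml
# Fragment for the scrape_configs section of prometheus.yml.
# Job, target, metric, and label names are illustrative assumptions.
scrape_configs:
  - job_name: "example-service"
    static_configs:
      - targets: ["localhost:8080"]
    metric_relabel_configs:
      # Drop an entire metric by name before it is stored.
      - source_labels: [__name__]
        regex: "http_request_debug_info"
        action: drop
      # Remove one label from every scraped sample. Only safe if the
      # remaining labels still identify each series uniquely.
      - regex: "request_id"
        action: labeldrop
```

Relabeling of this kind is a stopgap: the samples are still generated and transferred on every scrape, so removing the label at the source remains the better long-term fix.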