Monitoring
Stack: Grafana + Prometheus + Node Exporter + cAdvisor + Uptime Kuma.
Stack Components
| Component | Role | Port |
|---|---|---|
| Grafana | Dashboard UI | 3000 |
| Prometheus | Metrics collection & storage | 9090 |
| Node Exporter | Host-level metrics | 9100 |
| cAdvisor | Container-level metrics | 8080 |
| Uptime Kuma | Uptime / availability monitoring | 3001 |
Prometheus Configuration
global:
  scrape_interval: 15s

scrape_configs:
  # Host-level metrics
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']

  # Container-level metrics
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']

  # Prometheus scraping itself
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
Grafana Dashboards
| Dashboard | Grafana.com ID | Purpose |
|---|---|---|
| Node Exporter Full | 1860 | Host metrics |
| cAdvisor | 14282 | Container resource usage |
| Docker & System | 893 | Docker host overview |
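The dashboards above read from a Prometheus data source, so Grafana needs one configured before they are imported. A minimal sketch of a Grafana data source provisioning file follows. It assumes Grafana reaches Prometheus at http://prometheus:9090 (the service name is an assumption, chosen to match the naming style of the scrape targets) and that the file is mounted under /etc/grafana/provisioning/datasources/. The data source name "Prometheus" is illustrative.

```yaml
# Hypothetical file: /etc/grafana/provisioning/datasources/prometheus.yml
apiVersion: 1

datasources:
  - name: Prometheus              # illustrative name, selected when importing a dashboard
    type: prometheus
    access: proxy                 # Grafana's backend proxies the queries
    url: http://prometheus:9090   # assumed service name and port
    isDefault: true
```

With a data source in place, each dashboard can be imported in the Grafana UI by entering its Grafana.com ID.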
Key Alert Rules
| Alert | Condition | Severity |
|---|---|---|
| High CPU | > 85% for 5 min | Warning |
| High RAM | > 90% for 5 min | Warning |
| Disk usage | > 80% on any mount | Warning |
| Container down | 0 running on expected service | Critical |
| Host unreachable | Node Exporter scrape fails | Critical |
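The table does not say whether these alerts are evaluated in Prometheus or in Grafana. As one possible reading, the conditions could be written as a Prometheus rule file along the following lines. This is a sketch, not the deployed rules: the file name alerts.yml, the group name, the alert names, the filesystem filter, the 1 minute hold on the host alert, and the container name my-service are all assumptions. The metrics come from Node Exporter and cAdvisor, and the job label matches the scrape config above.

```yaml
# Hypothetical file: alerts.yml (would need a rule_files entry in prometheus.yml)
groups:
  - name: host-and-container-alerts   # illustrative group name
    rules:
      - alert: HighCPU
        # Non-idle CPU share per host, averaged across cores
        expr: '100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85'
        for: 5m
        labels:
          severity: warning

      - alert: HighRAM
        expr: '(1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90'
        for: 5m
        labels:
          severity: warning

      - alert: DiskUsage
        # Assumption: pseudo filesystems are excluded so only real mounts alert
        expr: '(1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) * 100 > 80'
        labels:
          severity: warning

      - alert: ContainerDown
        # "my-service" is a placeholder for the expected container name
        expr: 'absent(container_last_seen{name="my-service"})'
        for: 1m
        labels:
          severity: critical

      - alert: HostUnreachable
        # up == 0 means the last Node Exporter scrape failed
        expr: 'up{job="node-exporter"} == 0'
        for: 1m                        # assumed hold time, not stated in the table
        labels:
          severity: critical
```

Prometheus only evaluates these rules. Delivering notifications would need Alertmanager or Grafana alerting, neither of which is described in this document.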
Retention
Prometheus: 30 days (set via --storage.tsdb.retention.time=30d).
To check how much disk space the Prometheus data directory currently uses:
docker exec prometheus du -sh /prometheus
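If the stack runs under Docker Compose, the retention flag would typically be passed through the service's command. The fragment below is a sketch under that assumption. The image tag, volume name, and bind-mount path are illustrative, and container_name is set only so that the docker exec command above resolves.

```yaml
# Hypothetical docker-compose.yml fragment
services:
  prometheus:
    image: prom/prometheus
    container_name: prometheus        # matches the name used with docker exec
    command:
      # Overriding command replaces the image's default flags,
      # so the config and storage paths are restated here.
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=30d'
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro   # assumed config location
      - prometheus_data:/prometheus                          # persists the TSDB
    ports:
      - '9090:9090'

volumes:
  prometheus_data:
```

Retention is time-based here, so disk usage still grows with the number of scraped series. The du check above is a quick way to see whether 30 days fits the available disk.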