Monitoring

Stack: Grafana + Prometheus + Node Exporter + cAdvisor + Uptime Kuma.


Stack Components

Component     | Role                             | Port
Grafana       | Dashboard UI                     | 3000
Prometheus    | Metrics collection & storage     | 9090
Node Exporter | Host-level metrics               | 9100
cAdvisor      | Container-level metrics          | 8080
Uptime Kuma   | Uptime / availability monitoring | 3001
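
As a rough illustration of how these five components could be wired together, here is a minimal Docker Compose sketch. It is not the actual deployment file: the image names are the commonly published ones, and the volume names, host mounts, and the choice to publish every port on the host are assumptions. The service names node-exporter and cadvisor are chosen to match the scrape targets used in the Prometheus configuration, and the container name prometheus matches the docker exec command under Retention.

```yaml
# docker-compose.yml: illustrative sketch only, under the assumptions stated above
services:
  grafana:
    image: grafana/grafana
    ports:
      - "3000:3000"
    volumes:
      - grafana_data:/var/lib/grafana          # assumed named volume

  prometheus:
    image: prom/prometheus
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro   # assumed config location on the host
      - prometheus_data:/prometheus

  node-exporter:
    image: prom/node-exporter
    pid: host
    command:
      - '--path.rootfs=/host'                  # read host metrics through the mount below
    volumes:
      - /:/host:ro,rslave
    ports:
      - "9100:9100"

  cadvisor:
    image: gcr.io/cadvisor/cadvisor            # pin a version tag in practice
    ports:
      - "8080:8080"
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro

  uptime-kuma:
    image: louislam/uptime-kuma:1
    ports:
      - "3001:3001"
    volumes:
      - uptime_kuma_data:/app/data             # assumed named volume

volumes:
  grafana_data:
  prometheus_data:
  uptime_kuma_data:
```

Because all services share the default Compose network, Prometheus can reach the exporters by service name, so publishing ports 9100 and 8080 on the host is optional.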

Prometheus Configuration

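global:
  scrape_interval: 15s   # default scrape frequency for every job below

scrape_configs:
  # Host-level metrics
  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']
  # Container-level metrics
  - job_name: 'cadvisor'
    static_configs:
      - targets: ['cadvisor:8080']
  # Prometheus scraping its own metrics endpoint
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']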

Grafana Dashboards

Dashboard          | Grafana.com ID | Purpose
Node Exporter Full | 1860           | Host metrics
cAdvisor           | 14282          | Container resource usage
Docker & System    | 893            | Docker host overview

Key Alert Rules

Alert            | Condition                     | Severity
High CPU         | > 85% for 5 min               | Warning
High RAM         | > 90% for 5 min               | Warning
Disk usage       | > 80% on any mount            | Warning
Container down   | 0 running on expected service | Critical
Host unreachable | Node Exporter scrape fails    | Critical
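
The rules above could be expressed as a Prometheus rule file roughly as follows. This is a sketch, not the rules actually deployed: the file name, group name, alert names, the excluded filesystem types, the 1m hold time on the critical alerts, and the container name my-service are all assumptions. The expressions use standard Node Exporter and cAdvisor metric names, and the thresholds and 5-minute durations come from the table.

```yaml
# alert.rules.yml (hypothetical file name)
groups:
  - name: host-and-container-alerts          # illustrative group name
    rules:
      - alert: HighCPU
        # Non-idle CPU share per host, averaged across all cores
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85
        for: 5m
        labels:
          severity: warning

      - alert: HighRAM
        # Share of memory that is not available to applications
        expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) * 100 > 90
        for: 5m
        labels:
          severity: warning

      - alert: DiskUsageHigh
        # Assumption: tmpfs and overlay mounts are excluded as noise
        expr: (1 - node_filesystem_avail_bytes{fstype!~"tmpfs|overlay"} / node_filesystem_size_bytes{fstype!~"tmpfs|overlay"}) * 100 > 80
        labels:
          severity: warning

      - alert: ContainerDown
        # "my-service" is a placeholder; use one rule (or a regex) per expected container
        expr: absent(container_last_seen{name="my-service"})
        for: 1m                              # assumed hold time, not from the table
        labels:
          severity: critical

      - alert: HostUnreachable
        # `up` is 0 when the last scrape of the target failed
        expr: up{job="node-exporter"} == 0
        for: 1m                              # assumed hold time, not from the table
        labels:
          severity: critical
```

For Prometheus to evaluate these rules, the file would need to be listed under a rule_files: entry in prometheus.yml, and an Alertmanager (not part of the stack listed here) would be needed to route notifications.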

Retention

Prometheus: 30 days (set via --storage.tsdb.retention.time=30d).
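
If Prometheus runs under Docker Compose, the retention flag could be passed as in the fragment below. This is a sketch under that assumption; the file paths shown are the defaults of the official prom/prometheus image.

```yaml
services:
  prometheus:
    image: prom/prometheus
    command:
      # Setting `command` replaces the image's default arguments, so the
      # config file and storage path are restated next to the retention flag.
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=30d'
```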

To check how much disk space the stored metrics currently use:

docker exec prometheus du -sh /prometheus