Monitoring

v1.0.0

Description

---
name: Monitoring
description: "Set up observability for applications and infrastructure with metrics, logs, traces, and alerts."
---

Complexity Levels

| Level | Tools | Setup Time | Best For |
|-------|-------|------------|----------|
| Minimal | UptimeRobot, Healthchecks.io | 15 min | Side projects, MVPs |
| Standard | Uptime Kuma, Sentry, basic Grafana | 1-2 hours | Small teams, startups |
| Professional | Prometheus, Grafana, Loki, Alertmanager | 1-2 days | Production systems |
| Enterprise | Datadog, New Relic, or full OSS stack | Ongoing | Large-scale operations |

The Three Pillars

| Pillar | What It Answers | Tools |
|--------|-----------------|-------|
| Metrics | "How is the system performing?" | Prometheus, Grafana, Datadog |
| Logs | "What happened?" | Loki, ELK, CloudWatch |
| Traces | "Why is this request slow?" | Jaeger, Tempo, Sentry |

Quick Start by Use Case

"I just want to know if it's down" → UptimeRobot (free) or Uptime Kuma (self-hosted). See simple.md.

"I need to debug production errors" → Sentry with your framework SDK. 5-minute setup. See apm.md.

"I want real observability" → Prometheus + Grafana + Loki. See prometheus.md.

"I need to centralize logs" → Loki for simple setups, ELK for complex queries. See logs.md.
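
For the "is it down" case, the core of what a hosted uptime checker does can be sketched in a few lines of standard-library Python. This is a minimal illustration, not how UptimeRobot or Uptime Kuma are implemented: the `CheckResult` type, the `check` function, and the rule that any 2xx/3xx status counts as "up" are assumptions made for the example.

```python
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable


@dataclass
class CheckResult:
    url: str
    up: bool
    detail: str


def http_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code for a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses raise HTTPError; the status code is still useful.
        return exc.code


def check(url: str, fetch: Callable[[str], int] = http_status) -> CheckResult:
    """Report the target as up only for a 2xx/3xx status (an assumed rule)."""
    try:
        status = fetch(url)
    except OSError as exc:
        # URLError, timeouts, and DNS failures all derive from OSError.
        return CheckResult(url, False, f"unreachable: {exc}")
    return CheckResult(url, 200 <= status < 400, f"HTTP {status}")
```

Calling `check("https://example.com")` from a scheduler on a machine outside the monitored network gives an external view of availability. The `fetch` parameter exists only so the logic can be exercised without network access.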

What to Monitor

Applications (RED Method)

Rate — requests per second
Errors — error rate by endpoint
Duration — latency (p50, p95, p99)
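
As a rough sketch of how the three RED signals fall out of raw request records, the example below summarizes one time window. The `Request` record, the nearest-rank percentile method, and the choice to count only 5xx responses as errors are assumptions for illustration; a metrics system such as Prometheus derives these from counters and histograms instead.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    endpoint: str
    status: int         # HTTP status code
    duration_ms: float  # time taken to serve the request


def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def red_summary(requests: list[Request], window_seconds: float) -> dict:
    """Summarize Rate, Errors, and Duration for one time window."""
    if not requests:
        raise ValueError("no requests in window")
    totals: dict[str, int] = {}
    errors: dict[str, int] = {}
    for r in requests:
        totals[r.endpoint] = totals.get(r.endpoint, 0) + 1
        if r.status >= 500:  # assumption: only 5xx counts as an error
            errors[r.endpoint] = errors.get(r.endpoint, 0) + 1
    durations = [r.duration_ms for r in requests]
    return {
        "rate_per_s": len(requests) / window_seconds,
        "error_rate_by_endpoint": {ep: errors.get(ep, 0) / n for ep, n in totals.items()},
        "p50_ms": percentile(durations, 50),
        "p95_ms": percentile(durations, 95),
        "p99_ms": percentile(durations, 99),
    }
```

Reporting p95 and p99 alongside p50 matters because a healthy median can hide a slow tail that a minority of users still experience.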

Infrastructure (USE Method)

Utilization — CPU, memory, disk usage
Saturation — queue depth, load average
Errors — hardware/system errors
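
A minimal sketch of sampling two USE signals on a single host with the Python standard library follows. It covers disk utilization and load-based saturation only; hardware and system errors need OS-specific sources and are left out. The function names and the idea of normalizing load by CPU count are assumptions for the example, and `os.getloadavg()` is available on Unix-like systems only.

```python
import os
import shutil


def summarize(used_bytes: int, total_bytes: int, load_1m: float, cpus: int) -> dict:
    """Turn raw readings into a utilization percentage and a per-CPU load figure."""
    return {
        "disk_used_pct": round(100 * used_bytes / total_bytes, 1),
        # Load per CPU above 1.0 suggests work is queuing (saturation).
        "load_per_cpu": round(load_1m / cpus, 2),
    }


def use_snapshot(path: str = "/") -> dict:
    """Read current disk usage and 1-minute load average for the local host."""
    disk = shutil.disk_usage(path)
    load_1m, _, _ = os.getloadavg()  # Unix only
    return summarize(disk.used, disk.total, load_1m, os.cpu_count() or 1)
```

In practice an agent such as Prometheus node_exporter collects these and many more signals; the sketch only shows what the Utilization and Saturation numbers mean.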

Alerting Principles

| Do | Don't |
|----|-------|
| Alert on symptoms (user impact) | Alert on causes (CPU high) |
| Include runbook link | Require investigation to understand |
| Set appropriate severity | Make everything P1 |
| Require action | Alert on "interesting" metrics |

Alert fatigue kills monitoring. If alerts are ignored, you have no monitoring.

For alert configuration, severities, and on-call setup, see alerting.md.
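
Some of the principles above can be checked mechanically before an alert ships. The sketch below is one hypothetical way to lint alert definitions: the `Alert` fields, the `P1`/`P2`/`P3` scale, and the `lint_alerts` function are assumptions made for illustration and do not correspond to any particular alerting tool's schema.

```python
from dataclasses import dataclass

SEVERITIES = ("P1", "P2", "P3")  # hypothetical scale; the source only mentions P1


@dataclass
class Alert:
    name: str
    severity: str
    runbook_url: str = ""
    action: str = ""  # what the responder is expected to do


def lint_alerts(alerts: list[Alert]) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    for a in alerts:
        if a.severity not in SEVERITIES:
            problems.append(f"{a.name}: unknown severity {a.severity!r}")
        if not a.runbook_url:
            problems.append(f"{a.name}: missing runbook link")
        if not a.action:
            problems.append(f"{a.name}: no required action")
    # "Make everything P1" is only meaningful with more than one alert.
    if len(alerts) > 1 and all(a.severity == "P1" for a in alerts):
        problems.append("every alert is P1")
    return problems
```

Running a check like this in CI is one way to keep alerts without a runbook or a clear action from reaching the on-call rotation. Whether an alert targets a symptom or a cause still needs human review.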

Cost Comparison

| Solution | Monthly Cost (small) | Monthly Cost (medium) |
|----------|----------------------|-----------------------|
| UptimeRobot | Free | $7 |
| Uptime Kuma | $5 (VPS) | $5 (VPS) |
| Sentry | Free / $26 | $80 |
| Grafana Cloud | Free tier | $50+ |
| Datadog | $15/host | $23/host + features |
| Self-hosted stack | $10-20 (VPS) | $50-100 (VPS) |

Common Mistakes

Starting with Prometheus/Grafana when Uptime Kuma would suffice
No alerting (dashboards nobody watches)
Too many alerts (alert fatigue → ignored)
Missing runbooks (alert fires, nobody knows what to do)
Not monitoring from outside (only internal checks)
Storing logs forever (cost explodes)
