Skip to content

Monitoring and observability

Monitor SimpleDeploy in three layers. Each catches a different class of failure.

The server emits structured logs and exposes its own health and metrics endpoints.

SimpleDeploy writes everything to stdout/stderr in JSON. When installed as a systemd service it lands in journald.

Terminal window
journalctl -u simpledeploy -f # tail live
journalctl -u simpledeploy --since "1 hour ago"
journalctl -u simpledeploy -p err # errors only

For long-term retention, set SystemMaxUse=2G in /etc/systemd/journald.conf or ship to Loki/CloudWatch via journalbeat/vector.

Every config change, deploy outcome, auth event, and system action is recorded in the persistent activity log. View it at System → Audit Log (global) or on the per-app Activity tab.

GET /api/activity?limit=100
GET /api/apps/{slug}/activity?limit=50

Default retention is 365 days. See Activity & Audit Log for retention configuration and export options.

GET /api/health

Returns 200 OK if the process is alive and the database is reachable. Use this as the target for external uptime checks.

Layer 2: Caddy access (per-app request stats)

Section titled “Layer 2: Caddy access (per-app request stats)”

The embedded Caddy instance counts every HTTP request and stores it in SQLite via the simpledeploy_metrics handler module. Available via:

GET /api/apps/{slug}/request-stats?range=24h

Returned data: requests/sec, p50/p95/p99 latency, status code breakdown, top paths. The dashboard renders this on the per-app page.

SimpleDeploy collects container CPU, memory, network, and disk I/O for every container every 10 seconds. Per-app aggregates are queryable via:

GET /api/apps/{slug}/metrics?range=24h
GET /api/metrics/system?range=24h

For app-specific business metrics (queue depth, request count, etc.) expose a /metrics endpoint inside your container and scrape with Prometheus or similar. SimpleDeploy does not scrape app-internal endpoints.

Point UptimeRobot, Better Uptime, Pingdom, or your own cron at:

https://manage.example.com/api/health

Frequency: 1 minute. Notify a different channel than your normal alerts (e.g., SMS, not Slack).

If you run a Prometheus/VictoriaMetrics/InfluxDB stack, scrape the system metrics endpoint on a cron and convert to your line protocol:

Terminal window
# Example: scrape every minute and forward to Influx
curl -s -H "Authorization: Bearer $SD_API_KEY" \
https://manage.example.com/api/metrics/system?range=1m \
| jq -r '...' \
| curl -XPOST "$INFLUX_URL/write" --data-binary @-

This gets you long-term retention beyond the built-in tiered rollup.

SimpleDeploy’s alert evaluator (internal/alerts/) runs rules against collected metrics and dispatches webhook notifications. Wire webhooks to PagerDuty, Slack, Discord, or any HTTP endpoint.

Recommended rule set:

RuleThresholdWindow
High CPU per app>80%5 min
High memory per app>85% of limit5 min
Low host disk<15% freeany
App downno metrics5 min
Backup failedlast run failedimmediate

See Alert webhooks and Alert rules.

The /api/health endpoint is safe to expose publicly. Everything else requires auth.