Monitoring and observability
Monitor SimpleDeploy in three layers. Each catches a different class of failure.
Layer 1: SimpleDeploy itself
The server emits structured logs and exposes its own health and metrics endpoints.
Process logs
SimpleDeploy writes everything to stdout/stderr in JSON. When installed as a systemd service, the output lands in journald.

    journalctl -u simpledeploy -f                      # tail live
    journalctl -u simpledeploy --since "1 hour ago"
    journalctl -u simpledeploy -p err                  # errors only

For long-term retention, set SystemMaxUse=2G in /etc/systemd/journald.conf or ship to Loki/CloudWatch via journalbeat/vector.
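Because the process logs are JSON, they can be filtered by field rather than by journald priority alone. The following is a minimal sketch, not part of SimpleDeploy: the field name `level` and the level values `error` and `fatal` are assumptions, so check a real log line and adjust them.

```python
import json
from typing import Iterable, Iterator

# Assumed field name and level values; verify against a real log line.
LEVEL_KEY = "level"
ERROR_LEVELS = {"error", "fatal"}


def error_entries(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed log entries whose level is in ERROR_LEVELS."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        if str(entry.get(LEVEL_KEY, "")).lower() in ERROR_LEVELS:
            yield entry
```

To use it, pipe `journalctl -u simpledeploy -o cat --since "1 hour ago"` into a small script that iterates `error_entries(sys.stdin)`; `-o cat` prints only the message text, which is the JSON line itself.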
Activity & audit log
Every config change, deploy outcome, auth event, and system action is recorded in the persistent activity log. View it at System → Audit Log (global) or on the per-app Activity tab.

    GET /api/activity?limit=100
    GET /api/apps/{slug}/activity?limit=50

Default retention is 365 days. See Activity & Audit Log for retention configuration and export options.
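The two activity endpoints differ only in their path, so a client can build both from one helper. This is a hedged sketch: the function name `activity_request` is hypothetical, and it assumes these endpoints accept the same `Authorization: Bearer` API key used for the metrics endpoint elsewhere on this page.

```python
import urllib.parse
import urllib.request
from typing import Optional


def activity_request(base_url: str, api_key: str,
                     slug: Optional[str] = None,
                     limit: int = 100) -> urllib.request.Request:
    """Build a GET request for the global or per-app activity log."""
    if slug is None:
        path = "/api/activity"
    else:
        # Percent-encode the slug so it cannot alter the path.
        path = f"/api/apps/{urllib.parse.quote(slug, safe='')}/activity"
    query = urllib.parse.urlencode({"limit": limit})
    url = f"{base_url.rstrip('/')}{path}?{query}"
    # Assumption: bearer-token auth applies to this endpoint.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})
```

Pass the returned request to `urllib.request.urlopen` to fetch the entries. The response schema is not described here, so parsing is left out.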
Health endpoint
    GET /api/health

Returns 200 OK if the process is alive and the database is reachable. Use this as the target for external uptime checks.
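A probe for this endpoint only needs to distinguish "200 within a deadline" from everything else. The sketch below assumes nothing about the response body. The `opener` parameter is a hypothetical seam added so the function can be exercised without a network.

```python
import http.client
import urllib.request
from typing import Callable


def is_healthy(url: str, timeout: float = 5.0,
               opener: Callable = urllib.request.urlopen) -> bool:
    """Return True only if the endpoint answers HTTP 200 within the timeout."""
    try:
        with opener(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError (non-2xx responses), refused connections,
        # and socket timeouts are all OSError subclasses.
        return False
```

Calling `is_healthy("https://manage.example.com/api/health")` from a cron job and alerting on `False` is enough for a basic self-hosted check.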
Layer 2: Caddy access (per-app request stats)
The embedded Caddy instance records every HTTP request in SQLite via the simpledeploy_metrics handler module. The data is available via:

    GET /api/apps/{slug}/request-stats?range=24h

The response includes requests/sec, p50/p95/p99 latency, a status code breakdown, and top paths. The dashboard renders this on the per-app page.
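To make the latency figures concrete: pN is the latency that N% of requests stayed at or below. The sketch below uses the nearest-rank definition on a list of samples. It illustrates the concept only and is not necessarily how SimpleDeploy computes its percentiles.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of values <= it."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Ten request latencies in milliseconds, one of them slow.
latencies_ms = [12, 15, 11, 14, 13, 250, 16, 12, 13, 14]
p50 = percentile(latencies_ms, 50)  # 13
p95 = percentile(latencies_ms, 95)  # 250
```

A single slow request leaves p50 at 13 ms but pushes p95 to 250 ms, which is why the tail percentiles are the ones worth alerting on.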
Layer 3: App-internal metrics
SimpleDeploy collects container CPU, memory, network, and disk I/O for every container every 10 seconds. Per-app aggregates are queryable via:

    GET /api/apps/{slug}/metrics?range=24h
    GET /api/metrics/system?range=24h

For app-specific business metrics (queue depth, request count, etc.), expose a /metrics endpoint inside your container and scrape it with Prometheus or similar. SimpleDeploy does not scrape app-internal endpoints.
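As an illustration of the /metrics endpoint mentioned above, here is a minimal standard-library sketch that serves gauges in the Prometheus text exposition format. The metric name `myapp_queue_depth` and port 9100 are placeholders. A real application would more likely use an official Prometheus client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Mapping

# Hypothetical business metric; update it from your application code.
METRICS = {"myapp_queue_depth": 0.0}


def render_metrics(metrics: Mapping[str, float]) -> str:
    """Render gauges in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {float(value)}")
    return "".join(line + "\n" for line in lines)


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 9100) -> None:
    """Block and serve /metrics; the port number is an arbitrary choice."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Run `serve()` in a background thread of the app and point your Prometheus scrape config at the container's port.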
External monitoring
Uptime check
Point UptimeRobot, Better Uptime, Pingdom, or your own cron at:
    https://manage.example.com/api/health

Frequency: 1 minute. Notify a different channel than your normal alerts (e.g., SMS, not Slack).
Metrics scrape
If you run a Prometheus/VictoriaMetrics/InfluxDB stack, poll the system metrics endpoint from cron and convert the response to your backend's ingestion format:

    # Example: scrape every minute and forward to Influx
    # (the URL is quoted so the shell does not treat "?" as a glob character)
    curl -s -H "Authorization: Bearer $SD_API_KEY" \
      "https://manage.example.com/api/metrics/system?range=1m" \
      | jq -r '...' \
      | curl -XPOST "$INFLUX_URL/write" --data-binary @-

This gets you long-term retention beyond the built-in tiered rollup.
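The conversion step is where most of the work sits. As a hedged sketch of the target format only, the helper below builds one InfluxDB line-protocol record from values you have already extracted. The measurement, tag, and field names are hypothetical, and because the JSON schema of `/api/metrics/system` is not shown here, the mapping from the response is left to you.

```python
from typing import Mapping, Union

Number = Union[int, float]


def _escape(value: str) -> str:
    """Escape commas, equals signs, and spaces in tag keys/values and field keys."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def to_line(measurement: str, tags: Mapping[str, str],
            fields: Mapping[str, Number], ts_ns: int) -> str:
    """Build one line-protocol record: measurement,tags fields timestamp."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    # Measurement names only need commas and spaces escaped.
    name = measurement.replace(",", "\\,").replace(" ", "\\ ")
    tag_part = "".join(f",{_escape(k)}={_escape(v)}" for k, v in sorted(tags.items()))
    # Write every field as a float to avoid int/float type conflicts later.
    field_part = ",".join(f"{_escape(k)}={float(v)!r}" for k, v in sorted(fields.items()))
    return f"{name}{tag_part} {field_part} {ts_ns}"
```

For example, `to_line("sd_system", {"host": "manage"}, {"cpu_percent": 12.5}, 1700000000000000000)` yields `sd_system,host=manage cpu_percent=12.5 1700000000000000000`, which can be posted as the request body to the write endpoint.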
Alerts
SimpleDeploy’s alert evaluator (internal/alerts/) runs rules against collected metrics and dispatches webhook notifications. Wire webhooks to PagerDuty, Slack, Discord, or any HTTP endpoint.
Recommended rule set:
| Rule | Threshold | Window |
|---|---|---|
| High CPU per app | >80% | 5 min |
| High memory per app | >85% of limit | 5 min |
| Low host disk | <15% free | any |
| App down | no metrics | 5 min |
| Backup failed | last run failed | immediate |
See Alert webhooks and Alert rules.
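To show how a threshold and a window combine in rules like the ones in the table, here is an illustrative sketch. It is not the code in internal/alerts/. The names `Rule`, `fires`, and `no_data` are hypothetical, and it assumes a rule fires only when every sample in the window breaches the threshold.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, value)


@dataclass(frozen=True)
class Rule:
    name: str
    threshold: float
    window_s: float
    above: bool = True  # True: breach when value > threshold; False: when value < threshold


def fires(rule: Rule, samples: Sequence[Sample], now: float) -> bool:
    """Fire if every sample inside the window breaches the threshold.

    An empty window does not fire here; no_data() covers that case.
    """
    recent = [v for t, v in samples if now - rule.window_s <= t <= now]
    if not recent:
        return False
    if rule.above:
        return all(v > rule.threshold for v in recent)
    return all(v < rule.threshold for v in recent)


def no_data(samples: Sequence[Sample], now: float, window_s: float) -> bool:
    """The "App down" case: no sample at all within the window."""
    return not any(now - window_s <= t <= now for t, _ in samples)
```

Requiring every sample in the window to breach means a single healthy reading resets the rule, which suppresses alerts on short spikes at the cost of slower detection.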
Securing the monitoring surface
The /api/health endpoint is safe to expose publicly. Everything else requires auth.