Alerts engine
The alerts package (internal/alerts/) evaluates user-defined rules against current metrics and dispatches webhooks when conditions hold for the configured duration.
Rule shape
A rule is (metric, op, threshold, duration_seconds, scope):
- metric: cpu, memory, disk, request_rate, request_error_rate, deploy_failure, backup_failure
- op: >, <, >=, <=
- threshold: numeric value (percent, count, etc.)
- duration: condition must hold continuously for this many seconds
- scope: per-app (app_slug) or global (null)
Rules live in the alert_rules table. Webhooks live in webhooks. Many rules can target the same webhook.
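As a rough illustration of the rule shape, the tuple could map onto a Go struct like the one below. This is a sketch only: the type, field, and method names (Rule, Op, Holds, IsGlobal) are assumptions made for the example, not the actual definitions in internal/alerts/.

```go
package main

// Op is a comparison operator taken from a rule definition.
type Op string

const (
	OpGT Op = ">"
	OpLT Op = "<"
	OpGE Op = ">="
	OpLE Op = "<="
)

// Rule mirrors the (metric, op, threshold, duration_seconds, scope) tuple.
// All names here are hypothetical and chosen for illustration.
type Rule struct {
	Metric          string  // e.g. "cpu", "disk", "backup_failure"
	Op              Op      // one of >, <, >=, <=
	Threshold       float64 // percent, count, etc.
	DurationSeconds int     // how long the condition must hold
	AppSlug         *string // nil means global scope
}

// Holds reports whether the observed value satisfies the rule's comparison.
// An unrecognized operator never holds.
func (r Rule) Holds(observed float64) bool {
	switch r.Op {
	case OpGT:
		return observed > r.Threshold
	case OpLT:
		return observed < r.Threshold
	case OpGE:
		return observed >= r.Threshold
	case OpLE:
		return observed <= r.Threshold
	default:
		return false
	}
}

// IsGlobal reports whether the rule applies to every app.
func (r Rule) IsGlobal() bool {
	return r.AppSlug == nil
}
```

Modeling the scope as a pointer keeps the "global (null)" case distinct from an empty slug, which matches how a nullable column would typically be read from a table.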
Evaluator loop
The evaluator runs on a fixed tick (default every 30 seconds). On each tick it:
- Loads active rules and snapshots current metrics from the metrics writer.
- Computes whether the condition holds. If yes, increments an in-memory dwell counter; if no, resets it.
- When dwell crosses the rule’s duration_seconds, the alert fires once. Subsequent ticks while still firing do not re-fire.
- When the condition clears, the rule re-arms.
Firings write to alert_history with a snapshot of the rule fields at firing time, so changes to the rule afterward do not rewrite history.
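The dwell, fire-once, and re-arm behavior described above can be sketched as a small per-rule state machine. This is an illustration under stated assumptions, not the evaluator's actual code: Tracker, Observe, and DefaultTick are hypothetical names, and the sketch assumes each tick on which the condition holds adds one full tick interval to the dwell counter.

```go
package main

import "time"

// DefaultTick matches the documented default evaluator interval.
const DefaultTick = 30 * time.Second

// dwellState tracks one rule between ticks (hypothetical type).
type dwellState struct {
	held   time.Duration // how long the condition has held continuously
	firing bool          // true once the alert fired and has not yet cleared
}

// Tracker keeps in-memory dwell state keyed by rule ID.
type Tracker struct {
	tick   time.Duration
	states map[int64]*dwellState
}

// NewTracker returns a Tracker that credits one tick interval per holding tick.
func NewTracker(tick time.Duration) *Tracker {
	return &Tracker{tick: tick, states: make(map[int64]*dwellState)}
}

// Observe records one tick's result for a rule and reports whether the
// alert should fire on this tick.
func (tr *Tracker) Observe(ruleID int64, holds bool, duration time.Duration) bool {
	s, ok := tr.states[ruleID]
	if !ok {
		s = &dwellState{}
		tr.states[ruleID] = s
	}
	if !holds {
		// Condition cleared: reset the counter and re-arm the rule.
		s.held = 0
		s.firing = false
		return false
	}
	s.held += tr.tick
	if s.firing || s.held < duration {
		// Either already fired for this episode or not held long enough yet.
		return false
	}
	s.firing = true
	return true
}
```

Under these assumptions, a rule with a 60-second duration fires on its second consecutive holding tick, stays silent while the condition persists, and can fire again only after a tick on which the condition does not hold.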
Webhook dispatch
The dispatcher posts JSON to the webhook URL. Payload includes rule id, metric, threshold, observed value, app slug (if any), timestamp, and a human-readable message. See the webhooks guide for the full schema.
Network safety: by default the dispatcher refuses URLs that resolve to private, link-local, or loopback addresses. Set SIMPLEDEPLOY_ALLOW_PRIVATE_WEBHOOKS=1 to opt out (only for trusted internal targets).
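One way such a check could look in Go is sketched below, using the standard library's net.IP classification methods (IsPrivate requires Go 1.17 or later). The function names are hypothetical, and the real dispatcher may perform the check differently, for example at dial time rather than before the request.

```go
package main

import (
	"fmt"
	"net"
	"os"
)

// allowPrivateWebhooks reads the documented opt-out environment variable.
func allowPrivateWebhooks() bool {
	return os.Getenv("SIMPLEDEPLOY_ALLOW_PRIVATE_WEBHOOKS") == "1"
}

// isBlockedIP reports whether ip is a loopback, link-local, or private address.
func isBlockedIP(ip net.IP) bool {
	return ip.IsLoopback() ||
		ip.IsPrivate() ||
		ip.IsLinkLocalUnicast() ||
		ip.IsLinkLocalMulticast()
}

// checkWebhookHost resolves host and rejects it if any resolved address is
// blocked, unless allowPrivate is set. Hypothetical helper for illustration.
func checkWebhookHost(host string, allowPrivate bool) error {
	if allowPrivate {
		return nil
	}
	ips, err := net.LookupIP(host)
	if err != nil {
		return fmt.Errorf("resolve %q: %w", host, err)
	}
	for _, ip := range ips {
		if isBlockedIP(ip) {
			return fmt.Errorf("webhook host %q resolves to non-public address %s", host, ip)
		}
	}
	return nil
}
```

A caller would pass allowPrivateWebhooks() as the second argument. Note that a resolve-then-connect check leaves a gap if DNS answers change between the check and the request; connecting to the address that was actually checked is one way to close it.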
Failed dispatches retry with exponential backoff up to a small bound; persistent failures are recorded but not retried indefinitely.
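A bounded exponential backoff of this kind might be sketched as follows. The base delay, the retry count, and the names backoffDelays, dispatchWithRetry, and ErrGaveUp are all assumptions for the example; the source does not state the dispatcher's actual bound or schedule.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ErrGaveUp marks a dispatch that failed on every attempt (hypothetical).
var ErrGaveUp = errors.New("webhook dispatch: retries exhausted")

// backoffDelays returns the wait before each retry: base, 2*base, 4*base, ...
func backoffDelays(base time.Duration, maxRetries int) []time.Duration {
	if maxRetries < 0 {
		maxRetries = 0
	}
	delays := make([]time.Duration, 0, maxRetries)
	d := base
	for i := 0; i < maxRetries; i++ {
		delays = append(delays, d)
		d *= 2
	}
	return delays
}

// dispatchWithRetry calls send once, then retries after each delay until
// send succeeds or the delays run out. It returns the number of attempts.
func dispatchWithRetry(send func() error, delays []time.Duration) (int, error) {
	attempts := 1
	err := send()
	for _, d := range delays {
		if err == nil {
			break
		}
		time.Sleep(d)
		attempts++
		err = send()
	}
	if err != nil {
		// The caller can record this failure instead of retrying forever.
		return attempts, fmt.Errorf("%w after %d attempts: %v", ErrGaveUp, attempts, err)
	}
	return attempts, nil
}
```

For instance, backoffDelays(time.Second, 3) yields waits of 1s, 2s, and 4s, so a persistently failing webhook is attempted four times in total before the failure is returned for recording.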
Backup failure alerts
Backup runs that fail surface through the same alert pipeline using the backup_failure metric, so a single webhook can receive both metric and operational alerts.