Escalation matrix - alerting more people
Goal: Set up an escalation matrix that, after an incident stays open for a certain time, notifies additional people or a different channel.
When to use escalation
Typical scenario: your service goes down at night and the primary on-call does not respond within 15 minutes. The system should then automatically alert the shift lead and, after another hour, the manager too. Escalation makes sure no outage goes unnoticed.
Step 1: Create an escalation policy
- Go to Dashboard → Escalation (or URL /dashboard/escalation).
- Enter a name (e.g. "Night on-call") and click + New policy.
Step 2: Add steps
A policy is a sequence of steps. Each step has:
- Delay (min) - after how many minutes from incident start the step fires. 0 = immediately.
- Channel - Email or Webhook (Slack, Discord, custom endpoint).
- Recipients - comma-separated e-mail addresses, or a webhook URL.
- Note - optional, what the recipient should do.
Example 3-tier policy:
- Step 1: 0 min, email primary on-call
- Step 2: 15 min, email shift lead
- Step 3: 60 min, Slack webhook to #ops-emergency
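To make the timing concrete, here is a minimal Python sketch of the 3-tier example. It assumes delays are counted from incident start, as described above. The `Step` class, the `fired_steps` helper, and the addresses and webhook URL are hypothetical placeholders, not part of the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    delay_min: int    # minutes after incident start; 0 = immediately
    channel: str      # "email" or "webhook"
    recipients: str   # comma-separated addresses, or a webhook URL
    note: str = ""    # optional instructions for the recipient


# Placeholder recipients for illustration only.
POLICY = [
    Step(0, "email", "oncall@example.com"),
    Step(15, "email", "shift-lead@example.com"),
    Step(60, "webhook", "https://hooks.example.com/ops-emergency"),
]


def fired_steps(policy, minutes_open):
    """Return the steps whose delay has elapsed, in firing order."""
    ordered = sorted(policy, key=lambda step: step.delay_min)
    return [step for step in ordered if step.delay_min <= minutes_open]


# An incident that has been open for 20 minutes has reached steps 1 and 2.
reached = [step.delay_min for step in fired_steps(POLICY, 20)]
```

With the incident open for 20 minutes, `reached` holds the delays 0 and 15; the 60-minute webhook step has not fired yet.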
Step 3: Assign the policy to monitors
- Globally - "Set as default" applies it to all your monitors that don't have their own policy.
- Per-monitor - when creating/editing a monitor, pick a specific policy.
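The precedence between the two options can be sketched in a few lines of Python. This is only an illustration of the rule described above: a monitor's own policy wins, otherwise the default applies. The function name, the "Marketing site" monitor, and the "Default policy" name are hypothetical.

```python
from typing import Optional


def effective_policy(monitor_policy: Optional[str],
                     default_policy: Optional[str]) -> Optional[str]:
    """A monitor's own policy wins; otherwise fall back to the default."""
    return monitor_policy if monitor_policy is not None else default_policy


# None means the monitor has no policy of its own.
monitors = {"API server": "Night on-call", "Marketing site": None}
resolved = {
    name: effective_policy(own, "Default policy")
    for name, own in monitors.items()
}
```

Here "API server" keeps "Night on-call", while "Marketing site" falls back to the default.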
Acknowledging an incident (ACK)
Every escalation notification contains an ACK link. Clicking it:
- Stops further escalation steps.
- Records who acknowledged the incident and when.
- Suppresses repeat escalation messages for this incident.
The ACK link does not require login - it is signed with a unique token. When the monitor recovers (UP), escalation also stops automatically.
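The stop conditions above can be modelled as a small state check. The following Python sketch is an assumption about how such logic might look, not the service's actual implementation. The `Incident` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    acked_by: Optional[str] = None      # who acknowledged, if anyone
    acked_at_min: Optional[int] = None  # minutes after incident start
    recovered: bool = False             # True once the monitor is UP again

    def acknowledge(self, who: str, minute: int) -> None:
        """Record the first acknowledgement; later clicks change nothing."""
        if self.acked_by is None:
            self.acked_by = who
            self.acked_at_min = minute

    def should_escalate(self) -> bool:
        """Further steps fire only while unacknowledged and still down."""
        return self.acked_by is None and not self.recovered


incident = Incident()
before = incident.should_escalate()
incident.acknowledge("shift lead", 20)
after = incident.should_escalate()
```

Before the acknowledgement `before` is `True`; afterwards `after` is `False`, so the 60-minute step of the example policy would not fire.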
Webhook payload
For Slack / Discord / custom endpoints we send a JSON POST:
{
  "incident_id": 47,
  "monitor": "API server",
  "started_at": "2026-05-28T14:32:00Z",
  "step_order": 2,
  "ack_url": "https://epulz.io/incident/47/ack?token=..."
}
For Slack, use the incoming webhook URL as-is. For Discord, the ?wait=true query parameter is not required.
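If you point the webhook at a custom endpoint, the handler needs to parse this payload. A minimal Python sketch, assuming exactly the five fields shown above, could look like this. The function names `parse_escalation` and `summarize` are hypothetical, and the token in the sample is left elided as in the example.

```python
import json
from datetime import datetime

REQUIRED_FIELDS = ("incident_id", "monitor", "started_at", "step_order", "ack_url")


def parse_escalation(body: str) -> dict:
    """Parse the JSON body and check that the documented fields are present."""
    data = json.loads(body)
    missing = [key for key in REQUIRED_FIELDS if key not in data]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    data["started_at"] = datetime.fromisoformat(
        data["started_at"].replace("Z", "+00:00")
    )
    return data


def summarize(data: dict) -> str:
    """Build a one-line message for a chat channel or log."""
    return (
        f"[step {data['step_order']}] {data['monitor']} down since "
        f"{data['started_at'].isoformat()} (incident #{data['incident_id']}). "
        f"ACK: {data['ack_url']}"
    )


sample = json.dumps({
    "incident_id": 47,
    "monitor": "API server",
    "started_at": "2026-05-28T14:32:00Z",
    "step_order": 2,
    "ack_url": "https://epulz.io/incident/47/ack?token=...",
})
message = summarize(parse_escalation(sample))
```

Rejecting payloads with missing fields early keeps a malformed request from producing a half-empty alert.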
FAQ
What if the incident lasts longer than the last step?
No further escalations are sent after the last step. For additional reminders, add more steps with larger delays (e.g. 240 min).
Does escalation work during quiet hours?
Yes. Escalation uses its own channel (email/webhook) and is not subject to quiet hours, which only affect primary notifications.
Can I test a policy without waiting for a real outage?
Easiest way: create a test monitor with URL https://test-bad-domain.invalid, assign it a policy with a short delay (1 min) and watch the emails come in.