Alert Rules: Use-Cases and Examples

For the purpose of the following examples, we'll reference a sample incoming alert, like the one below:

{
    "message": "This is a sample incoming incident title",
    "summary": "This is a sample incoming incident summary describing the issue",
    "alert_type": "critical",
    "entity_id": "42",
    "payload": {
        "status": "ACME Payments are failing",
        "team": ["devops", "payment"],
        "project": "kubeprod",
        "severity": "1",
        "release": {
            "version": "2.3.1"
        },
        "module": {
            "name": "payments"
        }
    }
}

Example 1: Modifying Incident Summary

You can modify incident titles and summaries by referencing the alert data (message, summary, entity_id, payload) within placeholders, for example:
"The incident message is {{ message }}", "The incident summary is {{summary}}", or
"Incident Created with Status - {{payload.status}} in release {{payload.release.version}}".
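
To make the placeholder behaviour concrete, here is a minimal Python sketch of how such a template could be resolved against the sample alert. It is an illustration only, not Zenduty's implementation: the `lookup` and `render` helpers are hypothetical names, and the sketch assumes that dotted keys walk nested objects and that whitespace inside the braces is ignored.

```python
import re

# Trimmed copy of the sample alert shown earlier.
alert = {
    "message": "This is a sample incoming incident title",
    "summary": "This is a sample incoming incident summary describing the issue",
    "entity_id": "42",
    "payload": {
        "status": "ACME Payments are failing",
        "release": {"version": "2.3.1"},
    },
}

# Matches {{ key }} or {{key}}, where key may be dotted (assumed syntax).
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def lookup(data, dotted_key):
    """Walk a dotted key such as 'payload.release.version' through nested dicts."""
    value = data
    for part in dotted_key.split("."):
        value = value[part]  # raises KeyError if the key is absent
    return value


def render(template, data):
    """Replace every placeholder in the template with its value from data."""
    return PLACEHOLDER.sub(lambda m: str(lookup(data, m.group(1))), template)


print(render(
    "Incident Created with Status - {{payload.status}} "
    "in release {{payload.release.version}}",
    alert,
))
```

How a missing key is handled in the real product is not covered here; this sketch simply raises an error.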

Let's say our payload also contains an additional field specifying the URL that caused the issue - "url": "www.acme.com/checkout/payments/issue-url/"

If we want to attach this field to our summary for additional context, we can modify the alert summary by using a rule with Payload Key Search as shown below:

Example 2: Modifying Incident Details - Changing Alert Summary

Example 3: Suppressing Incidents Outside Business Hours

The above issue can also be solved via maintenance windows.

Example 4: Suppressing Recurring Incidents

This Alert Rule would suppress similar recurring incidents that your team has already resolved in the last X seconds. A similar incident according to Zenduty is a previous incident created with the same entity_id as the new incoming alert. This can be applied in cases where the issue has been resolved in Zenduty but the alert source is still sending alerts shortly after a similar incident was resolved.
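
The suppression condition described above can be sketched in Python as follows. This is a hedged illustration of the logic only: `should_suppress`, `resolved_at_by_entity`, and `window_seconds` are hypothetical names, and the sketch assumes the platform tracks the time each entity_id was last resolved.

```python
from datetime import datetime, timedelta


def should_suppress(entity_id, received_at, resolved_at_by_entity, window_seconds):
    """Return True if an incident with the same entity_id was resolved
    no more than window_seconds before the new alert arrived."""
    resolved_at = resolved_at_by_entity.get(entity_id)
    if resolved_at is None:
        return False  # no similar incident has been resolved
    elapsed = received_at - resolved_at
    return timedelta(0) <= elapsed <= timedelta(seconds=window_seconds)


# Hypothetical history: the incident for entity_id "42" was resolved at 12:00.
resolved = {"42": datetime(2024, 1, 1, 12, 0, 0)}

# Alert arriving 4 minutes later, with X = 300 seconds: suppressed.
print(should_suppress("42", datetime(2024, 1, 1, 12, 4, 0), resolved, 300))
# Alert arriving 10 minutes later: falls outside the window, not suppressed.
print(should_suppress("42", datetime(2024, 1, 1, 12, 10, 0), resolved, 300))
```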

Example 5: Suppress Low Urgency Incidents

Example 6: Changing Alert Type for Non-Prod Alerts to Warnings

Example 7: Auto-Resolve Incidents

The above alert rule would automatically resolve an incident when receiving an alert with Closed mentioned in the title.

Example 8: Prevent Certain Incidents from Auto-Resolving

If an Incident has security mentioned in its title, the above rule will attach the Resolved Alert to the Incident as an Info Alert, prevent auto-resolution, and attach a note letting responders know that manual investigation is required.

Example 9: Sync Alert Entity ID with Integration Source

In the example below, if the key metric.name is available at the payload JSON path, Zenduty will use its value as the entity ID, overriding the default entity ID.

Every alert coming from an integration source has an `entity_id`.
An `entity_id` points to a resource or an entity in the source application. For example, the `entity_id` of an alert from Jira will be its `Ticket ID`.
There are some tools integrating with Zenduty that allow you to use a custom payload for notifications. In such cases, it is possible to sync the Zenduty Alert Entity ID with a user-specified value being sent by the integration, enabling easier association and collation if required.

Example 10: Changing Alert Entity ID of Incoming Alerts

The above rule would collate all incidents with a certain matching phrase into a single incident by changing their entity id to a preset value. This may be useful to collect similar incidents with a single course-of-action fix or repetitive warnings together.

Hashed Entity ID

You can also hash your Entity ID by using the Hash Entity ID action. This allows integrations to accept Entity IDs longer than 32 characters: the ID is hashed, and the hash replaces the original entity ID. The original entity ID is still viewable in the alert payload.
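
The idea of reducing a long entity ID to a fixed-length value can be sketched as below. The hash function Zenduty uses is not stated on this page, so the choice of SHA-256 truncated to 32 hexadecimal characters is purely an assumption for illustration, as are the `hash_entity_id` helper and the example ID.

```python
import hashlib


def hash_entity_id(entity_id: str) -> str:
    """Map an entity ID of any length to a 32-character hex string.

    Assumption: SHA-256 truncated to 32 hex characters. The algorithm
    actually used by the Hash Entity ID action is not documented here.
    """
    return hashlib.sha256(entity_id.encode("utf-8")).hexdigest()[:32]


# Hypothetical entity ID that is longer than 32 characters.
long_id = "arn:aws:cloudwatch:us-east-1:123456789012:alarm:payments-latency-high"
hashed = hash_entity_id(long_id)

print(len(long_id) > 32)   # the original ID exceeds the limit
print(len(hashed))         # the hashed ID is always 32 characters
```

Because the hash is deterministic, alerts carrying the same original ID still map to the same hashed ID, so they can continue to be grouped together.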

Example 11: Create New Incidents for Similar Alerts

The above rule will create a new incident if a similar alert is received more than 30 minutes after the first incident was created, and will add RECURRING to the title.
Leaving the Change Alert Entity ID field blank assigns the incident a random entity ID, allowing you to override collation in such cases and create new Zenduty Incidents for recurring alerts. This rule is generally not recommended because it increases the number of noisy alerts, but it may solve some issues depending on the tooling.

Example 12: Change Alert Message to Add Severity

Example 13: Route to Different Escalation Policy

Example 14: Assign an incident to a specific User for specific Incidents

Example 15: Adding a Responder Based on a Value Inside Your Payload

In the above example, we're matching the team associated with the event and adding an additional responder in case the incident is regarding the DevOps team.

We can specify the index that we'd like to parse whenever required.
So, if we want to know the secondary team associated with our event, we can change the above rule by replacing $.team[0] with $.team[1].
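
As a rough illustration of how a path such as $.team[0] selects a value, the Python sketch below resolves a small subset of JSON path syntax (dotted keys and numeric indexes) against the sample payload. The `resolve` helper is hypothetical and much simpler than a full JSON path engine; it assumes `$` refers to the alert payload, as the paths above suggest.

```python
import re

# One path step: either ".key" or "[index]".
TOKEN = re.compile(r"\.([A-Za-z_][\w-]*)|\[(\d+)\]")


def resolve(path, payload):
    """Resolve a path like '$.team[0]' or '$.release.version' against payload."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    value = payload
    for key, index in TOKEN.findall(path[1:]):
        # Exactly one of key/index is non-empty for each matched step.
        value = value[key] if key else value[int(index)]
    return value


# Payload portion of the sample alert.
payload = {
    "status": "ACME Payments are failing",
    "team": ["devops", "payment"],
    "project": "kubeprod",
    "release": {"version": "2.3.1"},
}

print(resolve("$.team[0]", payload))  # primary team
print(resolve("$.team[1]", payload))  # secondary team

# The rule condition from the example: add a responder for DevOps incidents.
if resolve("$.team[0]", payload) == "devops":
    print("add the additional responder")
```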

Example 16: Assigning Priority

Example 17: Adding Notes, Assigning Roles and Incident Urgency

The Add Note action is only applied for the first alert that triggered the incident. For any alert that does not directly trigger an incident, the Add Note action will not be applied.

Example 18: Assigning Incident Tags and SLAs

Example 19: Add Tasks to Incident

Example 20: Using Alert Rule Groups to Change Urgency

You can use alert rule groups to build more intricate logical rules for niche use cases like the one above, in which the incident urgency will be set to low, if the project is kubenonprod OR if the project is kubeprod AND module.name is logging.
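
The grouped condition described above is equivalent to the boolean expression in this short Python sketch. The `is_low_urgency` function is a hypothetical name used only to show how the two groups combine; it does not represent how rules are configured in the product.

```python
def is_low_urgency(payload):
    """Low urgency if project is kubenonprod,
    OR if project is kubeprod AND module.name is logging."""
    project = payload.get("project")
    module_name = payload.get("module", {}).get("name")

    group_a = project == "kubenonprod"
    group_b = project == "kubeprod" and module_name == "logging"
    return group_a or group_b


# The sample alert (kubeprod, payments module) keeps its original urgency.
print(is_low_urgency({"project": "kubeprod", "module": {"name": "payments"}}))
# A kubeprod alert from the logging module is set to low urgency.
print(is_low_urgency({"project": "kubeprod", "module": {"name": "logging"}}))
# Any kubenonprod alert is set to low urgency.
print(is_low_urgency({"project": "kubenonprod", "module": {"name": "payments"}}))
```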

You can also leverage RBAC with Alert Rules to route incidents to other teams or add global users as responders.

If you're still wondering whether Alert Rules could solve a particular use-case of yours, feel free to reach out to our team at contact@zenduty.com.
