Notification policies

Last updated: 2026-04-24 04:15:24

Notification policies define rules for matching, grouping, notifying, and escalating alerts or events, enabling flexible alert dispatch and notification management.

Features

A notification policy is a core configuration in the alert and event processing workflow. It supports the following features:

  • Define matching rules to filter relevant alerts and events.

  • Configure event grouping rules to reduce notification fatigue.

  • Specify notification recipients and time periods.

  • Configure repeat notifications, escalation policies, and recovery methods.

  • Associate action integrations to enable automated processing.

Create a notification policy

  1. Log on to the CloudMonitor 2.0 Console, select the target workspace, and then choose Alert Center > Notification Management > Notification Policies from the left-side navigation pane.

  2. Click Create notification policy to configure the policy.

    • Basic information:

      • Name: The name of the notification policy. Up to 120 characters.

      • Description: A description of the notification policy. Up to 120 characters.

    • Routing rule: A routing rule defines how events are matched and who is notified. Events are evaluated against routing rules in sequential order. The first rule that an event matches is applied, and no further rules are evaluated.

      • Group and Denoise: Before configuring routing rules, you must set the Grouping Fields. Similar events are aggregated by these fields to reduce redundant notifications. You can select up to five grouping fields. In addition to the common grouping fields, you can specify custom fields based on event data. By default, labels._cms_rule_id (alert rule ID) and resource.entity.entity_id (resource entity ID) are used.

      • Routing rule configuration: You can configure multiple routing rules for each notification policy. Click Add Routing Rule to add a new rule.

        • Routing condition: Define the conditions for matching events. You can add multiple conditions.

        • Notification object: Set the recipients for notifications. You can add multiple recipients. For a contact, contact group, or on-call schedule, you must select a specific notification method (phone, SMS, or email). Other recipient types do not require a method selection.

        • Effective time: By default, the policy is effective 24/7. Click Modification time to specify a custom time window.

    • Notification rule (Optional): Configure advanced notification settings, including templates, recovery notifications, auto-recovery, repeat notifications, and action integrations.

      • Notification template settings: Configure a notification template for different notification channels.

      • Recovery notification:

        • Send recovery notification: Sends a notification when an alert recovers.

        • Do not send recovery notification (default): Does not send a notification when an alert recovers.

      • Auto-recovery: By default, alerts are configured to automatically recover after 10 minutes (600 seconds).

        • Alerts do not recover automatically: If an alert remains unresolved when its event data expires, the alert does not recover automatically.

        • Alerts recover automatically after N min/sec: If an alert remains unresolved when its event data expires, the alert recovers automatically.

      • Repeat notification: By default, repeat notifications are sent every 10 minutes (600 seconds) for unrecovered alerts.

        • No repeat notification needed: Sends only one notification for an unrecovered alert.

        • Repeat every N min/sec if not recovered: Periodically sends repeat notifications for an unrecovered alert.

      • Action integration: Configure an action integration to run automatically when an alert is triggered or recovers.

        • Trigger Action on Alert: Runs the specified action integration when an alert is triggered.

        • Trigger Action on Recovery: Runs the specified action integration when an alert recovers.

      • Escalation policy: Select a pre-configured escalation policy. If an alert is not acknowledged within a specified time, it is automatically escalated to higher-level personnel.

  3. After you complete the configuration, click OK to finish setting up the notification policy.

  4. The notification policy list page displays all created policies with the following information:

    • Name: The name of the notification policy.

    • Description: The description of the notification policy.

    • Grouping fields: The grouping fields used to group and denoise events.

    • Notification mode: The trigger method and mute time, such as "Trigger immediately, mute for 5 minutes."

    • Number of routing rules: The number of routing rules in the policy.

    • Number of custom notification templates: The number of custom notification templates configured in the policy.

    • Migration source: The source of the policy (Managed Service for Prometheus, ARMS, or CloudMonitor).

    • Last modified: The time when the policy was last modified.

    • Status: The status of the policy (enabled or disabled).

  5. The following operations are available on the list page:

    • Create Notification Policy: Opens the page for creating a notification policy.

    • Search: Supports fuzzy search by policy name.

    • Filter by Status: Filter policies by enabled or disabled status.

    • Sort: Sort by Last Modified time or Status.

    • Edit: Opens the editor to modify the selected policy's configuration.

    • Delete: Deletes the selected policy. Policies synchronized from ARMS or CloudMonitor cannot be deleted.

    • Enable/Disable Switch: Enables or disables the selected policy.

How event matching works

  1. When an event enters a notification policy, it is first grouped and denoised based on the configured Grouping Fields.

  2. The system evaluates the event against the routing rules in sequential order.

  3. For each routing rule, the system checks if the event meets the Routing Condition and the Effective Time.

  4. The first routing rule that matches is applied, and no subsequent rules are evaluated.

  5. A notification is sent to the Notification Object specified in the matched rule.

  6. The system handles recovery notifications, auto-recovery, and repeat notifications according to the Notification Rule configuration.
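
The first-match behavior in steps 2 to 5 can be sketched in a few lines of Python. This is only an illustration of the evaluation order: the RoutingRule class, its field names, and the route function are hypothetical and are not part of any CloudMonitor API.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Callable, List, Optional


@dataclass
class RoutingRule:
    """Hypothetical in-memory model of one routing rule (not a CloudMonitor API)."""
    name: str
    condition: Callable[[dict], bool]   # stands in for the Routing Condition
    recipients: List[str]               # stands in for the Notification Object
    start: time = time(0, 0)            # Effective Time window start (default: all day)
    end: time = time(23, 59, 59)        # Effective Time window end

    def matches(self, event: dict, now: datetime) -> bool:
        # A rule matches only if both the effective time and the condition hold.
        in_window = self.start <= now.time() <= self.end
        return in_window and self.condition(event)


def route(event: dict, rules: List[RoutingRule], now: datetime) -> Optional[RoutingRule]:
    """Return the first matching rule; later rules are never evaluated."""
    for rule in rules:
        if rule.matches(event, now):
            return rule
    return None


rules = [
    RoutingRule("critical", lambda e: e.get("severity") == "CRITICAL", ["on-call"]),
    RoutingRule("catch-all", lambda e: True, ["ops-email"]),
]
now = datetime(2026, 4, 24, 10, 0)
matched = route({"severity": "CRITICAL"}, rules, now)
print(matched.name)  # prints "critical"
```

Because evaluation stops at the first match, the CRITICAL event never reaches the catch-all rule, even though it would also satisfy it.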

Default configuration

The default configuration for a new notification policy is as follows:

  • Grouping fields: labels._cms_rule_id, resource.entity.entity_id

  • Mute time: 300 seconds (5 minutes)

  • Auto-recovery time: 600 seconds (10 minutes)

  • Repeat notification interval: 600 seconds (10 minutes)

  • Recovery notification: Do not send recovery notification

  • Effective time: All time
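
If you track policies in your own tooling, the defaults listed above can be captured in one place. The sketch below is illustrative only: the PolicyDefaults class and its field names are assumptions, and only the values are taken from this page.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class PolicyDefaults:
    """Illustrative container; field names are assumptions, values are from this page."""
    grouping_fields: Tuple[str, ...] = (
        "labels._cms_rule_id",
        "resource.entity.entity_id",
    )
    mute_seconds: int = 300                   # 5 minutes
    auto_recovery_seconds: int = 600          # 10 minutes
    repeat_interval_seconds: int = 600        # 10 minutes
    send_recovery_notification: bool = False  # "Do not send recovery notification"
    effective_time: str = "all"               # effective at all times


defaults = PolicyDefaults()

# Derive a variant without mutating the defaults, for example a longer mute time.
quieter = replace(defaults, mute_seconds=900)
print(defaults.mute_seconds, quieter.mute_seconds)  # prints "300 900"
```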

Policy sources

A notification policy can come from one of the following sources:

  • Managed Service for Prometheus: Policies synchronized from Managed Service for Prometheus.

  • ARMS: Policies synchronized from Application Real-Time Monitoring Service (ARMS).

  • CloudMonitor: Policies created locally in CloudMonitor 2.0.

Note: Policies synchronized from other sources may have functional limitations, such as being undeletable or having settings that cannot be edited.

Best practices

Configure multi-level routing rules

Configure different notification methods for events based on their severity level:

  1. Critical events (CRITICAL):

    • Routing Condition: Severity level equals CRITICAL.

    • Notification Object: On-call personnel (phone + SMS).

    • Effective Time: 24/7.

  2. Warning events (WARNING):

    • Routing Condition: Severity level equals WARNING.

    • Notification Object: Development team (DingTalk group).

    • Effective Time: Weekdays, 9:00 AM–6:00 PM.

  3. Informational events (INFO):

    • Routing Condition: None (catch-all).

    • Notification Object: Operations team email.

    • Effective Time: All time.
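
The three tiers above can be modeled as an ordered table to check how the effective time interacts with the catch-all rule. This is a sketch under assumptions: the recipient strings, the TIERS structure, and the function names are hypothetical. It also assumes that a rule outside its effective time does not match, so evaluation continues with the next rule.

```python
from datetime import datetime, time


def always(now: datetime) -> bool:
    return True


def weekday_business_hours(now: datetime) -> bool:
    # Monday is 0, so values below 5 are weekdays; window is 9:00 AM to 6:00 PM.
    return now.weekday() < 5 and time(9, 0) <= now.time() < time(18, 0)


# Ordered from most specific to catch-all:
# (routing condition, notification object, effective time)
TIERS = [
    ("CRITICAL", ["on-call:phone", "on-call:sms"], always),
    ("WARNING", ["dev-team:dingtalk"], weekday_business_hours),
    (None, ["ops-team:email"], always),  # None means no condition (catch-all)
]


def recipients_for(severity: str, now: datetime) -> list:
    """Return the recipients of the first tier whose condition and time both match."""
    for wanted, recipients, is_effective in TIERS:
        if (wanted is None or wanted == severity) and is_effective(now):
            return recipients
    return []


friday_morning = datetime(2026, 4, 24, 10, 0)
saturday_morning = datetime(2026, 4, 25, 10, 0)
print(recipients_for("WARNING", friday_morning))
print(recipients_for("WARNING", saturday_morning))
```

Under these assumptions, a WARNING event that arrives outside business hours is picked up by the catch-all rule and goes to the operations team email instead of being dropped.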

Group and denoise recommendations

  • Group by rule: Use labels._cms_rule_id to merge events generated by the same rule.

  • Group by resource: Use resource.entity.entity_id to merge events from the same resource.

  • Combine grouping fields: Use both the rule ID and resource ID for more granular event grouping.
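
To see how the choice of grouping fields changes the number of groups, consider the following sketch. It assumes that events are nested dictionaries in which a dotted field name such as resource.entity.entity_id maps to nested keys. The event shape, the sample IDs, and the helper names are hypothetical.

```python
from collections import defaultdict


def get_path(event: dict, dotted: str):
    """Resolve a dotted field path; return None if any part is missing."""
    value = event
    for part in dotted.split("."):
        if not isinstance(value, dict) or part not in value:
            return None
        value = value[part]
    return value


def group_events(events: list, grouping_fields: list) -> dict:
    """Bucket events by the tuple of their grouping-field values."""
    groups = defaultdict(list)
    for event in events:
        key = tuple(get_path(event, field) for field in grouping_fields)
        groups[key].append(event)
    return dict(groups)


events = [
    {"labels": {"_cms_rule_id": "r1"}, "resource": {"entity": {"entity_id": "i-a"}}},
    {"labels": {"_cms_rule_id": "r1"}, "resource": {"entity": {"entity_id": "i-a"}}},
    {"labels": {"_cms_rule_id": "r1"}, "resource": {"entity": {"entity_id": "i-b"}}},
]

by_rule = group_events(events, ["labels._cms_rule_id"])
by_both = group_events(events, ["labels._cms_rule_id", "resource.entity.entity_id"])
print(len(by_rule), len(by_both))  # prints "1 2"
```

Grouping by rule ID alone merges all three events into one group, while adding the resource ID splits them into one group per resource.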

Prevent notification fatigue

  • Configure mute time: Set an appropriate mute time to prevent excessive notifications for a single, ongoing issue.

  • Enable repeat notifications: Periodically remind relevant personnel about unhandled alerts.

  • Configure auto-recovery: Set a reasonable auto-recovery time for transient alerts.

  • Configure an escalation policy: Ensure critical alerts are not missed.
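
The effect of a mute time can be estimated with a small simulation. The sketch assumes that once a notification is sent for a group, further events in that group are suppressed until the mute time has elapsed. The exact semantics in CloudMonitor may differ, and the function name is hypothetical.

```python
def notifications_sent(event_times, mute_seconds=300):
    """Return the arrival times (in seconds) that would produce a notification.

    Assumed semantics: an event is suppressed if it arrives less than
    mute_seconds after the last notification sent for the same group.
    """
    sent = []
    for t in sorted(event_times):
        if not sent or t - sent[-1] >= mute_seconds:
            sent.append(t)
    return sent


# Six events for one ongoing issue, with the default 300-second mute time.
print(notifications_sent([0, 30, 60, 310, 320, 700]))  # prints [0, 310, 700]
```

With these inputs, six events result in three notifications. Raising the mute time reduces the count further, at the cost of a longer delay before the next reminder.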

FAQ

What is the matching order for routing rules?

Routing rules are matched sequentially from top to bottom. The first rule that matches is applied. We recommend placing rules with more specific conditions at the top and catch-all rules at the bottom.
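
A simple check can flag rules that can never be reached because a catch-all rule sits above them. The sketch below uses a hypothetical representation in which each rule is a (name, condition) pair and a condition of None means catch-all. It also assumes the catch-all rule is effective at all times.

```python
def shadowed_rules(rules):
    """Return the names of rules placed after the first catch-all rule.

    rules: ordered list of (name, condition) pairs; condition None = catch-all.
    Assumes the catch-all rule is effective 24/7, so nothing below it is reached.
    """
    for index, (_, condition) in enumerate(rules):
        if condition is None:
            return [name for name, _ in rules[index + 1:]]
    return []


bad_order = [("ops-email", None), ("critical", "severity == CRITICAL")]
good_order = [("critical", "severity == CRITICAL"), ("ops-email", None)]
print(shadowed_rules(bad_order))   # prints ['critical']
print(shadowed_rules(good_order))  # prints []
```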

Why did a recipient not receive a notification?

Check the following settings:

  1. Verify that the notification policy is enabled.

  2. Ensure the routing conditions correctly match the event.

  3. Confirm that the effective time includes the current time.

  4. Check that the contact information for the recipient is correct.

  5. Make sure a notification method is selected.

How can I avoid a notification storm?

  1. Configure grouping fields to group related events.

  2. Set an appropriate mute time.

  3. Configure the repeat notification interval to avoid frequent notifications.

  4. Use routing conditions to filter out low-priority events.
