Notification policies
Notification policies define rules for matching, grouping, notifying, and escalating alerts or events, enabling flexible alert dispatch and notification management.
Features
A notification policy is a core configuration in the alert and event processing workflow. It supports the following features:
-
Define matching rules to filter relevant alerts and events.
-
Configure event grouping rules to reduce notification fatigue.
-
Specify notification recipients and time periods.
-
Configure repeat notifications, escalation policies, and recovery methods.
-
Associate action integrations to enable automated processing.
Create a notification policy
-
Log on to the CloudMonitor 2.0 Console, select the target workspace, and then choose from the left-side navigation pane.
-
Click Create notification policy to configure the policy.
-
Basic information:
-
Name: The name of the notification policy. Up to 120 characters.
-
Description: A description of the notification policy. Up to 120 characters.
-
Routing rule: A routing rule defines how events are matched and who is notified. Events are evaluated against routing rules in sequential order. The first rule that an event matches is applied, and no further rules are evaluated.
-
Group and Denoise: Before configuring routing rules, you must set the Grouping Fields. Similar events are aggregated by these fields to reduce redundant notifications. You can select up to five grouping fields. In addition to the common grouping fields, you can specify custom fields based on event data. By default, labels._cms_rule_id (alert rule ID) and resource.entity.entity_id (resource entity ID) are used.
-
Routing rule configuration: You can configure multiple routing rules for each notification policy. Click Add Routing Rule to add a new rule.
-
Routing condition: Define the conditions for matching events. You can add multiple conditions.
-
Notification object: Set the recipients for notifications. You can add multiple recipients. For a contact, contact group, or on-call schedule, you must select a specific notification method (phone, SMS, or email). Other recipient types do not require a method selection.
-
Effective time: By default, the policy is effective 24/7. Click Modification time to specify a custom time window.
-
Notification rule (Optional): Configure advanced notification settings, including templates, recovery notifications, auto-recovery, repeat notifications, and action integrations.
-
Notification template settings: Configure a notification template for different notification channels.
-
Recovery notification:
| Option | Description |
| --- | --- |
| Send recovery notification | Sends a notification when an alert recovers. |
| Do not send recovery notification | Does not send a notification when an alert recovers (default). |
-
Auto-recovery: By default, alerts are configured to automatically recover after 10 minutes (600 seconds).
| Option | Description |
| --- | --- |
| Alerts do not recover automatically | If an alert remains unresolved when its event data expires, the alert does not recover automatically. |
| Alerts recover automatically after N min/sec | If an alert remains unresolved when its event data expires, the alert recovers automatically. |
-
Repeat notification: By default, repeat notifications are sent every 10 minutes (600 seconds) for unrecovered alerts.
| Option | Description |
| --- | --- |
| No repeat notification needed | Sends only one notification for an unrecovered alert. |
| Repeat every N min/sec if not recovered | Periodically sends repeat notifications for an unrecovered alert. |
-
Action integration: Configure an action integration to run automatically when an alert is triggered or recovers.
| Parameter | Description |
| --- | --- |
| Trigger Action on Alert | Runs the specified action integration when an alert is triggered. |
| Trigger Action on Recovery | Runs the specified action integration when an alert recovers. |
-
Escalation policy: Select a pre-configured escalation policy. If an alert is not acknowledged within a specified time, it is automatically escalated to higher-level personnel.
-
After you complete the configuration, click OK to finish setting up the notification policy.
-
The notification policy list page displays all created policies with the following information:
| Field | Description |
| --- | --- |
| Name | The name of the notification policy. |
| Description | The description of the notification policy. |
| Grouping fields | The grouping fields used to group and denoise events. |
| Notification mode | Displays the trigger method and mute time, such as "Trigger immediately, mute for 5 minutes." |
| Number of routing rules | The number of routing rules in the policy. |
| Number of custom notification templates | The number of custom notification templates configured in the policy. |
| Migration source | The source of the policy (Managed Service for Prometheus, ARMS, or CloudMonitor). |
| Last modified | The time when the policy was last modified. |
| Status | The status of the policy (enabled or disabled). |
-
The following operations are available on the list page:
-
Create Notification Policy: Opens the page for creating a notification policy.
-
Search: Supports fuzzy search by policy name.
-
Filter by Status: Filter policies by enabled or disabled status.
-
Sort: Sort by Last Modified time or Status.
-
Edit: Opens the editor to modify the selected policy's configuration.
-
Delete: Deletes the selected policy. Policies synchronized from ARMS or CloudMonitor cannot be deleted.
-
Enable/Disable Switch: Enables or disables the selected policy.
How event matching works
-
When an event enters a notification policy, it is first grouped and denoised based on the configured Grouping Fields.
-
The system evaluates the event against the routing rules in sequential order.
-
For each routing rule, the system checks if the event meets the Routing Condition and the Effective Time.
-
The first routing rule that matches is applied, and no subsequent rules are evaluated.
-
A notification is sent to the Notification Object specified in the matched rule.
-
The system handles recovery notifications, auto-recovery, and repeat notifications according to the Notification Rule configuration.
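The evaluation order described above can be sketched in a few lines of Python. This is an illustrative model only, not the CloudMonitor implementation: the `RoutingRule` class, its field names, the `first_match` helper, and the flat event dictionary are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional

Event = dict  # hypothetical event shape: a flat dict of field -> value


@dataclass
class RoutingRule:
    """Hypothetical model of one routing rule (names are illustrative)."""
    name: str
    condition: Callable[[Event], bool]        # stands in for the Routing Condition
    is_effective: Callable[[datetime], bool]  # stands in for the Effective Time
    recipients: list[str]                     # stands in for the Notification Object


def first_match(event: Event, rules: list[RoutingRule], now: datetime) -> Optional[RoutingRule]:
    """Return the first rule whose condition and effective time both hold."""
    for rule in rules:  # rules are evaluated in their configured order
        if rule.condition(event) and rule.is_effective(now):
            return rule  # stop here: later rules are never evaluated
    return None  # no rule matched


def always(now: datetime) -> bool:
    """Models the default 24/7 effective time."""
    return True


rules = [
    RoutingRule("critical", lambda e: e.get("severity") == "CRITICAL", always, ["on-call"]),
    RoutingRule("catch-all", lambda e: True, always, ["ops-email"]),
]

matched = first_match({"severity": "CRITICAL"}, rules, datetime(2024, 1, 1, 12, 0))
print(matched.name)  # critical
```

Because the loop returns on the first hit, the position of a rule in the list decides which recipients are used when several rules could match the same event.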
Default configuration
The default configuration for a new notification policy is as follows:
| Configuration item | Default value |
| --- | --- |
| Grouping fields | labels._cms_rule_id, resource.entity.entity_id |
| Mute time | 300 seconds (5 minutes) |
| Auto-recovery time | 600 seconds (10 minutes) |
| Repeat notification interval | 600 seconds (10 minutes) |
| Recovery notification | Do not send recovery notification |
| Effective time | All time |
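For readers who manage policies as code, the defaults can be captured in a small data structure. The sketch below is only one possible representation: `PolicyDefaults` and its field names are hypothetical and do not correspond to any CloudMonitor API. Only the values come from this page.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PolicyDefaults:
    """Hypothetical container for the documented defaults (field names are illustrative)."""
    grouping_fields: tuple[str, ...] = ("labels._cms_rule_id", "resource.entity.entity_id")
    mute_seconds: int = 300              # 5 minutes
    auto_recovery_seconds: int = 600     # 10 minutes
    repeat_interval_seconds: int = 600   # 10 minutes
    send_recovery_notification: bool = False
    effective_time: str = "all"          # placeholder meaning "all time"


DEFAULTS = PolicyDefaults()

# Derive a variant without mutating the shared defaults.
quiet = replace(DEFAULTS, mute_seconds=900, send_recovery_notification=True)
print(DEFAULTS.mute_seconds, quiet.mute_seconds)  # 300 900
```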
Policy sources
Notification policies can come from multiple sources:
| Source type | Description |
| --- | --- |
| Managed Service for Prometheus | Policies synchronized from Managed Service for Prometheus. |
| ARMS | Policies synchronized from Application Real-Time Monitoring Service (ARMS). |
| CloudMonitor | Policies created locally in CloudMonitor 2.0. |
Note: Policies synchronized from other sources may have functional limitations, such as being undeletable or having settings that cannot be edited.
Best practices
Configure multi-level routing rules
Configure different notification methods for events based on their severity level:
-
Critical events (CRITICAL):
-
Routing Condition: Severity level equals CRITICAL.
-
Notification Object: On-call personnel (phone + SMS).
-
Effective Time: 24/7.
-
Warning events (WARNING):
-
Routing Condition: Severity level equals WARNING.
-
Notification Object: Development team (DingTalk group).
-
Effective Time: Weekdays, 9:00 AM–6:00 PM.
-
Informational events (INFO):
-
Routing Condition: None (catch-all).
-
Notification Object: Operations team email.
-
Effective Time: All time.
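One consequence of this layout is worth checking: since a rule applies only when both its Routing Condition and its Effective Time hold, a WARNING event that arrives outside working hours would not match the second rule and would fall through to the catch-all. The sketch below models the three rules to show this. The `Rule` class, its fields, and the recipient strings are hypothetical, and the weekly window is a simplification of whatever the console actually stores.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Rule:
    """Hypothetical routing rule: severity filter plus a weekly time window."""
    severity: Optional[str]  # None models the catch-all condition
    weekdays: range          # Monday is 0, Sunday is 6
    start_hour: int          # inclusive
    end_hour: int            # exclusive
    recipients: str

    def matches(self, severity: str, now: datetime) -> bool:
        in_window = now.weekday() in self.weekdays and self.start_hour <= now.hour < self.end_hour
        return in_window and self.severity in (None, severity)


RULES = [
    Rule("CRITICAL", range(0, 7), 0, 24, "on-call (phone + SMS)"),
    Rule("WARNING", range(0, 5), 9, 18, "dev team (DingTalk group)"),
    Rule(None, range(0, 7), 0, 24, "ops team email"),
]


def recipients_for(severity: str, now: datetime) -> Optional[str]:
    """Apply first-match semantics over RULES."""
    for rule in RULES:
        if rule.matches(severity, now):
            return rule.recipients
    return None


print(recipients_for("WARNING", datetime(2024, 1, 3, 10, 0)))  # Wednesday 10:00 -> dev team (DingTalk group)
print(recipients_for("WARNING", datetime(2024, 1, 6, 10, 0)))  # Saturday 10:00 -> ops team email
```

If weekend warnings should stay silent instead of reaching the operations mailbox, the catch-all condition would need to exclude them explicitly.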
Group and denoise recommendations
-
Group by rule: Use labels._cms_rule_id to merge events generated by the same rule.
-
Group by resource: Use resource.entity.entity_id to merge events from the same resource.
-
Combine grouping fields: Use both the rule ID and resource ID for more granular event grouping.
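To see how the choice of grouping fields changes the number of groups, consider the following sketch. It assumes events are nested dictionaries addressed by dotted paths, which is an assumption about the event shape. `get_path` and `group_events` are hypothetical helpers, not part of any CloudMonitor SDK.

```python
from collections import defaultdict


def get_path(event: dict, path: str):
    """Resolve a dotted path such as 'labels._cms_rule_id' in a nested dict."""
    value = event
    for key in path.split("."):
        if not isinstance(value, dict):
            return None  # path runs past a non-dict value
        value = value.get(key)
    return value


def group_events(events: list[dict], fields: list[str]) -> dict[tuple, list[dict]]:
    """Bucket events by the tuple of their grouping-field values."""
    groups = defaultdict(list)
    for event in events:
        key = tuple(get_path(event, field) for field in fields)
        groups[key].append(event)
    return dict(groups)


events = [
    {"labels": {"_cms_rule_id": "r1"}, "resource": {"entity": {"entity_id": "host-a"}}},
    {"labels": {"_cms_rule_id": "r1"}, "resource": {"entity": {"entity_id": "host-a"}}},
    {"labels": {"_cms_rule_id": "r1"}, "resource": {"entity": {"entity_id": "host-b"}}},
]

by_rule = group_events(events, ["labels._cms_rule_id"])
by_rule_and_resource = group_events(events, ["labels._cms_rule_id", "resource.entity.entity_id"])
print(len(by_rule), len(by_rule_and_resource))  # 1 2
```

Grouping by rule alone merges all three events into one group, while adding the resource ID splits them per host, which is the finer granularity the last recommendation refers to.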
Prevent notification fatigue
-
Configure mute time: Set an appropriate mute time to prevent excessive notifications for a single, ongoing issue.
-
Enable repeat notifications: Periodically remind relevant personnel about unhandled alerts.
-
Configure auto-recovery: Set a reasonable auto-recovery time for transient alerts.
-
Configure an escalation policy: Ensure critical alerts are not missed.
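The effect of a mute time can be illustrated with a small model. It assumes that "Trigger immediately, mute for 5 minutes" means the first event of a group notifies at once and later events of the same group are suppressed until the mute time has elapsed. That reading is an assumption, and `MuteWindow` is a hypothetical class written only for this illustration.

```python
class MuteWindow:
    """Hypothetical per-group suppression: notify at once, then stay quiet for mute_seconds."""

    def __init__(self, mute_seconds: float = 300):
        self.mute_seconds = mute_seconds
        self._last_sent: dict[str, float] = {}  # group key -> time of last notification

    def should_notify(self, group_key: str, now: float) -> bool:
        last = self._last_sent.get(group_key)
        if last is not None and now - last < self.mute_seconds:
            return False  # still inside the mute window
        self._last_sent[group_key] = now
        return True


window = MuteWindow(mute_seconds=300)
decisions = [window.should_notify("rule-1/host-a", t) for t in (0, 120, 299, 300)]
print(decisions)  # [True, False, False, True]
```

Under this model, four events within five minutes produce two notifications instead of four. A longer mute time reduces noise further but delays the next reminder for an ongoing issue.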
FAQ
What is the matching order for routing rules?
Routing rules are matched sequentially from top to bottom. The first rule that matches is applied. We recommend placing rules with more specific conditions at the top and catch-all rules at the bottom.
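A quick way to check an ordering is to look for rules that sit below a catch-all, because first-match evaluation means they can never be reached. The following sketch assumes a hypothetical representation in which each rule is a dictionary with a `name`, a list of `conditions` (empty for a catch-all), and an optional `always_effective` flag. None of these names come from the product.

```python
def unreachable_rules(rules: list[dict]) -> list[str]:
    """Return names of rules placed after an always-effective catch-all rule."""
    shadowed = []
    seen_catch_all = False
    for rule in rules:
        if seen_catch_all:
            shadowed.append(rule["name"])  # an earlier catch-all already wins
        elif not rule["conditions"] and rule.get("always_effective", True):
            seen_catch_all = True
    return shadowed


bad_order = [
    {"name": "catch-all", "conditions": []},
    {"name": "critical", "conditions": ["severity == CRITICAL"]},
]
good_order = list(reversed(bad_order))

print(unreachable_rules(bad_order))   # ['critical']
print(unreachable_rules(good_order))  # []
```

A catch-all with a restricted effective time is not treated as shadowing later rules here, since events outside its window would still continue down the list.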
Why did a recipient not receive a notification?
Check the following settings:
-
Verify that the notification policy is enabled.
-
Ensure the routing conditions correctly match the event.
-
Confirm that the effective time includes the current time.
-
Check that the contact information for the recipient is correct.
-
Make sure a notification method is selected.
How can I avoid a notification storm?
-
Configure grouping fields to group related events.
-
Set an appropriate mute time.
-
Configure the repeat notification interval to avoid frequent notifications.
-
Use routing conditions to filter out low-priority events.