Threshold detection

You can create Threshold Detection alert rules to monitor specific applications. When an alert rule is triggered, the system notifies your specified contacts or DingTalk groups so you can take appropriate action.

Prerequisites

Before you begin, make sure that you have:

  • An application that reports data to Managed Service for OpenTelemetry. For details, see Integration guide.

Configure threshold detection

  1. Log on to the ARMS console.

  2. In the left-side navigation pane, choose Application Monitoring > Application Monitoring Alert Rules.

  3. On the Application Monitoring Alert Rules page, choose Create Alert Rule > Managed Service for OpenTelemetry Alert Rule.

  4. On the Create Alert Rule page, specify an Alert name and set Alert detection type to Threshold Detection.

  5. In the Alert Contact section, select the application, metric type, and filter conditions to monitor.

    Parameter

    Description

    Alert application

    Select the application to monitor. You can select multiple applications or all applications.

    Automatically apply this alert rule to newly created applications

    Specifies whether to automatically apply this alert rule to applications that are integrated later.

    Metric type

    Select the type of metric to monitor. For more information, see Alert rule metrics.

    Note

    The available condition fields for an alert rule and the filter conditions vary based on the selected metric type.

    Filter conditions

    Apply filters to the metric to narrow the monitoring scope.

    Dimension filtering options:

    • Traverse: Evaluates each dimension value individually. The alert notification shows the specific dimension value that triggered the alert.

    • No Dimension: The alert notification shows the sum of all values for this dimension.

    • = : The alert includes only the content for the specified dimension value.

    • !=: The alert includes only the content for dimension values that are not equal to the specified value.

    • Contains: The alert includes only the content for dimension values that contain the specified string.

    • Does not contain: The alert includes only the content for dimension values that do not contain the specified string.

    • Matches regular expression: The alert includes only the content for dimension values that match the specified regular expression.

  6. In the Alert rules section, select an Alert trigger mode and set the Alert conditions.

    Parameter

    Description

    Alert trigger mode

    • Single condition: The alert is triggered when the specified condition is met.

    • Multiple conditions: Select an Alert triggering rule.

      • Meet all of the following rules: The alert is triggered only when all alert conditions are met.

      • Meet one of the following rules: The alert is triggered when any one of the alert conditions is met.

    Alert conditions

    Single condition:

    Set alert rule expressions. You can set thresholds for different severity levels.

    Severity levels range from P4 (lowest) to P1 (highest). You can specify thresholds only for the levels you need.

    Example 1: For the average number of JVM Full GCs in the last 5 minutes, trigger a P4 alert if the value is greater than 1, a P3 alert if greater than 2, a P2 alert if greater than 5, and a P1 alert if greater than 10.

    Example 2: For the average number of JVM Full GCs in the last 5 minutes, trigger a P4 alert if the value is greater than 1.

    Multiple conditions:

    Click Add Condition to set alert rule expressions.

    Example:

    Alert triggering rule: Meet all of the following rules

    Condition 1: In the last 2 minutes, the average error rate of calls is greater than or equal to 5%.

    Condition 2: In the last 2 minutes, the number of calls is greater than or equal to 200.

    In Multiple conditions mode, you must also set the corresponding severity level. Severity levels range from P4 (lowest) to P1 (highest).

    Enter P4 recommended threshold

    You can adjust the threshold by referring to the chart that compares the threshold with the metric. If the rule applies to multiple applications, you can click the icon next to Application to generate a separate recommended threshold for each application.

    ARMS uses an intelligent algorithm to recommend a threshold based on historical metric levels. For more information, see Recommended Thresholds.

    Alert Quantity Prediction

    View the estimated number of times the metric will exceed the threshold within the selected time period. Click a specific alert count to query the metric values that triggered an alert at a historical point in time.

    When creating or modifying an alert rule, we recommend using the Alert Quantity Prediction feature. This feature uses an algorithm to analyze historical data and predict the number of alerts for a selected time period. This helps you adjust your thresholds. For more information, see Alert Quantity Prediction.

  7. Configure Alert notification and advanced alert settings.

    Parameter

    Description

    Alert notification

    Simple mode

    • Notification objects: For information about how to create a notification object, see Notification objects.

    • Notification period: Select the time period during which notifications are sent.

    • Repetition policy:

      • If no escalation policy is set, the notification is sent only once while the alert remains unresolved.

      • If a repetition frequency is set, notifications for unresolved alerts are resent at that frequency until the alert is resolved.

    Standard mode

    Notification policy:

    • If you do not specify a notification policy, triggering an alert will not send a notification by default. A notification is sent only if the alert matches a separate, pre-configured notification policy.

    • Specify a notification rule to send alerts: When an alert is triggered, ARMS sends notifications by using the specified notification policy. You can select an existing notification policy or create one. For more information, see Notification policies.

    Advanced alert settings

    No data

    Specifies how the rule handles data anomalies, such as missing data or anomalies in composite metrics and period-over-period comparisons. When metric data is absent, you can have the system treat the value as 0 or 1 for evaluation, or have it not trigger an alert.

    For more information, see Glossary of alert management.

  8. After you complete the configuration, click Save.
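
The filter and trigger semantics described in steps 5 and 6 can be sketched in a few lines of Python. This is an illustration only, not ARMS code: the function names and operator strings are hypothetical, and two behaviors are assumptions that the text above does not confirm (the regular expression is matched anywhere in the value, and when several severity thresholds are exceeded the most severe level is reported).

```python
import re
from statistics import mean

# Severity order from highest to lowest, as described in step 6.
SEVERITIES = ["P1", "P2", "P3", "P4"]


def dimension_matches(op, value, target):
    """Apply one dimension filter operator (hypothetical names) to a value."""
    if op == "=":
        return value == target
    if op == "!=":
        return value != target
    if op == "contains":
        return target in value
    if op == "not_contains":
        return target not in value
    if op == "regex":
        # Assumption: a match anywhere in the value counts.
        return re.search(target, value) is not None
    raise ValueError(f"unknown operator: {op}")


def single_condition_severity(samples, thresholds):
    """Return the most severe level whose threshold the window average exceeds.

    thresholds maps a severity level to its threshold; levels may be omitted.
    Returns None when no threshold is exceeded.
    """
    avg = mean(samples)
    for level in SEVERITIES:  # P1 is checked first so the most severe match wins
        if level in thresholds and avg > thresholds[level]:
            return level
    return None


def multi_condition_fires(results, mode):
    """Combine per-condition results: 'all' = meet all rules, 'any' = meet one."""
    return all(results) if mode == "all" else any(results)


# Example 1 from step 6: average Full GC count over the window is 6.
full_gc_counts = [4, 6, 8, 6, 6]
levels = {"P4": 1, "P3": 2, "P2": 5, "P1": 10}
print(single_condition_severity(full_gc_counts, levels))  # P2

# Multiple conditions example: error rate >= 5% AND calls >= 200.
error_rate, calls = 0.06, 150
print(multi_condition_fires([error_rate >= 0.05, calls >= 200], "all"))  # False
```

In the second example the error rate condition is met but the call count is not, so a rule set to Meet all of the following rules stays silent, which is the usual reason for pairing an error-rate condition with a minimum traffic condition.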

Recommended thresholds

The recommended thresholds feature analyzes historical data for your selected application, interface, and alert metric to recommend a suitable static threshold. It also generates a real-time chart comparing the metric and the threshold to help you adjust the threshold.

Use cases

  • If you frequently receive alerts for a metric but your system is operating normally, your threshold may be too low or unsuitable for certain applications or interfaces. In this case, use the recommended thresholds feature to adjust the threshold for the alert rule or for specific applications and interfaces. ARMS automatically recommends a new threshold based on historical data.

  • When you need to set different thresholds for a metric across many applications and interfaces, use the recommended thresholds feature. This feature uses an intelligent algorithm to quickly set appropriate thresholds for each application or interface, saving you the effort of manual configuration.

How it works

When you click Enter P4 recommended threshold, ARMS retrieves the last three days of historical data for the specified metric of each application and interface. It then applies the N-sigma algorithm, which is based on the mean and standard deviation (the square root of the variance) of that data. If your business patterns are stable, the metric should approximately follow a normal distribution, so values that deviate significantly from the mean (for example, by three standard deviations) are rare and may indicate an anomaly. Based on this principle, ARMS recommends a threshold derived from the metric's average value and volatility over the last three days.

The P4 alert level has the lowest severity. A recommended P4 threshold indicates a minor anomaly. You can use the P4 recommendation as a baseline to set thresholds for more severe alerts, such as P1, P2, and P3.
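
As a rough illustration of the N-sigma idea, the following Python sketch computes a threshold as the mean plus N standard deviations of a historical series. It is not the ARMS implementation: the function name, the default of N = 3, and the use of the population standard deviation are assumptions made for the example.

```python
from statistics import mean, pstdev


def recommend_threshold(history, n_sigma=3.0):
    """Return mean + n_sigma * standard deviation of the historical samples.

    history: metric values from the look-back window (three days in the text).
    n_sigma: how many standard deviations above the mean count as anomalous.
    """
    if len(history) < 2:
        raise ValueError("need at least two samples")
    return mean(history) + n_sigma * pstdev(history)


# Toy series with mean 5 and population standard deviation 2.
history = [2, 4, 4, 4, 5, 5, 7, 9]
print(recommend_threshold(history))  # 11.0
```

A larger `n_sigma` yields a higher threshold that fires less often. Calling the function with increasing values is one hypothetical way to derive stricter P3, P2, and P1 thresholds from the P4 baseline.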

Alert quantity prediction

The alert quantity prediction feature analyzes historical data to estimate how many alerts a threshold would have triggered within a selected period, and shows the time at which each alert would have occurred.

How it works

ARMS analyzes the last 24 hours of metric data to estimate how many alerts your proposed threshold would have generated. It also lists the times at which the metric value would have exceeded the threshold. You can use this information to adjust thresholds to better fit your business needs.
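
The replay idea can be sketched as follows: walk through the historical data points and record each time the metric rises above the proposed threshold. This is a minimal sketch with a hypothetical function name, not the ARMS algorithm. It assumes that a run of consecutive points above the threshold counts as a single alert, which the text above does not specify.

```python
from datetime import datetime, timedelta


def predict_alerts(points, threshold):
    """Return the timestamps at which the metric rises above the threshold.

    points: list of (timestamp, value) pairs sorted by time.
    Assumption: consecutive points above the threshold count as one alert.
    """
    alerts = []
    above = False
    for ts, value in points:
        if value > threshold and not above:
            alerts.append(ts)  # the metric has just crossed the threshold
        above = value > threshold
    return alerts


# Hourly samples starting at an arbitrary example time.
start = datetime(2024, 1, 1)
values = [1, 3, 12, 14, 2, 13, 1]
points = [(start + timedelta(hours=i), v) for i, v in enumerate(values)]

alerts = predict_alerts(points, threshold=11)
print(len(alerts))  # 2
for ts in alerts:
    print(ts.isoformat())
```

Rerunning the function with a different `threshold` shows how the alert count changes, which mirrors how the feature is meant to be used when tuning a rule.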