Configures a threshold alert rule.
Operation description
This topic provides an example of how to configure a threshold alert rule for the cpu_total metric of the Elastic Compute Service (ECS) instance i-uf6j91r34rnwawoo**** in the acs_ecs_dashboard namespace. The example uses the following settings: the alert contact group is ECS_Group, the alert rule name is test123, and the alert rule ID is a151cd6023eacee2f0978e03863cc1697c89508****. For the Critical level, the statistical method is Average, the comparison operator is GreaterThanOrEqualToThreshold, the threshold is 90, and the retry count is 3.
As of August 15, 2024, the Statistics value is validated more strictly. The statistical method that you specify must match the Statistics of the corresponding metric. For information about how to obtain the value of this parameter, see Alibaba Cloud service monitoring metrics.
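The example settings above can be written out as a flat set of request parameters. The following is a minimal sketch that uses only the values and parameter names given in this topic. The variable name `params` is an illustrative assumption, and how the parameters are signed and sent (SDK, CLI, or HTTP client) is outside the scope of the sketch.

```python
import json

# Request parameters for the example in this topic.
# Names and values come from the request parameter table; nothing is sent here.
params = {
    "RuleId": "a151cd6023eacee2f0978e03863cc1697c89508****",
    "RuleName": "test123",
    "Namespace": "acs_ecs_dashboard",
    "MetricName": "cpu_total",
    # Resources is a string parameter, so the JSON array is serialized first.
    "Resources": json.dumps(
        [{"instanceId": "i-uf6j91r34rnwawoo****"}], separators=(",", ":")
    ),
    "ContactGroups": "ECS_Group",
    # The four Critical-level parameters are set together.
    "Escalations.Critical.Statistics": "Average",
    "Escalations.Critical.ComparisonOperator": "GreaterThanOrEqualToThreshold",
    "Escalations.Critical.Threshold": "90",  # documented as a string
    "Escalations.Critical.Times": 3,         # documented as an integer
}

for name, value in params.items():
    print(f"{name} = {value}")
```

Serializing Resources with compact separators reproduces the format shown in the Example column of the request parameter table.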
RAM authorization
| Action | Access level | Resource type | Condition key | Dependent action |
|---|---|---|---|---|
| cms:PutResourceMetricRule | create | *All Resource | None | None |
Request parameters
|
Parameter |
Type |
Required |
Description |
Example |
| RuleId |
string |
Yes |
The ID of the alert rule. You can enter a new alert rule ID or use the ID of an existing alert rule in CloudMonitor. For information about how to query alert rule IDs, see DescribeMetricRuleList. Note
If you enter a new alert rule ID, a threshold alert rule is created. |
a151cd6023eacee2f0978e03863cc1697c89508**** |
| RuleName |
string |
Yes |
The name of the alert rule. You can enter a new alert rule name or use the name of an existing alert rule in CloudMonitor. For information about how to query alert rule names, see DescribeMetricRuleList. Note
If you enter a new alert rule name, a threshold alert rule is created. |
test123 |
| Namespace |
string |
Yes |
The namespace of the Alibaba Cloud service. For information about how to query the namespace of an Alibaba Cloud service, see Alibaba Cloud service monitoring metrics. Note
If you create a Prometheus alert rule for Hybrid Cloud Monitoring, set this parameter to |
acs_ecs_dashboard |
| MetricName |
string |
Yes |
The name of the metric. For information about how to query metric names, see Alibaba Cloud service monitoring metrics. Note
If you create a Prometheus alert rule for Hybrid Cloud Monitoring, this parameter specifies the name of the metric repository. For information about how to obtain the metric repository name, see DescribeHybridMonitorNamespaceList. |
cpu_total |
| Resources |
string |
No |
The resource information. For information about the supported monitoring dimensions, see Alibaba Cloud service monitoring metrics. |
[{"instanceId":"i-uf6j91r34rnwawoo****"}] |
| ContactGroups |
string |
Yes |
The alert contact group. Alert notifications are sent to the alert contacts in this alert contact group. Note
An alert contact group contains one or more alert contacts. For information about how to create alert contacts and alert contact groups, see PutContact and PutContactGroup. |
ECS_Group |
| Webhook |
string |
No |
The callback URL to which a POST request is sent when an alert is triggered. |
https://alert.aliyun.com.com:8080/callback |
| EffectiveInterval |
string |
No |
The effective period of the alert rule. |
00:00-23:59 |
| NoEffectiveInterval |
string |
No |
The time range during which the alert rule is ineffective. |
00:00-06:00 |
| SilenceTime |
integer |
No |
The mute period. Unit: seconds. Default value: 86400. Note
The mute period specifies the interval at which an alert notification is re-sent if the alert does not recover to Normal. |
86400 |
| Period |
string |
No |
The statistical period of the metric. Unit: seconds. The default value is the original reporting period of the metric. Note
For information about how to query the statistical period of a metric, see Alibaba Cloud service monitoring metrics. |
60 |
| Interval |
string |
No |
The trigger period of the alert rule. Unit: seconds. Note
For information about how to query the statistical period of a metric, see Alibaba Cloud service monitoring metrics. |
60 |
| EmailSubject |
string |
No |
The subject of the alert email. |
ECS instance alert |
| Escalations.Critical.Statistics |
string |
No |
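The statistical method for Critical-level alerts. The value of this parameter is determined by the Statistics of the corresponding metric of the specified cloud service. Note
You must set at least one of the Critical, Warn, and Info alert levels, and you must set the Statistics, ComparisonOperator, Threshold, and Times parameters of that level together. |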
Average |
| Escalations.Critical.ComparisonOperator |
string |
No |
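The comparison operator for the Critical-level threshold. Valid values:
Note
You must set at least one of the Critical, Warn, and Info alert levels, and you must set the Statistics, ComparisonOperator, Threshold, and Times parameters of that level together. |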
GreaterThanOrEqualToThreshold |
| Escalations.Critical.Threshold |
string |
No |
The alert threshold for the Critical level. Note
You must set at least one of the Critical, Warn, and Info alert levels, and you must set the Statistics, ComparisonOperator, Threshold, and Times parameters of that level together. |
90 |
| Escalations.Critical.Times |
integer |
No |
The alert retry count for the Critical level. Note
You must set at least one of the Critical, Warn, and Info alert levels, and you must set the Statistics, ComparisonOperator, Threshold, and Times parameters of that level together. |
3 |
| Escalations.Warn.Statistics |
string |
No |
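The statistical method for Warn-level alerts. The value of this parameter is determined by the Statistics of the corresponding metric of the specified cloud service. Note
You must set at least one of the Critical, Warn, and Info alert levels, and you must set the Statistics, ComparisonOperator, Threshold, and Times parameters of that level together. |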
Average |
| Escalations.Warn.ComparisonOperator |
string |
No |
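The comparison operator for the Warn-level threshold. Valid values:
Note
You must set at least one of the Critical, Warn, and Info alert levels, and you must set the Statistics, ComparisonOperator, Threshold, and Times parameters of that level together. |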
GreaterThanOrEqualToThreshold |
| Escalations.Warn.Threshold |
string |
No |
The alert threshold for the Warn level. Note
You must set at least one of the Critical, Warn, and Info alert levels, and you must set the Statistics, ComparisonOperator, Threshold, and Times parameters of that level together. |
90 |
| Escalations.Warn.Times |
integer |
No |
The alert retry count for the Warn level. Note
You must set at least one of the Critical, Warn, and Info alert levels, and you must set the Statistics, ComparisonOperator, Threshold, and Times parameters of that level together. |
3 |
| Escalations.Info.Statistics |
string |
No |
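The statistical method for Info-level alerts. The value of this parameter is determined by the Statistics of the corresponding metric of the specified cloud service. Note
You must set at least one of the Critical, Warn, and Info alert levels, and you must set the Statistics, ComparisonOperator, Threshold, and Times parameters of that level together. |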
Average |
| Escalations.Info.ComparisonOperator |
string |
No |
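The comparison operator for the Info-level threshold. Valid values:
Note
You must set at least one of the Critical, Warn, and Info alert levels, and you must set the Statistics, ComparisonOperator, Threshold, and Times parameters of that level together. |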
GreaterThanOrEqualToThreshold |
| Escalations.Info.Threshold |
string |
No |
The alert threshold for the Info level. Note
You must set at least one of the Critical, Warn, and Info alert levels, and you must set the Statistics, ComparisonOperator, Threshold, and Times parameters of that level together. |
90 |
| Escalations.Info.Times |
integer |
No |
The alert retry count for the Info level. Note
You must set at least one of the Critical, Warn, and Info alert levels, and you must set the Statistics, ComparisonOperator, Threshold, and Times parameters of that level together. |
3 |
| NoDataPolicy |
string |
No |
The processing method when no monitoring data is found. Valid values:
|
KEEP_LAST_STATE |
| CompositeExpression |
object |
No |
The alert conditions for multiple metrics. Note
Single-metric and multi-metric alert conditions are mutually exclusive and cannot be set at the same time. |
|
| ExpressionList |
array<object> |
No |
The list of alert conditions created in standard mode. |
|
|
object |
No |
None. |
||
| MetricName |
string |
No |
The metric name of the Alibaba Cloud service. |
cpu_total |
| Period |
integer |
No |
The aggregation period of the metric. Unit: seconds. |
60 |
| Statistics |
string |
No |
The statistical method of the metric. Valid values:
Note
|
$Maximum |
| ComparisonOperator |
string |
No |
The comparison operator for the threshold. Valid values:
|
GreaterThanOrEqualToThreshold |
| Threshold |
string |
No |
The alert threshold. |
90 |
| ExpressionListJoin |
string |
No |
The relationship between multi-metric alert conditions. Valid values:
|
|| |
| ExpressionRaw |
string |
No |
The alert condition created by using an expression. The following scenarios are supported:
|
$Average > ($instanceId == 'i-io8kfvcpp7x5****'? 80: 50) |
| Level |
string |
No |
The alert level. Valid values:
|
CRITICAL |
| Times |
integer |
No |
The number of times that the alert condition must be met before an alert notification is sent. |
3 |
| Labels |
array<object> |
No |
The labels that are written to the metric and displayed in alert notifications when the metric meets the alert condition. Note
This feature is the same as the Label feature in Prometheus alerting. |
|
|
object |
No |
None. |
||
| Key |
string |
No |
The label key. |
tagKey1 |
| Value |
string |
No |
The label value. Note
The label value supports template parameters. Template parameters are replaced with actual label values. |
ECS |
| Prometheus |
object |
No |
The Prometheus alert configuration. Note
Set this parameter only when you create a Prometheus alert rule for Hybrid Cloud Monitoring. |
|
| PromQL |
string |
No |
The PromQL query statement. Note
The data obtained by the PromQL query statement is the alert data. Include the alert threshold in this statement. |
cpuUsage{instanceId="xxxx"}[1m]>90 |
| Level |
string |
No |
The alert level. Valid values:
|
CRITICAL |
| Times |
integer |
No |
The number of times that the alert condition must be met before an alert notification is sent. |
3 |
| Annotations |
array<object> |
No |
The annotations for Prometheus alerting. The annotation keys and values are rendered to help you understand the metric or alert rule. Note
This feature is equivalent to the Annotation feature in Prometheus. |
|
|
object |
No |
None. |
||
| Key |
string |
No |
The annotation key. |
summary |
| Value |
string |
No |
The annotation value. |
{{ $labels.instance }} CPU usage above 10% {current value: {{ humanizePercentage $value }} } |
| SendOK |
boolean |
No |
Specifies whether to send a recovery notification. |
true |
For information about common request parameters, see Common parameters.
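The notes on the Escalations.* parameters state two constraints for a single-metric rule: at least one of the Critical, Warn, and Info levels must be set, and the Statistics, ComparisonOperator, Threshold, and Times parameters of a level must be set together. The following is a minimal client-side sketch of that check. The function name `check_escalations` and the wording of the messages are hypothetical and are not part of the API. The sketch does not cover rules that use CompositeExpression or Prometheus.

```python
LEVELS = ("Critical", "Warn", "Info")
FIELDS = ("Statistics", "ComparisonOperator", "Threshold", "Times")


def check_escalations(params):
    """Return a list of problems with the Escalations.* parameters.

    An empty list means that both documented constraints are satisfied.
    """
    problems = []
    configured_levels = 0
    for level in LEVELS:
        present = [f for f in FIELDS if f"Escalations.{level}.{f}" in params]
        if not present:
            continue  # This level is not used at all, which is allowed.
        configured_levels += 1
        missing = [f for f in FIELDS if f not in present]
        if missing:
            problems.append(f"{level}: missing {', '.join(missing)}")
    if configured_levels == 0:
        problems.append("no alert level is set")
    return problems


# A Warn level with only Statistics set is reported as incomplete.
print(check_escalations({"Escalations.Warn.Statistics": "Average"}))
```

Running a check like this before the request is sent may surface an incomplete level earlier than the service-side validation would.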
Response elements
|
Element |
Type |
Description |
Example |
|
object |
None. |
||
| Code |
string |
The HTTP status code. Note
A value of 200 indicates success. |
200 |
| Message |
string |
The error message. |
The request processing has failed due to some unknown error. |
| RequestId |
string |
The request ID. |
65D50468-ECEF-48F1-A6E1-D952E89D9436 |
| Success |
boolean |
Indicates whether the operation was successful. Valid values:
|
true |
Examples
Success response
JSON format
{
"Code": "200",
"Message": "The request processing has failed due to some unknown error.",
"RequestId": "65D50468-ECEF-48F1-A6E1-D952E89D9436",
"Success": true
}
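The sample response can be checked as follows. This is a minimal sketch that assumes the response body has already been received as a JSON string. The names `raw_response` and `is_success` are illustrative only. Because the sample carries an error-style Message even though Success is true, the sketch relies on Success and Code and ignores Message.

```python
import json

# The sample success response from this topic.
raw_response = """
{
  "Code": "200",
  "Message": "The request processing has failed due to some unknown error.",
  "RequestId": "65D50468-ECEF-48F1-A6E1-D952E89D9436",
  "Success": true
}
"""


def is_success(response):
    """Treat the call as successful only if Success is true and Code is "200"."""
    # Code is documented as a string, so it is compared with "200", not 200.
    return response.get("Success") is True and response.get("Code") == "200"


response = json.loads(raw_response)
print(is_success(response), response["RequestId"])
```

Keeping the RequestId in logs is useful when a failed request needs to be traced.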
Error codes
| HTTP status code | Error code | Error message | Description |
|---|---|---|---|
| 400 | %s | %s | |
| 499 | %s | %s | |
| 500 | InternalError | The request processing has failed due to some unknown error. | |
| 204 | %s | %s | |
| 403 | %s | %s | |
| 206 | %s | %s | |
| 404 | %s | %s | |
| 503 | %s | %s | |
| 406 | %s | %s | |
| 429 | ResourceOverLimit | The resource has exceeded the limit. %s | |
See Error Codes for a complete list.
Release notes
See Release Notes for a complete list.