Problem management
A problem aggregates related alert events into a single unit based on grouping rules, enabling centralized lifecycle management and notifications.
Overview
Problem management provides:
- Alert event aggregation: Groups related events into problems based on grouping rules.
- Lifecycle management: Tracks problems from creation to resolution.
- Escalation notifications: Supports multi-stage escalation policies and repeated notifications.
- Collaborative handling: Supports claiming and transferring problems, and adding related personnel.
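As an illustration of alert event aggregation, the following minimal sketch groups events into problems by the values of selected fields. It is not the Cloud Monitor 2.0 implementation: the event fields (`id`, `service`, `severity`), the `group_by` tuple, and the `group_events` function are hypothetical names chosen for the example, and the real grouping rule format is not described on this page.

```python
from collections import defaultdict


def group_events(events, group_by):
    """Aggregate events into problems keyed by the values of the group_by fields.

    `events` is a list of dicts and `group_by` is a tuple of field names.
    Both shapes are assumptions made for this sketch.
    """
    problems = defaultdict(list)
    for event in events:
        # Events that share the same values for every grouping field
        # land in the same problem.
        key = tuple(event.get(field) for field in group_by)
        problems[key].append(event)
    return dict(problems)


# Hypothetical alert events.
events = [
    {"id": "e1", "service": "checkout", "severity": "Critical"},
    {"id": "e2", "service": "checkout", "severity": "Critical"},
    {"id": "e3", "service": "search", "severity": "Warning"},
]

problems = group_events(events, group_by=("service",))
for key, grouped in problems.items():
    print(key, [e["id"] for e in grouped])
```

Grouping on more fields, for example `("service", "severity")`, would produce narrower problems, while grouping on fewer fields would merge more events into each one.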
View problems
- Log on to the Cloud Monitor 2.0 console, select the target workspace, and in the left-side navigation pane, choose .
- The page lists all problems and their statuses. Filter by:
| Field | Description |
| --- | --- |
| Severity | Critical, Error, Warning, or Normal |
| Problem title / ID | Title is determined by the alert rule or event subscription name. ID is auto-generated and unique. |
| Notification policy | Notification policy that triggered the alert. |
| Assignee | Person assigned to resolve the problem. |
| Creation time | Time when the problem was created. |
| Resolution status | Current status. Possible values: open (problem is being processed and continues to receive alert events); resolved (problem was manually resolved); recovered (problem was auto-resolved after no new events were received within a specified period). |
| Actions | Available actions, such as claim and resolve. |
- The Problem Details page shows basic information, affected objects, entity topology, problem content, root cause analysis, events, and the activity log. Available actions:
  - For an unresolved problem, claim or resolve it, assign it, or change its severity.
  - View the root cause identified by the Cloud Monitor 2.0 StarOps agent.
  - The Events and Activities tabs:
    - Events: Lists alert events with their creation times and statuses. Click an event name to view details.
    - Activities: Shows the problem activity log.
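The three resolution statuses described in this section (open, resolved, recovered) can be read as a small state model. The sketch below is one possible interpretation, not the product's implementation: the `Problem` class, its method names, and the `quiet_period` parameter are hypothetical, and the length of the "specified period" is not given on this page.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    title: str
    last_event_at: float  # time of the most recent alert event, in seconds
    status: str = "open"

    def add_event(self, now: float) -> None:
        # An open problem continues to receive alert events.
        # Ignoring events for closed problems is an assumption of this sketch.
        if self.status == "open":
            self.last_event_at = now

    def resolve(self) -> None:
        # Manual resolution by an operator.
        if self.status == "open":
            self.status = "resolved"

    def check_recovery(self, now: float, quiet_period: float) -> None:
        # Automatic recovery once no new events arrived within quiet_period.
        if self.status == "open" and now - self.last_event_at >= quiet_period:
            self.status = "recovered"


problem = Problem(title="High CPU on checkout", last_event_at=0.0)
problem.add_event(now=60.0)
problem.check_recovery(now=120.0, quiet_period=300.0)  # still within the quiet period
print(problem.status)
problem.check_recovery(now=360.0, quiet_period=300.0)  # 300 seconds without events
print(problem.status)
```

In this model, resolved and recovered are both terminal states: a manually resolved problem is never switched to recovered afterwards.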
Handle problems
Claim or resolve unresolved problems, assign them, or change their severity.
- On the Problems page, click Associate Operator in the upper-right corner.
  The operator name is the DingTalk nickname. Because multiple users may share one Alibaba Cloud account, you must associate an operator to identify who handles each alert.
- Scan the QR code with DingTalk and bind your mobile number.
- To the right of the target problem, or on the Problem Details page after you click the problem:
  - Click Claim to assign the problem to yourself.
  - Click Resolve to close the problem.