This topic lists all check items in Governance Maturity Assessment Model 3.0 and indicates whether each item supports a quick fix and assisted decision-making.
Security
|
Category |
Check Item |
Description |
Quick Fix Description |
Assisted Decision-Making Support |
|
Personnel identity management |
Multi-factor authentication (MFA) is not enabled for the Alibaba Cloud account |
Enable multi-factor authentication (MFA) for your Alibaba Cloud account for an additional layer of security. A configuration without MFA is noncompliant. |
Not supported. |
No |
|
Personnel identity management |
A Resource Access Management (RAM) user has both console logon and an AccessKey enabled |
In accordance with the principle of least privilege, a RAM user should not have both console logon and an AccessKey. If this condition exists, the configuration is noncompliant. If single sign-on (SSO) is enabled for the account, the console logon setting is ignored. However, the configuration is still considered noncompliant if the user has an AccessKey and has logged on to the console within the last 7 days. |
This fix disables console logon for the selected RAM user. Before you apply the fix, confirm that the user no longer requires console access. |
Yes |
|
Personnel identity management |
A RAM user does not have MFA enabled |
MFA provides an additional layer of security for RAM users. If a RAM user has console logon enabled but does not have MFA enabled, the configuration is noncompliant. |
Not supported. |
Yes |
|
Personnel identity management |
RAM is not used for identity management |
The Alibaba Cloud account has extensive permissions, and a compromise would pose a high security risk. You should use RAM identities for daily operations. If no RAM identities exist, the configuration is noncompliant. |
Not supported. |
No |
|
Personnel identity management |
A RAM user does not meet the password strength requirements |
Enforcing strong password policies reduces the risk of credential stuffing and brute-force attacks. The configuration is noncompliant if password requirements for length, character types, expiration, history, or retry limits are not enforced. |
This fix updates the password strength settings in RAM. The settings are configured as follows: a minimum password length of 8 characters, a requirement for at least three character types, a maximum password age of 90 days, and a limit of five logon attempts per hour. These are best practice recommendations. You can adjust the parameters to enforce stricter requirements. After the setup is complete, the settings apply to all RAM users. |
No |
|
Personnel identity management |
The Alibaba Cloud account has logged on to the console within the last 90 days |
The Alibaba Cloud account has extensive permissions that cannot be restricted by conditions such as source IP or time. A compromise of this account would pose a high security risk. If the account has been used to log on to the console within the last 90 days, the configuration is noncompliant. |
Not supported. |
No |
|
Personnel identity management |
An inactive RAM user exists |
RAM users with console logon enabled use passwords to log on. The longer a password exists, the higher the risk of exposure. If a RAM user has not logged on for more than 90 days, the configuration is noncompliant. |
This fix disables console logon for the selected RAM user. Before you apply the fix, confirm that the user no longer requires console access. Note: If SSO is enabled, disabling console logon does not remove the user from the resource list. To clear the alert, you must delete the RAM user. |
Yes |
|
Personnel identity management |
RAM SSO is not enabled for console logon |
Use SSO to centrally manage user identities and reduce security risks. The configuration is noncompliant if RAM SSO is not configured or if no SSO logons have occurred in the last 30 days. |
Not supported. |
No |
|
Personnel identity management |
Unified management for multi-account identities is recommended |
Use Cloud SSO to manage all users across your organization. You can configure your enterprise identity provider (IdP) for Alibaba Cloud SSO and set uniform access permissions for member accounts in your Resource Directory (RD). If Cloud SSO has not been used for more than 90 days, the configuration is noncompliant. |
Not supported. |
No |
|
Personnel identity management |
RAM SCIM user synchronization is not enabled |
System for Cross-domain Identity Management (SCIM) synchronizes enterprise identities to Alibaba Cloud, which eliminates the need for manual user creation. The configuration is noncompliant if SCIM is not configured or if synchronized users have not logged on for two months. |
Not supported. |
No |
|
Programmatic identity management |
An active AccessKey exists for the Alibaba Cloud account |
An AccessKey for an Alibaba Cloud account grants full permissions to the account. It cannot be restricted by conditions such as source IP or time. A leak of this AccessKey would pose a high security risk. If an active AccessKey exists for the account, the configuration is noncompliant. |
This fix disables the selected AccessKey for the Alibaba Cloud account. Before you apply the fix, confirm that the AccessKey is not used by any programs or applications. Disabling the AccessKey partially improves the security score, but the alert persists until the AccessKey is deleted. Note: Only the Alibaba Cloud account can perform this fix. An attempt to perform this fix using a RAM user or RAM role will fail. |
Yes |
|
Programmatic identity management |
A RAM user has two active AccessKeys |
A RAM user can have at most two AccessKeys. If both are active, no slot is free to create a replacement, so the keys cannot be rotated, which increases security risks. If a RAM user has two active AccessKeys, the configuration is noncompliant. |
This fix disables the selected AccessKey for the RAM user. Before you apply the fix, confirm that the AccessKey is not used by any programs or applications. |
Yes |
|
Programmatic identity management |
An exposed and unhandled AccessKey exists |
If an AccessKey is exposed, attackers can use it to access your resources and data. If an unhandled AccessKey exposure event exists, the configuration is noncompliant. |
Not supported. |
No |
|
Programmatic identity management |
A KMS key is scheduled for deletion (new in model 3.0) |
When a customer master key (CMK) is deleted, it cannot be recovered. Data that was encrypted with the CMK and its related data keys becomes permanently undecryptable. To prevent accidental deletion and service disruptions, ensure that active CMKs are not scheduled for deletion. If a CMK is scheduled for deletion, the configuration is noncompliant. |
This fix cancels the scheduled deletion of the selected KMS key. The key's status changes from Scheduled for Deletion to Enabled. After the key is enabled, it can be used to encrypt and decrypt data, and standard billing charges apply. |
No |
|
Programmatic identity management |
An AccessKey has not been rotated regularly |
Regularly rotating AccessKeys reduces their exposure time and lowers the risk of leaks. If a RAM user's AccessKey has been in use for more than 365 days, the configuration is noncompliant. |
Not supported. |
No |
|
Programmatic identity management |
An inactive AccessKey exists |
A RAM user's AccessKey allows API access to Alibaba Cloud. The longer an AccessKey is exposed, the higher the risk of a leak. If an AccessKey has not been used for more than 365 days, the configuration is noncompliant. |
This fix disables the selected AccessKey for the RAM user. Before you apply the fix, confirm that the AccessKey is not used by any programs or applications. Disabling the AccessKey partially improves the security score, but the alert persists until the AccessKey is deleted. |
Yes |
|
Programmatic identity management |
A Redis instance does not have password authentication enabled (new in model 3.0) |
If a Redis instance in a VPC does not have password authentication enabled or has misconfigured security settings, it can lead to data leaks, unauthorized malicious actions, network security issues, or service interruptions. If password authentication is disabled for a Redis instance in a VPC, the configuration is noncompliant. |
Not supported. |
No |
|
Programmatic identity management |
Programmatic access does not use an AccessKey-free solution |
The configuration is noncompliant if ECS instances do not have instance roles, ACK clusters do not have the RRSA plug-in enabled, or Function Compute services do not have service roles. |
Not supported. |
No |
|
Programmatic identity management |
An ECS instance does not use the hardened Metadata Service (V2) |
Use the hardened Metadata Service (V2) for ECS instances to prevent potential Security Token Service (STS) token leaks that can occur with V1. If an ECS instance uses V1, the configuration is noncompliant. To resolve this issue, upgrade the Metadata Service to V2. |
Not supported. |
No |
|
Permission management |
Too many RAM identities have the AdministratorAccess permission |
In accordance with the principle of least privilege, you should limit the number of RAM identities that are granted the AdministratorAccess permission. This permission allows full control over all resources, and granting it to too many identities increases the impact of a potential identity leak. The configuration is compliant if three or fewer RAM identities have the AdministratorAccess permission. |
This fix replaces the AdministratorAccess permission with the PowerUserAccess permission. PowerUserAccess grants full access to Alibaba Cloud services and resources but does not include permissions to manage RAM identities, Resource Directory, shared resources, or financial account information. The fix analyzes permission audit logs and automatically replaces the AdministratorAccess permission with the PowerUserAccess permission for unused administrator identities. |
No |
|
Permission management |
Too many non-administrator RAM identities have high-risk billing permissions |
In accordance with the principle of least privilege, you should limit high-risk billing permissions. Permissions such as `bss:*`, `bssapi:*`, `bss:PayOrder`, `bss:Modify*`, `bss:Create*`, `bss:*Order*`, and `bss:Delete*` allow users to modify orders, invoices, contracts, bills, transactions, and withdrawals. Mismanagement of these permissions can cause financial loss. The configuration is compliant if three or fewer non-administrator RAM identities have these permissions. |
Not supported. |
No |
|
Permission management |
Too many non-administrator RAM identities have high-privilege permissions |
In accordance with the principle of least privilege, you should limit high-privilege permissions. These permissions allow RAM identities to escalate their own permissions or the permissions of others. Misuse of these permissions can compromise the security and confidentiality of your resources. The configuration is compliant if three or fewer non-administrator RAM identities have high-privilege permissions. A full list of high-privilege permissions is available in the documentation. |
Not supported. |
No |
|
Permission management |
A non-administrator RAM identity has decryption permissions for all KMS keys |
KMS lets you control who can use your keys and access your encrypted data. RAM policies define the actions that identities, such as users, groups, or roles, can perform on specific resources. Follow security best practices by granting only the required permissions and restricting access to specific keys. Granting the `kms:Decrypt` permission for all keys creates an excessive security risk. Instead, identify the minimum set of keys required and grant access only to those keys. For example, you can allow the `kms:Decrypt` action only on specific keys in specific regions. This approach minimizes the risk of data exposure. |
Not supported. |
No |
|
Permission management |
A RAM identity has idle product-level permissions |
Product-level permissions are considered idle if they are not used for a specified period after they are granted. This can occur due to role changes or if the initial permissions granted were too broad. As a best practice, you should reclaim idle permissions to achieve fine-grained authorization. If a RAM identity has idle product-level permissions for 180 days, the configuration is not compliant with best practices. |
Not supported. |
No |
|
Permission management |
A RAM identity has idle high-risk operation permissions |
High-risk operation permissions, such as creating RAM users, are considered idle if they are not used for a specified period after they are granted. This can occur due to role changes or if the initial permissions granted were too broad. As a best practice, you should promptly reclaim idle high-risk permissions to prevent security incidents. If a RAM identity has idle high-risk operation permissions for 180 days, the configuration is not compliant with best practices. |
Not supported. |
No |
|
Permission management |
All RAM identities have the AdministratorAccess permission |
In accordance with the principle of least privilege, you should avoid granting the AdministratorAccess permission to all RAM identities to limit the impact of a potential identity leak. The configuration is compliant if at least one RAM identity has non-administrator permissions. |
This fix replaces the AdministratorAccess permission with the PowerUserAccess permission. PowerUserAccess grants full access to Alibaba Cloud services and resources but does not include permissions to manage RAM identities, Resource Directory, shared resources, or financial account information. The fix analyzes permission audit logs and automatically replaces the AdministratorAccess permission with the PowerUserAccess permission for unused administrator identities. |
No |
|
Permission management |
Access Analyzer is not used for permission management (new in model 3.0) |
Access Analyzer helps you identify resources that are shared with external accounts and detects unexpected sharing to reduce security risks. It also identifies over-permissioned identities and generates analysis reports. |
Not supported. |
No |
|
Permission management |
Using control policies for multi-account boundary protection is recommended |
Resource Directory control policies allow organizations to restrict the cloud services and operations that member accounts can access. This helps centralize permission boundary management and ensures compliance with security standards. The configuration is noncompliant if no custom control policy is created and attached to a Resource Directory folder or member account. |
Not supported. |
No |
|
Permission management |
No RAM user inherits permissions from a RAM user group |
By default, RAM users, groups, and roles cannot access any resources. You must grant permissions using RAM policies. To simplify management and reduce the risk of accidental permission expansion, you should apply policies to groups or roles instead of individual users. The configuration is compliant with best practices if at least one RAM user inherits permissions from a RAM user group. |
Not supported. |
No |
|
Log collection and archiving |
ActionTrail logs are not retained for a long term |
The configuration is noncompliant if no trail is created, or if an existing trail does not archive events from all regions, does not archive all read and write operations, or retains logs for less than 180 days. |
This fix enhances the settings of an existing trail to include all management read and write events and events from all regions. You must select at least one existing trail to update. New events that are generated after the fix is applied are delivered to the trail's destination storage. Historical events are not affected. |
No |
|
Log collection and archiving |
EDAS does not have log collection configured (new in model 3.0) |
Alibaba Cloud Enterprise Distributed Application Service (EDAS) integrates with Simple Log Service (SLS) to collect application logs and container stdout logs from Kubernetes clusters for querying and analysis. If log collection is not configured for EDAS, the configuration is noncompliant. |
Not supported. |
No |
|
Log collection and archiving |
An OSS bucket does not have real-time log query enabled (new in model 3.0) |
The real-time logging feature of OSS records access to buckets for auditing, monitoring, and analysis purposes. The configuration is noncompliant if real-time logging is disabled or if the log retention period is 180 days or less. |
Not supported. |
No |
|
Log collection and archiving |
An OSS bucket does not have log delivery enabled (new in model 3.0) |
OSS access logs can generate a large volume of data. The log delivery feature writes hourly log files to a specified bucket using a fixed naming convention. You can analyze the delivered logs using Simple Log Service or Spark clusters. The configuration is compliant if log delivery is enabled for an OSS bucket. |
Not supported. |
No |
|
Log collection and archiving |
An ESA site does not have a log delivery task configured |
This check verifies that at least one log type is configured for the site to provide real-time access logs for monitoring, analysis, and content delivery optimization. If no log types are configured, the configuration is not compliant with best practices. |
Not supported. |
No |
|
Log collection and archiving |
Cloud Firewall logs are not collected and stored for 180 days or more |
Cloud Firewall automatically logs all traffic and provides visualized audit pages for attack events, traffic details, and operation logs. The default log retention period is 7 days. For compliance and enhanced security, you should store logs for 180 days or more. If you purchased a subscription edition of Cloud Firewall but do not store logs for 180 days or more, the configuration is not compliant with network security best practices. |
This fix enables log analysis for Cloud Firewall and sets the log retention period. To comply with data protection and network security regulations, the retention period must be 180 days or more. After this feature is enabled, Cloud Firewall creates a dedicated Project and Logstore to store all logs. Fees apply based on retention duration and storage capacity. For more information, see the billing details. |
No |
|
Log collection and archiving |
Centralized aggregation of multi-account operation logs is recommended |
ActionTrail stores events for 90 days by default. Creating trails helps you persist operation records for compliance purposes. Multi-account trails allow administrators to centrally track and audit logs across multiple accounts. If no multi-account trail exists, the configuration is noncompliant. |
Not supported. |
No |
|
Log collection and archiving |
Enabling centralized log collection for multi-account environments is recommended |
If the trusted service for Simple Log Service (SLS) log audit is disabled, the configuration is noncompliant. |
Not supported. |
No |
|
Compliance checks |
Cloud resources are noncompliant |
If Cloud Config rules detect noncompliance and the compliant resource rate is below 100%, the configuration is noncompliant. |
Not supported. |
No |
|
Compliance checks |
Cloud Config is not enabled |
If Cloud Config is not enabled, the configuration is noncompliant. |
Not supported. |
No |
|
Compliance checks |
Cloud Config compliance rules are not enabled |
If Cloud Config compliance rules are not enabled, the configuration is noncompliant. |
Not supported. |
No |
|
Compliance checks |
Cloud Config compliance rules do not cover all cloud resources |
If the rule coverage is below 100%, the configuration is noncompliant. |
Not supported. |
No |
|
Compliance checks |
Enabling unified configuration checks for multi-account environments is recommended |
The configuration is noncompliant if either of the following conditions is not met: 1. An account group exists and has rules configured. 2. Noncompliance events are delivered to SLS. |
Not supported. |
No |
|
Compliance checks |
Compliance check data is not retrieved regularly |
If Cloud Config does not deliver noncompliance events or if the results were not viewed in the last 7 days, the configuration is noncompliant. |
This fix creates a resource data delivery task in the current account. It delivers configuration and compliance snapshots and changes to SLS, OSS, or MNS in a specified Alibaba Cloud account for persistence and notifications. Delivery tasks are free, but standard data delivery charges apply. For more information, see the billing details. |
No |
|
Compliance checks |
Resource change or snapshot delivery is not configured |
If Cloud Config does not deliver resource changes or snapshots, the configuration is noncompliant. |
This fix creates a resource data delivery task in the current account. It delivers configuration and compliance snapshots and changes to SLS, OSS, or MNS in a specified Alibaba Cloud account for persistence and notifications. Delivery tasks are free, but standard data delivery charges apply. For more information, see the billing details. |
No |
|
Compute resource protection |
Host baseline issues require remediation |
Viruses and hackers exploit server configuration weaknesses to steal data or install backdoors. Baseline checks assess OS, database, software, and container configurations. Remediating baseline issues strengthens security, reduces intrusion risk, and helps meet compliance requirements. If any unremediated baselines exist, the configuration is noncompliant. |
Not supported. |
No |
|
Compute resource protection |
Vulnerabilities require remediation |
Vulnerability management is a continuous and proactive process. It protects systems, networks, and applications from cyberattacks and data breaches. Timely remediation prevents attacks and minimizes damage. If any unremediated vulnerabilities exist, the configuration is noncompliant. |
Not supported. |
No |
|
Compute resource protection |
Ransomware protection is not enabled |
Ransomware encrypts business data, which can cause service interruptions, data leaks, and data loss. Configuring ransomware protection reduces these risks. If ransomware protection is purchased but no protection policy is created, the configuration is noncompliant. |
Not supported. |
No |
|
Compute resource protection |
Antivirus is not enabled |
Antivirus scanning removes malicious threats, including ransomware, DDoS trojans, mining malware, backdoors, and worms. If antivirus is purchased for Security Center (Enterprise or Ultimate Edition) but no periodic scanning policy is configured, the configuration is noncompliant. |
This fix configures periodic virus scanning in Security Center. After the fix is applied, scanning runs on all eligible servers according to the configured interval and time window. Virus alerts appear in the security alert module. You should address them promptly to ensure server security. |
No |
|
Compute resource protection |
Container image scanning is not configured |
Images with system or application vulnerabilities, or images that have been replaced with malicious versions, can introduce risks. Scanning images in registries helps security teams push findings to developers for remediation. If image scanning is purchased but no scan scope is configured, the configuration is noncompliant. |
Not supported. |
No |
|
Application runtime protection |
An ESA site does not have WAF managed rules configured |
This check verifies that sites have WAF managed rules to better protect web and API applications. If no managed rules are configured, the configuration is not compliant with Web Application Protection best practices. |
Managed rules are built-in intelligent rules for ESA. They protect against OWASP attacks and emerging origin vulnerabilities. This fix enables the managed rule set for the selected ESA site. Rule availability varies by edition. For more information, see the supported features by edition. |
No |
|
Application runtime protection |
An ESA site does not have WAF custom rules configured |
This check verifies that sites have WAF custom rules to detect and mitigate malicious requests. If no custom rules are configured, the configuration is not compliant with Web Application Protection best practices. |
Not supported. |
No |
|
Application runtime protection |
Application protection configurations are not created |
Runtime detection and protection defend Java applications against zero-day vulnerabilities. If application protection is purchased but no configurations or groups exist, the configuration is noncompliant. |
Not supported. |
No |
|
Application runtime protection |
Web Tamper Protection is not enabled |
Web Tamper Protection monitors website directories or files in real time and restores tampered content from backups. It prevents illegal content injection and ensures site availability. If Web Tamper Protection is purchased but no servers are bound, the configuration is noncompliant. |
Not supported. |
No |
|
Network attack response |
Anti-DDoS Pro or Anti-DDoS Premium exceeds defense thresholds (new in model 3.0) |
If attack traffic exceeds the protection bandwidth of an Anti-DDoS instance, the instance enters black hole mode. All traffic routed through it is blocked, which makes services inaccessible. If the Anti-DDoS instance's IP status is Black Hole Activated, the configuration is noncompliant. |
Not supported. |
No |
|
Network attack response |
An ECS instance exceeds DDoS defense thresholds (new in model 3.0) |
If an ECS instance suffers high-volume DDoS attacks that exceed its defense bandwidth, Alibaba Cloud's black hole policy blocks traffic between the instance and the Internet to prevent broader damage and protect other assets. If an ECS instance that has a public IP address has a DDoS protection status of Black Hole Activated, the configuration is noncompliant. |
Not supported. |
No |
|
Network attack response |
An EIP exceeds DDoS defense thresholds (new in model 3.0) |
If an EIP suffers high-volume DDoS attacks that exceed its defense bandwidth, Alibaba Cloud's black hole policy blocks traffic between the EIP and the Internet to prevent broader damage and protect other assets. If an EIP's DDoS protection status is Black Hole Activated, the configuration is noncompliant. |
Not supported. |
No |
|
Network attack response |
An SLB instance exceeds DDoS defense thresholds (new in model 3.0) |
If an SLB instance suffers high-volume DDoS attacks that exceed its defense bandwidth, Alibaba Cloud's black hole policy blocks traffic between the instance and the Internet to prevent broader damage and protect other assets. If an SLB instance's DDoS protection status is Black Hole Activated, the configuration is noncompliant. |
Not supported. |
No |
|
Network attack response |
Native DDoS protection does not have protected objects added |
After you purchase Native DDoS Protection or Anti-DDoS instances, you must add public IP assets as protected objects to enable DDoS protection. Without this step, the protection is ineffective and the costs are wasted. |
Not supported. |
No |
|
Network attack response |
AI-based intelligent protection for websites is set to Strict mode |
AI-based intelligent protection improves website security. However, Strict mode may cause false positives. Use Strict mode only for websites with poor performance or inadequate protection. Note: Website domains have built-in Layer 4 attack protection. For most websites, use Normal mode to balance protection and business continuity. |
This fix sets AI-based intelligent protection to Block Mode with Normal severity. Anti-DDoS Proxy will automatically generate accurate access control rules and intelligently defend against malicious attacks when threats are detected. |
No |
|
Network access control |
An ECS instance is not prohibited from being bound to a public IP address |
To reduce the risk of attacks, avoid exposing ECS instances directly to the public Internet. Use NAT Gateway or Server Load Balancer instead. If an ECS instance is bound to a public IP address, the configuration is noncompliant. |
Not supported. |
No |
|
Network access control |
An Elasticsearch instance has a public endpoint enabled and no IP address whitelist |
Exposing Elasticsearch to the public Internet poses security risks. It becomes visible to attackers and may suffer data leaks or destruction without proper access controls. As a best practice, allow access only from VPC internal networks and configure appropriate IP whitelists. If a public endpoint is enabled, the configuration is not compliant with best practices. |
Not supported. |
No |
|
Network access control |
An Elasticsearch instance's Kibana service has a public endpoint enabled and no IP address whitelist |
Exposing Kibana to the public Internet poses security risks. It becomes visible to attackers and may suffer data leaks or destruction without proper access controls. As a best practice, allow access only from VPC internal networks and configure appropriate IP whitelists. If Kibana's public endpoint is enabled, the configuration is not compliant with best practices. |
Not supported. |
No |
|
Network access control |
A MongoDB instance has a public endpoint enabled and no IP address whitelist |
Exposing databases to the public Internet poses security risks. They become visible to attackers and may suffer data leaks or destruction without proper access controls. As a best practice, allow access only from VPC internal networks, configure IP whitelists, and use strong credentials. If a public endpoint is enabled, the configuration is noncompliant. |
Not supported. |
No |
|
Network access control |
A PolarDB cluster has a public endpoint configured |
Exposing databases to the public Internet poses security risks. They become visible to attackers and may suffer data leaks or destruction without proper access controls. As a best practice, allow access only from VPC internal networks, configure IP whitelists, and use strong credentials. If a public endpoint is enabled, the configuration is not compliant with best practices. |
Not supported. |
No |
|
Network access control |
An RDS instance has a public endpoint enabled and no IP address whitelist |
Exposing databases to the public Internet poses security risks. They become visible to attackers and may suffer data leaks or destruction without proper access controls. As a best practice, allow access only from VPC internal networks, configure IP whitelists, and use strong credentials. If a public endpoint is enabled, the configuration is not compliant with best practices. |
Not supported. |
No |
|
Network access control |
A Redis instance has a public endpoint configured |
Exposing databases to the public Internet poses security risks. They become visible to attackers and may suffer data leaks or destruction without proper access controls. As a best practice, allow access only from VPC internal networks, configure IP whitelists, and use strong credentials. If a public endpoint is enabled, the configuration is not compliant with best practices. |
Not supported. |
No |
|
Network access control |
A security group inbound rule allows access from 0.0.0.0/0 to any port |
Do not allow all IP addresses (0.0.0.0/0) to access any port. You should restrict access to specific IP ranges and ports. If a security group inbound rule allows access from 0.0.0.0/0 without specifying ports, the configuration is noncompliant. |
Not supported. |
No |
|
Network access control |
A Tablestore instance has public and classic network access enabled |
Tablestore creates a public domain name, a VPC domain name, and a classic network domain name for each instance. Public domain names are accessible from the Internet. Classic network domain names are accessible from ECS instances in the same region. As a best practice, allow access only from the console or a VPC. Restricting public and classic network access improves network isolation and data security. If the network type is set to "Console or VPC only" or "VPC only", the configuration is compliant with best practices. |
Not supported. |
No |
|
Network access control |
An ACK cluster API server has a public endpoint enabled (new in model 3.0) |
Enabling a public endpoint for an ACK cluster increases the risk of attacks on resources such as pods, services, and replication controllers. Do not enable public endpoints. If a public endpoint is enabled, the configuration is noncompliant. |
Not supported. |
No |
|
Network access control |
An ECS launch template network type is set to classic network (new in model 3.0) |
The classic network offers no network-level isolation between users. Multiple tenants share the same IP pool, and users cannot customize network topologies or IP addresses. Vulnerabilities in classic network applications may expose them to other tenants. Virtual Private Cloud (VPC) provides stronger security. For organizations that prioritize data security, VPC is the better choice. If an ECS launch template uses the classic network, the configuration is noncompliant. |
Not supported. |
No |
|
Network access control |
An EMR cluster master node has a public endpoint enabled (new in model 3.0) |
Assigning a public IP address to an EMR master node increases its exposure to attacks. Attackers may scan, infiltrate, or perform other malicious actions, which threatens the entire cluster. If a public endpoint is enabled, the configuration is noncompliant. |
Not supported. |
No |
|
Network access control |
An ESS scaling group's associated security group inbound rule allows access from 0.0.0.0/0 to any port (new in model 3.0) |
When scaling activities are triggered, Auto Scaling creates ECS instances using the scaling configuration. If the inbound rule of the associated security group allows all IP addresses (0.0.0.0/0) on any port, the new instances face security risks. If such a rule exists, the configuration is noncompliant. |
Not supported. |
No |
|
Network access control |
A MaxCompute project does not have an IP address whitelist (new in model 3.0) |
With an IP address whitelist, only listed devices can access the project. Without a whitelist, any device that uses a public endpoint can access the project, which poses exposure risks. If the project allows external network access without an IP address whitelist, the configuration is noncompliant. |
Not supported. |
No |
|
Network access control |
A security group exposes high-risk ports (22, 3389, etc.) to the public Internet |
Do not allow public Internet access to high-risk ports such as SSH (22) and RDP (3389) to prevent attacks and unauthorized access. If such ports are exposed, the configuration is noncompliant. |
Not supported. |
No |
|
Network protection |
A NAT Gateway instance is not fully protected by the NAT Border Firewall |
To reduce the exposure of the private network to the public Internet, all NAT Gateway instances must be protected by Cloud Firewall's NAT Border Firewall. If Cloud Firewall is used but some NAT Gateways lack protection, the configuration is not compliant with network security best practices. |
Not supported. |
No |
|
Network protection |
Inter-VPC traffic is not fully protected by the VPC Border Firewall |
All inter-VPC traffic must pass through Cloud Firewall's VPC Border Firewall to reduce internal traffic risks. If Cloud Firewall is used but some inter-VPC traffic lacks protection, the configuration is not compliant with network security best practices. |
Not supported. |
No |
|
Network protection |
Cloud Firewall's intrusion prevention system (IPS) does not have basic defense enabled |
Enable basic defense in Cloud Firewall's intrusion prevention system (IPS). It provides foundational protection, including blocking brute-force attacks, command execution exploits, and C&C communications. If Cloud Firewall is used but basic defense is disabled, the configuration is not compliant with network security best practices. |
Not supported. |
No |
|
Network protection |
Cloud Firewall's intrusion prevention system (IPS) does not have threat intelligence enabled |
Enable threat intelligence in Cloud Firewall's IPS to scan for threats and block malicious activity. If Cloud Firewall is used but threat intelligence is disabled, the configuration is noncompliant. |
Not supported. |
No |
|
Network protection |
Cloud Firewall's intrusion prevention system (IPS) does not have Block Mode enabled |
Configure Cloud Firewall's IPS in Block Mode to intercept malicious traffic and stop intrusions. If Cloud Firewall is used but Block Mode is disabled, the configuration is not compliant with network security best practices. |
This fix enables Block Mode for the threat engine. The severity defaults to Medium. |
No |
|
Network protection |
Cloud Firewall's intrusion prevention system (IPS) does not have virtual patching enabled |
Enable virtual patching in Cloud Firewall's IPS. It provides real-time protection against critical and emergency vulnerabilities at the network layer, which prevents exploitation without disrupting services. If Cloud Firewall is used but virtual patching is disabled, the configuration is not compliant with network security best practices. |
Not supported. |
No |
|
Network protection |
Cloud Firewall has insufficient available authorizations |
This check verifies that Cloud Firewall's authorization count is sufficient. If Cloud Firewall is used but the number of unprotected public IP assets exceeds the number of available authorizations, the configuration is not compliant with network security best practices. |
Not supported. |
No |
|
Network protection |
Cloud Firewall does not have a default deny policy configured |
For network security, configure a default deny policy (an IPv4 policy with the source and destination set to 0.0.0.0/0 and the action set to Deny). This blocks all traffic except explicitly allowed trusted traffic. If Cloud Firewall is used but no default deny policy is configured, the configuration is not compliant with network security best practices. |
Not supported. |
No |
|
Network protection |
Cloud Firewall does not protect all public IP assets |
This check verifies that all public IP assets are protected by Cloud Firewall. If Cloud Firewall is used but some public IP assets lack Internet Border Firewall protection, the configuration is not compliant with network security best practices. |
Not supported. |
No |
|
Network protection |
Cloud Firewall does not have an access control list (ACL) policy configured |
After you enable the firewall, if no ACL policies are configured, Cloud Firewall defaults to allowing all traffic. You should configure ACL policies to control unauthorized access. If Cloud Firewall is used but no ACL policies exist, the configuration is not compliant with network security best practices. |
Not supported. |
No |
|
Network protection |
Cloud Firewall's protection bandwidth is insufficient |
This check verifies that Cloud Firewall's protection bandwidth is sufficient. If Cloud Firewall is used but the peak bandwidth over the last 30 days exceeds the purchased bandwidth, the configuration is not compliant with network security best practices. |
Not supported. |
No |
|
Network protection |
Cloud Firewall is not used to protect network traffic |
Cloud Firewall is a SaaS-based firewall that provides unified security isolation for Internet borders, VPC borders, and host borders. It is the first line of defense for cloud workloads. If Cloud Firewall is not used, the configuration is not compliant with network security best practices. |
Not supported. |
No |
|
Data access control |
An OSS bucket has public-write permissions enabled |
OSS supports public access using bucket policies and ACLs. Public-write means that anyone can modify or upload objects without permissions or authentication. This risks data leaks and malicious access that can lead to high costs. As a best practice, disable public-write permissions. Access OSS data only using signed URLs or APIs. If the bucket policy or ACL contains public-write semantics, the bucket is at risk and is not compliant with best practices. |
Not supported. |
No |
|
Data access control |
An OSS bucket has anonymous account access rules |
Applying the principle of least privilege reduces security risks and limits the impact of errors or malicious behavior. If an OSS bucket policy allows anonymous access, attackers may exfiltrate data. If external accounts are compromised, your data may be altered or deleted, which threatens its integrity, confidentiality, and business continuity. As a best practice, block anonymous access using policies. If the bucket policy grants access to * (all accounts) with an Allow effect, it is not compliant with best practices. |
Not supported. |
No |
|
Data access control |
An OSS bucket has access rules for accounts outside the organization |
Ensure that OSS buckets are accessible only to internal accounts to prevent data leaks. If the bucket's authorization policy allows external accounts, the configuration is noncompliant. |
Not supported. |
No |
|
Data access control |
An OSS bucket has public-read permissions enabled |
Prevent public read access to OSS bucket contents to ensure data confidentiality and security. If public-read permissions are enabled, the configuration is noncompliant. |
Not supported. |
No |
|
Data-in-transit protection |
An SLB server certificate is at risk of expiring |
Ensure that SLB server certificates do not expire within 15 days to avoid encryption failures. If the remaining validity is 15 days or less, the configuration is noncompliant. |
Not supported. |
No |
|
Data-in-transit protection |
An API in API Gateway with a public endpoint does not have HTTPS configured |
Using only HTTP for public APIs poses data security risks. HTTP transmits data in plaintext, which allows attackers to view sensitive information such as credentials and private data. As a best practice, use HTTPS for public APIs and force HTTP-to-HTTPS redirects to ensure encrypted transmission. If the API Gateway domain does not have HTTPS configured, the configuration is not compliant with best practices. |
Not supported. |
No |
|
Data-in-transit protection |
A CDN SSL certificate is nearing expiration (new in model 3.0) |
Ensure that domain-bound SSL/TLS certificates remain valid to avoid security risks and service disruptions. If the CDN certificate expires in less than 15 days, the configuration is noncompliant. |
Not supported. |
No |
|
Data-in-transit protection |
CDN domains do not have forced HTTP-to-HTTPS redirection configured |
Using only HTTP for CDN domains poses data security risks. HTTP transmits data in plaintext, which allows attackers to view sensitive information such as credentials and private data. As a best practice, use HTTPS for CDN domains and force HTTP-to-HTTPS redirects to ensure encrypted transmission. If the CDN domain's forced redirect type is not set to HTTP→HTTPS, the configuration is not compliant with best practices. |
Not supported. |
No |
|
Data-in-transit protection |
CDN domains do not have HTTPS configured |
Using only HTTP for CDN domains poses data security risks. HTTP transmits data in plaintext, which allows attackers to view sensitive information such as credentials and private data. As a best practice, use HTTPS for CDN domains and force HTTP-to-HTTPS redirects to ensure encrypted transmission. If HTTPS secure acceleration is not enabled for the CDN domain, the configuration is not compliant with best practices. |
Not supported. |
No |
|
Data-in-transit protection |
An Elasticsearch instance does not use HTTPS |
Using only HTTP for Elasticsearch poses data security risks. HTTP transmits data in plaintext, which allows attackers to view sensitive information. As a best practice, use HTTPS to access Elasticsearch from applications or clients to ensure encrypted transmission. If the cluster network setting enables HTTPS, the configuration is compliant with best practices. |
Not supported. |
No |
|
Data-in-transit protection |
An SLB instance does not have HTTPS listeners enabled |
Ensure that Server Load Balancer (SLB) has HTTPS listeners enabled to encrypt data in transit using TLS. If HTTPS listeners are not enabled, the configuration is noncompliant. |
Not supported. |
No |
|
Data-in-transit protection |
A certificate in Certificate Management Service is at risk of expiring |
After an SSL certificate expires, clients cannot verify the server identity, which causes access failures or warnings. Failing to renew a certificate in time may reduce service availability, erode customer trust, or cause data leaks. Allow ample time for renewal to avoid disruptions. If a certificate in Certificate Management Service expires in 15 days or less, the configuration is not compliant with best practices. |
Not supported. |
No |
|
Data-in-transit protection |
An ESA site does not have TLS v1.2 enabled |
This check verifies that sites have TLS v1.2 enabled to improve security. If it is not enabled, the configuration is not compliant with data-in-transit security best practices. |
Not supported. |
No |
|
Data-in-transit protection |
An ESA site does not have HTTP Strict Transport Security (HSTS) enabled |
This check verifies that sites have HSTS enabled to reduce risks from first-visit hijacking. If it is not enabled, the configuration is not compliant with data-in-transit security best practices. |
Not supported. |
No |
|
Data-in-transit protection |
An ESA site does not have forced HTTPS enabled |
This check verifies that sites have forced HTTPS enabled so that HTTP requests from clients to ESA edge nodes are redirected to HTTPS. If it is not enabled, the configuration is not compliant with data-in-transit security best practices. |
Not supported. |
No |
|
Data masking |
Data Security Center does not have sensitive data identification enabled (new in model 3.0) |
Sensitive data includes customer data, technical documents, and personal information. Data Security Center scans databases for sensitive data using predefined rules and hit counts. Supported databases include MaxCompute, OSS, ApsaraDB services (RDS, PolarDB-X, PolarDB, OceanBase, and Tablestore), and self-managed databases. If sensitive data identification is disabled, the configuration is noncompliant. |
Not supported. |
No |
|
Data-at-rest protection |
A PolarDB cluster does not have transparent data encryption (TDE) enabled (new in model 3.0) |
TDE performs real-time I/O encryption and decryption on data files. Data is encrypted before it is written to a disk and decrypted when it is read into memory. If TDE is disabled for PolarDB, data leaks, unauthorized access, or tampering may occur. If TDE is disabled, the configuration is noncompliant. |
Not supported. |
No |
|
Data-at-rest protection |
An RDS instance does not have transparent data encryption (TDE) enabled (new in model 3.0) |
Use TDE for real-time I/O encryption and decryption in scenarios that require security compliance or data-at-rest encryption. TDE encrypts data at the database layer, which prevents attackers from reading sensitive data directly from storage. If TDE is disabled for RDS, the configuration is noncompliant. |
Not supported. |
No |
|
Security incident response and recovery |
Security Center has pending alerts |
Security alerts indicate threats that are detected by Security Center on your servers or cloud products. Alert types include Web Tamper Protection, abnormal processes, web shells, abnormal logons, and malicious processes. Addressing alerts improves your security posture. If pending alerts exist, the configuration is noncompliant. |
Not supported. |
No |
|
Security incident response and recovery |
Security Center is not used for security protection (new in model 3.0) |
Cloud assets face threats such as viruses, cyberattacks, ransomware, and vulnerability exploits. Security Center provides asset management, configuration checks, and proactive defense. You should purchase appropriate security services to build a defense system. If the Security Center edition is Basic or lower, the configuration is noncompliant. |
Not supported. |
No |
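The two security group checks in the table above (access from 0.0.0.0/0 to any port, and high-risk ports such as 22 and 3389 exposed to the public Internet) can be illustrated with a minimal sketch. It is not the logic that the assessment model actually runs. The `InboundRule` class, its field names, the `findings` function, and the `low/high` port range notation (with `-1/-1` meaning all ports) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from ipaddress import ip_network

# High-risk ports named in the check description: SSH (22) and RDP (3389).
HIGH_RISK_PORTS = {22, 3389}
ANY_SOURCE = ip_network("0.0.0.0/0")


@dataclass
class InboundRule:
    """Hypothetical representation of one security group inbound rule."""
    source_cidr: str          # for example "0.0.0.0/0" or "10.0.0.0/8"
    port_range: str           # assumed notation: "22/22", "1/65535", or "-1/-1" for all ports
    policy: str = "accept"    # "accept" or "drop"


def parse_port_range(port_range: str) -> tuple[int, int]:
    """Return (low, high); the assumed "-1/-1" notation expands to every port."""
    low, high = (int(part) for part in port_range.split("/"))
    if low == -1 and high == -1:
        return 1, 65535
    return low, high


def findings(rule: InboundRule) -> list[str]:
    """List the noncompliance findings for a single inbound rule."""
    open_to_world = (
        rule.policy == "accept"
        and ip_network(rule.source_cidr, strict=False) == ANY_SOURCE
    )
    if not open_to_world:
        return []

    low, high = parse_port_range(rule.port_range)
    result = []
    if low == 1 and high == 65535:
        # Access from 0.0.0.0/0 without restricting ports.
        result.append("any-port")
    exposed = sorted(p for p in HIGH_RISK_PORTS if low <= p <= high)
    if exposed:
        result.append("high-risk-ports:" + ",".join(str(p) for p in exposed))
    return result


if __name__ == "__main__":
    for rule in (
        InboundRule("0.0.0.0/0", "-1/-1"),
        InboundRule("0.0.0.0/0", "22/22"),
        InboundRule("0.0.0.0/0", "443/443"),
        InboundRule("10.0.0.0/8", "-1/-1"),
    ):
        print(rule.source_cidr, rule.port_range, findings(rule))
```

In this sketch, a rule that opens port 443 to all addresses produces no finding, because neither check description treats a single ordinary port as noncompliant. A rule restricted to an internal CIDR block also passes regardless of its port range.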
Stability
|
Category |
Check Item |
Description |
Quick Fix Description |
Assisted Decision-Making Support |
|
Instance types |
ACK cluster uses the Basic Edition of managed cluster |
ACK managed clusters are available in Basic Edition and Pro Edition. Compared with the Basic Edition, the Pro Edition offers enhanced reliability, security, and scheduling, which makes it more suitable for running large-scale services in a production environment. A managed cluster that does not use the Pro Edition is considered non-compliant. |
Quick fix is not supported. |
No |
|
Instance types |
ECS instance uses a shared or discontinued instance type |
A shared or discontinued ECS instance type cannot guarantee stable computing performance. An ECS instance that uses a shared or discontinued instance family is considered non-compliant. |
Quick fix is not supported. |
No |
|
Instance types |
Elasticsearch instance uses a development and test instance type |
An Elasticsearch instance with 1 core and 2 GB of memory is suitable only for testing scenarios and not for production environments. Using an Elasticsearch instance with 1 core and 2 GB of memory is considered non-compliant. |
Quick fix is not supported. |
No |
|
Instance types |
MongoDB instance uses a single-node instance type |
When MongoDB adopts a single-node architecture, the fault recovery time is long and there is no SLA guarantee. Using a MongoDB instance that is not multi-zone is considered non-compliant. |
Quick fix is not supported. |
No |
|
Instance types |
RDS instance uses a Basic series instance type |
An RDS Basic series instance has a single database node and no secondary node to serve as a hot standby. If the node fails unexpectedly, or during operations such as an instance restart, configuration change, or version upgrade, the instance can be unavailable for an extended period. In addition, shared and general-purpose instance families share resources with other instances on the same physical host and are suitable only for workloads with low stability requirements. If your business requires high database availability, we recommend the High-availability or Cluster series and the Dedicated instance family. An RDS instance is considered non-compliant if it does not use the High-availability or Cluster series, or if its instance family is not Dedicated. |
Quick fix is not supported. |
No |
|
Instance types |
Redis instance uses an open source edition instance type |
Redis Enterprise Edition provides stronger performance, more data structures, and more flexible storage methods. Not using Redis Enterprise Edition is considered non-compliant. |
Quick fix is not supported. |
No |
|
Instance types |
ApsaraMQ for RocketMQ instance uses a Standard Edition instance type |
The Standard Edition of ApsaraMQ for RocketMQ uses a shared instance and is not recommended for production environments. Using a shared RocketMQ instance is considered non-compliant. |
Quick fix is not supported. |
No |
|
Stable version |
ACK cluster uses an expired Kubernetes version |
The Kubernetes community releases a minor version approximately every 4 months. We recommend using a version that is still under maintenance. Clusters that run an expired version carry security and stability risks: they do not receive the features and bug fixes of newer Kubernetes versions, timely technical support, or fixes for security vulnerabilities. An ACK cluster that runs a version still under maintenance is compliant. |
Quick fix is not supported. |
No |
|
Stable version |
ECS instance uses an expired OS version |
Using an OS version that is no longer supported for an ECS instance is considered non-compliant. |
Quick fix is not supported. |
No |
|
Stable version |
Elasticsearch instance uses a non-recommended version |
An Elasticsearch instance is considered non-compliant if the version it uses is not within the official recommended version range. |
Quick fix is not supported. |
No |
|
Stable version |
MSE engine version is too low |
Using the latest MSE engine version is key to ensuring MSE service continuity. An outdated engine version may cause problems such as code defects that prevent garbage collection from reclaiming memory, memory overflow that causes continuous memory growth, slow startup, and JSON serialization defects. An account is considered non-compliant if the MSE-ZooKeeper or MSE-ANS engine version or the MSE-ANS client version is too low. |
Quick fix is not supported. |
No |
|
Stable version |
MSE-Ingress gateway version is too low |
Using the latest version of Ingress is key to ensuring the service continuity of the gateway. An outdated version may introduce security or stability risks and may cause the instance list obtained when subscribing to Nacos services to be inaccurate. An account is considered non-compliant if the MSE-Ingress version is too low. |
Quick fix is not supported. |
No |
|
Stable version |
RDS instance MySQL database major version is too low (New in Model 3.0) |
Using a MySQL version that has reached or is approaching end of life exposes the system to security risks, performance bottlenecks, compatibility issues, and a lack of technical support. Upgrading to a supported MySQL version in a timely manner provides the latest security patches, performance improvements, and feature enhancements, which reduces O&M risks and improves overall system reliability. An RDS instance is considered non-compliant if it uses version 5.5 or 5.6. |
Quick fix is not supported. |
No |
|
Stable version |
Function Compute (FC) 2.0 function uses a deprecated runtime (New in Model 3.0) |
As runtime versions iterate, Function Compute stops maintaining some runtimes and no longer provides technical support or security updates for them. We recommend migrating functions to the latest supported runtime to obtain technical support and security updates. An FC 2.0 function is considered non-compliant if it uses any of the following runtimes: nodejs12, nodejs10, nodejs8, dotnetcore2.1, python2.7, nodejs6, or nodejs4.4. |
Quick fix is not supported. |
No |
|
Stable version |
ACK cluster inspection finds that the Kubelet component version on a node lags behind the control plane (New in Model 3.0) |
If the Kubelet component version on an ACK cluster node lags behind the control plane, it can cause compatibility issues. The control plane, such as the API Server, may be unable to communicate with an older version of Kubelet due to new features or protocol upgrades. This can lead to abnormal node status, Pod scheduling failures, or nodes being marked as unavailable. In addition, older versions of Kubelet may contain unfixed security vulnerabilities, which increases the risk of node attacks and hinders the cluster's overall upgrade capability. To restore communication stability and eliminate security risks, you must upgrade Kubelet to a compatible version. A node with a Kubelet version that lags behind the control plane is considered non-compliant. |
Quick fix is not supported. |
No |
|
Stable version |
PolarDB cluster database does not use a stable minor version |
A PolarDB database is considered non-compliant if its minor version status is not Stable or Beta. |
This fix enables automatic upgrade of the minor version for the specified instance. When your minor engine version is lower than the latest minor engine version, the system periodically issues active O&M tasks to upgrade the minor engine version. The automatic upgrade is performed within the maintenance window that you set. During the upgrade, the database proxy (PolarProxy) or the kernel engine (DB) is restarted, which may cause a transient disconnection. Perform the upgrade during off-peak hours and ensure that your application has an automatic reconnection mechanism. |
No |
|
Stable version |
RDS instance does not have automatic minor engine version upgrade enabled (New in Model 3.0) |
ApsaraDB RDS supports automatic and manual upgrades of the minor engine version. When the minor engine version is lower than the latest minor engine version, the system periodically issues active O&M tasks to upgrade the minor engine version. The upgraded instance gains the performance improvements, new features, and security fixes of the latest version, which helps keep the database service optimized and secure. An RDS instance is considered non-compliant if it does not have automatic minor engine version upgrade enabled. |
This fix enables automatic minor engine version upgrade for the selected RDS instance. When the minor engine version of the RDS instance is lower than the latest minor engine version, the system periodically issues active O&M tasks to upgrade the minor engine version. The automatic upgrade is performed within the maintenance window that you set. The system sends notifications about upgrade tasks through the channels configured in Message Center, such as text messages and emails. |
No |
|
Stable version |
Redis instance has not been upgraded to the latest minor version |
A Redis instance is considered non-compliant if it has not been upgraded to the latest minor version. |
This fix enables automatic upgrade of the minor version for the selected instance. After the feature is enabled, the system periodically checks for new version releases. If a new version is found, the instance is automatically upgraded within the 60-day upgrade period. During a version upgrade, the system first upgrades the secondary (replica) instance or prepares a new instance. At the specified execution time, a primary/secondary switchover or an instance switchover is performed to complete the upgrade. During the switchover, the instance is read-only for up to 60 seconds while it waits for data to be fully synchronized, and a transient disconnection of a few seconds occurs. Ensure that your application has a reconnection mechanism. |
No |
|
Expiration risks |
AnalyticDB for MySQL Data Warehouse Edition instance is at risk of expiration |
An AnalyticDB for MySQL Data Warehouse Edition instance is considered non-compliant if it is due to expire in less than 7 days from the check time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected ADB subscription instance resource. |
No |
|
Expiration risks |
Anti-DDoS instance is at risk of expiration |
An Anti-DDoS instance is considered non-compliant if it is due to expire in less than 7 days from the current time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected DDoSCOO subscription instance resource. After auto-renewal is enabled, the feature takes effect the next day. Enable auto-renewal at least 2 days before the subscription instance expires. If your instance is due to expire the next day, we recommend you go to the product console to manually renew. The auto-renewal cycle is based on the set auto-renewal duration. For example, if you select a renewal duration of 1 month, the instance is automatically renewed for 1 month before each expiration. Ensure that your account balance, cash coupons, or other payment methods are sufficient to cover the renewal amount. |
No |
|
Expiration risks |
ECS instance is at risk of expiration |
An ECS subscription instance is considered non-compliant if it is due to expire in less than 7 days from the check time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected ECS subscription instance resource. After auto-renewal is enabled, the feature takes effect the next day. Enable auto-renewal at least 2 days before the subscription instance expires. If your instance is due to expire the next day, we recommend you go to the product console to manually renew. The auto-renewal cycle is based on the set auto-renewal duration. For example, if you select a renewal duration of 1 month, the instance is automatically renewed for 1 month before each expiration. Ensure that your account balance, cash coupons, or other payment methods are sufficient to cover the renewal amount. |
No |
|
Expiration risks |
EIP instance is at risk of expiration |
An EIP subscription instance is considered non-compliant if it is due to expire in less than 7 days from the check time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected EIP subscription instance resource. After auto-renewal is enabled, the feature takes effect the next day. Enable auto-renewal at least 2 days before the subscription instance expires. If your instance is due to expire the next day, we recommend you go to the product console to manually renew. The auto-renewal cycle is based on the set auto-renewal duration. For example, if you select a renewal duration of 1 month, the instance is automatically renewed for 1 month before each expiration. Ensure that your account balance, cash coupons, or other payment methods are sufficient to cover the renewal amount. |
No |
|
Expiration risks |
KMS instance is at risk of expiration (New in Model 3.0) |
Ensure timely renewal for KMS subscription instances to avoid business interruptions due to expiration. A KMS subscription instance is considered non-compliant if it is due to expire in less than 7 days from the check time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected KMS subscription instance resource. After auto-renewal is enabled, the feature takes effect the next day. Enable auto-renewal at least 2 days before the subscription instance expires. If your instance is due to expire the next day, we recommend you go to the product console to manually renew. The auto-renewal cycle is based on the set auto-renewal duration. For example, if you select a renewal duration of 1 month, the instance is automatically renewed for 1 month before each expiration. Ensure that your account balance, cash coupons, or other payment methods are sufficient to cover the renewal amount. |
No |
|
Expiration risks |
MongoDB instance is at risk of expiration |
A MongoDB subscription instance is considered non-compliant if it is due to expire in less than 7 days from the check time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected MongoDB subscription instance resource. After auto-renewal is enabled, the feature takes effect the next day. Enable auto-renewal at least 2 days before the subscription instance expires. If your instance is due to expire the next day, we recommend you go to the product console to manually renew. The auto-renewal cycle is based on the set auto-renewal duration. For example, if you select a renewal duration of 1 month, the instance is automatically renewed for 1 month before each expiration. Ensure that your account balance, cash coupons, or other payment methods are sufficient to cover the renewal amount. |
No |
|
Expiration risks |
PolarDB cluster is at risk of expiration |
A PolarDB subscription instance is considered non-compliant if it is due to expire in less than 7 days from the check time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected PolarDB subscription instance resource. After auto-renewal is enabled, the feature takes effect the next day. Enable auto-renewal at least 2 days before the subscription instance expires. If your instance is due to expire the next day, we recommend you go to the product console to manually renew. The auto-renewal cycle is based on the set auto-renewal duration. For example, if you select a renewal duration of 1 month, the instance is automatically renewed for 1 month before each expiration. Ensure that your account balance, cash coupons, or other payment methods are sufficient to cover the renewal amount. |
No |
|
Expiration risks |
PolarDB-X instance is at risk of expiration |
A PolarDB-X 1.0 or PolarDB-X 2.0 instance is considered non-compliant if it is due to expire in less than 7 days from the current time and auto-renewal is not enabled. |
Quick fix is not supported. |
No |
|
Expiration risks |
RDS instance is at risk of expiration |
An RDS subscription instance is considered non-compliant if it is due to expire in less than 7 days from the check time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected RDS subscription instance resource. After auto-renewal is enabled, the feature takes effect the next day. Enable auto-renewal at least 2 days before the subscription instance expires. If your instance is due to expire the next day, we recommend you go to the product console to manually renew. The auto-renewal cycle is based on the set auto-renewal duration. For example, if you select a renewal duration of 1 month, the instance is automatically renewed for 1 month before each expiration. Ensure that your account balance, cash coupons, or other payment methods are sufficient to cover the renewal amount. |
No |
|
Expiration risks |
Redis instance is at risk of expiration |
A Redis subscription instance is considered non-compliant if it is due to expire in less than 7 days from the check time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected Redis subscription instance resource. After auto-renewal is enabled, the feature takes effect the next day. Enable auto-renewal at least 2 days before the subscription instance expires. If your instance is due to expire the next day, we recommend you go to the product console to manually renew. The auto-renewal cycle is based on the set auto-renewal duration. For example, if you select a renewal duration of 1 month, the instance is automatically renewed for 1 month before each expiration. Ensure that your account balance, cash coupons, or other payment methods are sufficient to cover the renewal amount. |
No |
|
Expiration risks |
SLB instance is at risk of expiration |
An SLB subscription instance is considered non-compliant if it is due to expire in less than 7 days from the check time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected SLB subscription instance resource. After auto-renewal is enabled, the feature takes effect the next day. Enable auto-renewal at least 2 days before the subscription instance expires. If your instance is due to expire the next day, we recommend you go to the product console to manually renew. The auto-renewal cycle is based on the set auto-renewal duration. For example, if you select a renewal duration of 1 month, the instance is automatically renewed for 1 month before each expiration. Ensure that your account balance, cash coupons, or other payment methods are sufficient to cover the renewal amount. |
No |
|
Expiration risks |
VPN Gateway is at risk of expiration (New in Model 3.0) |
Ensure timely renewal for VPN Gateway subscription instances to avoid business interruptions due to expiration. A VPN Gateway subscription instance is considered non-compliant if it is due to expire in less than 7 days from the check time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected VPN Gateway subscription instance resource. After auto-renewal is enabled, the feature takes effect the next day. Enable auto-renewal at least 2 days before the subscription instance expires. If your instance is due to expire the next day, we recommend you go to the product console to manually renew. The auto-renewal cycle is based on the set auto-renewal duration. For example, if you select a renewal duration of 1 month, the instance is automatically renewed for 1 month before each expiration. Ensure that your account balance, cash coupons, or other payment methods are sufficient to cover the renewal amount. |
No |
|
Expiration risks |
CEN bandwidth plan is at risk of expiration |
A Cloud Enterprise Network bandwidth plan is considered non-compliant if it is due to expire in less than 7 days from the current time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected CEN subscription instance resource. After auto-renewal is enabled, the feature takes effect the next day. Enable auto-renewal at least 2 days before the subscription instance expires. If your instance is due to expire the next day, we recommend you go to the product console to manually renew. The auto-renewal cycle is based on the set auto-renewal duration. For example, if you select a renewal duration of 1 month, the instance is automatically renewed for 1 month before each expiration. Ensure that your account balance, cash coupons, or other payment methods are sufficient to cover the renewal amount. |
No |
|
Expiration risks |
Shared bandwidth instance is at risk of expiration |
A shared bandwidth instance is considered non-compliant if it is due to expire in less than 7 days from the current time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected CBWP resource. After auto-renewal is enabled, the feature takes effect the next day. Enable auto-renewal at least 2 days before the subscription instance expires. If your instance is due to expire the next day, we recommend you go to the product console to manually renew. The auto-renewal cycle is based on the set auto-renewal duration. For example, if you select a renewal duration of 1 month, the instance is automatically renewed for 1 month before each expiration. Ensure that your account balance, cash coupons, or other payment methods are sufficient to cover the renewal amount. |
No |
|
Expiration risks |
Bastionhost instance is at risk of expiration |
A Bastionhost instance is considered non-compliant if it is due to expire in less than 7 days from the check time and auto-renewal is not enabled. |
This fix enables auto-renewal for your selected Bastionhost subscription instance resource. After auto-renewal is enabled, the feature takes effect the next day. Enable auto-renewal at least 2 days before the subscription instance expires. If your instance is due to expire the next day, we recommend you go to the product console to manually renew. The auto-renewal cycle is based on the set auto-renewal duration. For example, if you select a renewal duration of 1 month, the instance is automatically renewed for 1 month before each expiration. Ensure that your account balance, cash coupons, or other payment methods are sufficient to cover the renewal amount. |
No |
|
Deletion protection |
Deletion protection is not enabled for ACK cluster |
An ACK cluster is considered non-compliant if deletion protection is not enabled. |
This fix enables deletion protection for the selected resource. The resource cannot be released through the console, API, or command line. To release the instance, first go to the instance details page to turn off the deletion protection switch. |
No |
|
Deletion protection |
Deletion protection is not enabled for ALB instance |
An ALB instance is considered non-compliant if deletion protection is not enabled. |
This fix enables deletion protection for the selected resource. The resource cannot be released through the console, API, or command line. To release the instance, first go to the instance details page to turn off the deletion protection switch. |
No |
|
Deletion protection |
Deletion protection is not enabled for EIP instance |
An EIP instance is considered non-compliant if deletion protection is not enabled. |
This fix enables deletion protection for the selected resource. The resource cannot be released through the console, API, or command line. To release the instance, first go to the instance details page to turn off the deletion protection switch. |
No |
|
Deletion protection |
Release protection is not enabled for MongoDB instance |
A MongoDB instance is considered non-compliant if release protection is not enabled. |
This fix enables deletion protection for the selected resource. The resource cannot be released through the console, API, or command line. To release the instance, first go to the instance details page to turn off the deletion protection switch. |
No |
|
Deletion protection |
Cluster lock is not enabled for PolarDB cluster |
A PolarDB cluster is considered non-compliant if cluster lock is not enabled. |
This fix enables deletion protection for the selected resource. The resource cannot be released through the console, API, or command line. To release the instance, first go to the instance details page to turn off the deletion protection switch. |
No |
|
Deletion protection |
Release protection is not enabled for RDS instance |
An RDS instance is considered non-compliant if release protection is not enabled. |
This fix enables deletion protection for the selected resource. The resource cannot be released through the console, API, or command line. To release the instance, first go to the instance details page to turn off the deletion protection switch. |
No |
|
Deletion protection |
Deletion protection is not enabled for SLB instance |
An SLB instance is considered non-compliant if deletion protection is not enabled. |
This fix enables deletion protection for the selected resource. The resource cannot be released through the console, API, or command line. To release the instance, first go to the instance details page to turn off the deletion protection switch. |
No |
|
Risk inspection |
ACK cluster inspection finds that the CLB instance bound to the API Server does not exist (New in Model 3.0) |
If the API Server of an ACK cluster is not bound to a Classic Load Balancer (CLB) instance, the API service has no traffic entry point. External clients such as kubectl cannot reach the API Server through the load balancer, which completely interrupts cluster management. Cluster components such as kubelet and controllers cannot establish stable communication with the API Server, which may result in abnormal node status, Pod scheduling failures, and service unavailability. In addition, the API Server node exposes its IP address directly and loses traffic distribution and failover capabilities, which creates a single point of failure and increases the risk of unauthorized access or DDoS attacks. Create and bind a CLB instance immediately to restore high availability and secure access. An ACK cluster is considered non-compliant if its API Server is not bound to a CLB instance. |
Quick fix is not supported. |
No |
|
Risk inspection |
ACK cluster inspection finds that the CLB instance bound to the API Server is in an abnormal state (New in Model 3.0) |
If the CLB instance bound to the API Server of an ACK cluster is in an abnormal state, traffic forwarding to the API service fails. Clients such as kubectl cannot establish a stable connection, which completely blocks cluster management. Internal components such as kubelet and controllers lose communication with the API Server, which results in abnormal node status, stalled Pod scheduling, and service unavailability. In addition, failed CLB health checks may concentrate traffic on faulty nodes, which increases the risk of a single point of failure. If the abnormal state is accompanied by security misconfigurations, such as missing encryption or exposed ports, it may lead to unauthorized access or man-in-the-middle attacks. Repair the CLB health status immediately and verify the security policy to avoid a cluster outage and data breaches. An ACK cluster is considered non-compliant if the CLB instance bound to its API Server is in an abnormal state. |
Quick fix is not supported. |
No |
|
Risk inspection |
ACK cluster inspection finds that the CLB port listener configuration bound to the API Server is abnormal (New in Model 3.0) |
If the CLB port listener configuration bound to the API Server of an ACK cluster is abnormal, access to the API service is interrupted. Clients such as kubectl cannot connect to the cluster, and O&M operations fail completely. Internal components such as kubelet and controllers cannot communicate with the API Server, which results in abnormal node status, Pod scheduling failures, and service unavailability. If the listener protocol is incorrect or security group restrictions are missing, unauthorized access or traffic hijacking may occur. Repair the listener port configuration immediately, and verify the protocol type and security policy to avoid a cluster outage and data breaches. An ACK cluster is considered non-compliant if the CLB port listener configuration bound to its API Server is abnormal. |
Quick fix is not supported. |
No |
|
Risk inspection |
ACK cluster inspection finds that the number of backend servers for the CoreDNS service is 0 (New in Model 3.0) |
If the number of backend servers for CoreDNS in an ACK cluster is 0, service discovery fails completely. Communication between services in the cluster, such as microservice calls and database access, is interrupted, and applications cannot resolve addresses by service name. This directly affects business availability and puts cluster stability at risk. An ACK cluster is considered non-compliant if the number of backend servers for its CoreDNS service is 0. |
Quick fix is not supported. |
No |
|
Risk inspection |
ALB 5xx error rate is too high (New in Model 3.0) |
If the 5xx error rate of an ALB instance continues to exceed a specified threshold for a period of time, the backend service is frequently experiencing internal errors. Possible causes include application exceptions, insufficient resources, configuration errors, or failures in dependent services. This directly degrades the user experience, increases the risk of business interruption, and affects system stability and availability. An ALB instance is considered non-compliant if its 5xx error rate is greater than or equal to 80% for at least 8 hours within a certain time range in the past. |
Quick fix is not supported. |
No |
|
Risk inspection |
ALB TLS handshake failure rate is too high (New in Model 3.0) |
A high ALB TLS handshake failure rate may indicate problems with encrypted communication between the client and the server, such as certificate configuration errors, incompatible protocol versions, mismatched cipher suites, or a client that uses an unsupported encryption algorithm. This not only causes user access to fail and affects business availability, but may also expose security vulnerabilities and increase the risk of man-in-the-middle attacks. An ALB instance is considered non-compliant if its TLS handshake failure rate is greater than or equal to 80% for at least 8 hours within a certain time range in the past. |
Quick fix is not supported. |
No |
|
Risk inspection |
ALB connection failure rate is too high (New in Model 3.0) |
A high connection failure rate for an Application Load Balancer (ALB) may indicate that the backend service is abnormal, the network is unstable, or the configuration is incorrect. This may lead to user access failure, business interruption, and a decline in user experience. By inspecting the ALB connection failure rate metric, you can promptly discover and locate the root cause of the problem, which improves system availability and stability, optimizes traffic scheduling efficiency, and ensures business continuity and service quality. This brings more reliable cloud application delivery capabilities to customers. An ALB instance is considered non-compliant if its connection failure rate is greater than or equal to 80% for at least 8 hours within a certain time range in the past. |
Quick fix is not supported. |
No |
|
Risk inspection |
OSS origin domain name configuration in CDN domain name is abnormal (New in Model 3.0) |
If the origin domain name configured for a CDN domain name does not exist, resource requests fail and business functions are affected. In addition, failed origin fetches cause CDN to retry continuously, which adds unnecessary network overhead. A CDN domain name is considered non-compliant if its origin information uses an OSS domain name and the status of the corresponding OSS bucket is not "In Use". CDN domain names that do not use an OSS domain name as origin information are outside the detection scope. |
Quick fix is not supported. |
No |
|
Risk inspection |
OSS domain name configured in CNAME record in DNS domain name resolution is abnormal (New in Model 3.0) |
If a CNAME record in DNS domain name resolution is configured with an incorrect OSS domain name, resources accessed through that domain name fail to load, which affects business functions. A DNS domain name is considered non-compliant if one of its CNAME records is configured with an OSS domain name and the corresponding OSS bucket is not "In Use". DNS domain names whose CNAME records do not use an OSS domain name are outside the detection scope. |
Quick fix is not supported. |
No |
|
Risk inspection |
Custom image configured in ECS instance launch template is abnormal (New in Model 3.0) |
An instance launch template lets you create instances quickly, which improves efficiency and user experience. If the custom image configured in the launch template does not exist, execution of the launch template fails. An ECS instance launch template is considered non-compliant if the custom image associated with it is not an "In Use" resource. |
Quick fix is not supported. |
No |
|
Risk inspection |
Load balancer associated with ESS scaling group is abnormal (New in Model 3.0) |
After a scaling group is associated with a load balancer instance, instances in the scaling group are automatically added to the backend servers of the load balancer instance, whether the scaling group creates them automatically or they are added manually. If the load balancer instance or its server group does not exist, the scaling group fails to scale. An Auto Scaling group is considered non-compliant if the Classic Load Balancer or Application Load Balancer associated with it is not an "In Use" resource. |
Quick fix is not supported. |
No |
|
Risk inspection |
Delay between RDS read-only instance and primary instance is too large (New in Model 3.0) |
An RDS read-only instance uses MySQL's native log-based replication technology (asynchronous replication or semi-synchronous replication), so some synchronization latency is inevitable. The latency causes data inconsistency between the read-only instance and the primary instance, which can lead to business problems. In addition, the latency may cause logs to accumulate, which quickly consumes the storage space of the read-only instance. An RDS read-only instance is considered non-compliant if the maximum delay between it and the primary instance exceeds 60 seconds within 7 days. |
Quick fix is not supported. |
No |
|
Risk inspection |
Insufficient number of available IPs in VPC instance (New in Model 3.0) |
Ensure that each VPC vSwitch has enough available IP addresses so that business expansion is not blocked by insufficient resources. A VPC vSwitch is considered non-compliant if the number of available IPv4 addresses is less than or equal to a specified value (10 by default). |
Quick fix is not supported. |
No |
|
Risk inspection |
ECS instance has been shut down due to overdue payment or security ban |
A passive shutdown of an ECS instance causes service interruption, data loss, and data inconsistency, and affects system performance or creates security risks. An account is considered at risk if there is an ECS instance under the current account that has been shut down due to an overdue payment or a security ban. |
Quick fix is not supported. |
No |
|
Risk inspection |
ACK cluster inspection finds that the backend status of the API Server CLB instance is abnormal (New in Model 3.0) |
If the backend status of the API Server CLB instance of an ACK cluster is abnormal, control plane communication is interrupted and cluster management fails completely. Clients such as kubectl cannot access the API Server, so operations such as deploying applications and viewing status cannot be performed. Components such as kubelet and controllers are disconnected from the API Server, which results in abnormal node status, Pod scheduling failures, and failure of automatic recovery mechanisms. As a result, cluster stability collapses and business services are interrupted. In addition, monitoring tools such as Prometheus cannot collect metric data, so anomalies cannot be alerted on or troubleshot promptly. More seriously, long-term unavailability of the API Server may cause the cluster status to become inconsistent with the data stored in etcd, which leads to data loss or operation abnormalities. Check the CLB configuration, backend node health status, and network connectivity immediately to ensure normal traffic distribution and avoid a complete cluster crash. An ACK cluster is considered non-compliant if the backend status of its API Server CLB instance is abnormal. |
Quick fix is not supported. |
No |
|
Risk inspection |
ACK cluster inspection finds that APIService is unavailable (New in Model 3.0) |
If the APIService of an ACK cluster is unavailable, extended API functions fail. Custom resources such as CRDs cannot communicate with the control plane, which causes management abnormalities for components that rely on extended APIs, such as Operators and service meshes. API requests such as resource status updates and configuration delivery fail because of the service interruption, which may cause monitoring data loss, automatic policy failure, or cluster management command errors. If core extended APIs such as Admission Webhooks are affected, the resource creation process is blocked, which increases the risk that cluster operations stall. Restore the APIService urgently to avoid the loss of key functions and data inconsistency. An ACK cluster is considered non-compliant if its APIService is unavailable. |
Quick fix is not supported. |
No |
|
Risk inspection |
ACK cluster inspection finds that the LoadBalancer Service billing method is inconsistent with the actual instance (New in Model 3.0) |
If the billing method of an ACK cluster LoadBalancer Service does not match that of the actual instance, billing abnormalities occur. An instance may be delivered with an unexpected billing method, such as pay-as-you-go instead of the expected subscription, or a resource may be released unexpectedly, such as when a subscription is not renewed upon expiration, which causes service interruption. In addition, disordered resource management interferes with automatic scaling policies, which increases O&M costs and risks. Calibrate the billing method configuration immediately to avoid billing discrepancies and a decline in business availability. An ACK cluster is considered non-compliant if the billing method of its LoadBalancer Service is inconsistent with the actual instance. |
Quick fix is not supported. |
No |
|
Risk inspection |
ACK cluster inspection finds that the LoadBalancer Service certificate instance ID is inconsistent with the actual instance (New in Model 3.0) |
If the certificate instance ID of an ACK cluster LoadBalancer Service does not match the certificate that is actually bound, the TLS configuration fails. HTTPS connections are rejected or trigger security warnings, and user access is interrupted. An invalid certificate may expose unencrypted traffic, which increases the risk of man-in-the-middle attacks. In addition, abnormal health checks may misjudge the status of backend services, which further disrupts traffic distribution. Synchronize the certificate configuration immediately to restore secure communication and service availability. An ACK cluster is considered non-compliant if the certificate instance ID of its LoadBalancer Service is inconsistent with the actual instance. |
Quick fix is not supported. |
No |
|
Risk inspection |
ACK cluster inspection finds an abnormal CoreDNS Pod (New in Model 3.0) |
An abnormal CoreDNS Pod in an ACK cluster makes the DNS resolution service unstable. Communication between services through domain names may time out or fail, which interrupts application calls. An abnormal Pod may trigger the controller to restart it continuously, which increases the load on the control plane and occupies node resources without providing effective services. If the Pod is abnormal because of configuration errors or image vulnerabilities, DNS hijacking or resolution pollution may occur, which leads to service routing errors or data breaches. Check the Pod status and repair the configuration immediately to restore the reliability of the DNS service. An ACK cluster is considered non-compliant if an abnormal CoreDNS Pod exists. |
Quick fix is not supported. |
No |
|
Risk inspection |
ACK cluster inspection finds that the elastic component status is abnormal (New in Model 3.0) |
If the elastic component status of an ACK cluster is abnormal, mechanisms such as automatic scaling and automatic fault recovery fail. The cluster cannot scale out dynamically under high load, which leads to resource bottlenecks and service response delays or interruptions. It also cannot automatically replace faulty nodes or Pods, which increases availability risks. In addition, the cluster cannot optimize resource allocation according to policies, which wastes cost or reduces O&M efficiency. In the long run, key business processes may be blocked. Repair the elastic component status urgently to restore the cluster's adaptive capabilities. An ACK cluster is considered non-compliant if its elastic component status is abnormal. |
Quick fix is not supported. |
No |
|
Risk inspection |
ACK cluster inspection finds that the node pool vSwitch is unavailable (New in Model 3.0) |
If the vSwitch of an ACK cluster node pool is unavailable, network communication between nodes is interrupted. Pods and services cannot interact across nodes, which leads to service discovery failures or stalled data transmission. The control plane loses its connection to worker nodes, and nodes are marked as unavailable, which may trigger erroneous evictions or an abnormal reduction in cluster size. In addition, nodes cannot access external resources such as storage and databases, which disables application functions. The risk of network partitioning increases, which may cause a cluster split-brain or data inconsistency. From an O&M perspective, faults cannot be located promptly because monitoring data is interrupted. Restore the vSwitch service urgently to ensure network connectivity. An ACK cluster is considered non-compliant if its node pool vSwitch is unavailable. |
Quick fix is not supported. |
No |
|
Risk inspection |
ACK cluster inspection finds that the node pool scaling group is unavailable (New in Model 3.0) |
If the scaling group of an ACK cluster node pool is unavailable, the cluster completely loses its automatic scaling capabilities. It cannot scale out dynamically under high load, which leads to node resource exhaustion, Pod scheduling failures, or service response delays, and it cannot scale in under low load, which leaves resources idle and wastes cost. The automatic replacement mechanism for faulty nodes fails, which may leave nodes offline for a long time and increases the risk of a single point of failure in the cluster. In addition, an abnormal scaling group prevents the cluster from responding elastically to traffic bursts or maintenance needs. In the long run, service stability and O&M efficiency decline. Repair the scaling group status immediately to restore the cluster's elastic capabilities. An ACK cluster is considered non-compliant if its node pool scaling group is unavailable. |
Quick fix is not supported. |
No |
|
Risk inspection |
ACK cluster inspection finds that the node pool scaling configuration is unavailable (New in Model 3.0) |
If the scaling configuration of an ACK cluster node pool is unavailable, the cluster cannot automatically adjust the number of nodes. It cannot scale out under high load, which leads to resource exhaustion, Pod scheduling failures, or service interruptions, and it cannot scale in under low load, which wastes resources and drives up costs. In addition, the automatic replacement mechanism for faulty nodes fails, which may leave nodes unavailable for a long time and reduces the high availability of the cluster. In the long run, automatic policies such as Horizontal Pod Autoscaler (HPA) also fail, which leads to an imbalanced cluster state and higher O&M costs. Repair the scaling configuration immediately to restore elastic capabilities. An ACK cluster is considered non-compliant if its node pool scaling configuration is unavailable. |
Quick fix is not supported. |
No |
|
Risk inspection |
ACK cluster inspection finds that the node pool security group is unavailable (New in Model 3.0) |
If the security group of an ACK cluster node pool is unavailable, network access rules fail. Communication between cluster components, such as kubelet and the API Server, and service discovery between Pods may be interrupted because of blocked ports or missing rules. In addition, unauthorized traffic may bypass protection, which increases the risk of node intrusion or DDoS attacks. If outbound rules are abnormal, nodes cannot access external storage, image repositories, or monitoring services, which causes calls to dependent services to fail. Security group failure also causes nodes to be isolated by mistake, which affects Pod scheduling and business continuity. Repair the rule configuration immediately to restore network isolation and communication security. An ACK cluster is considered non-compliant if its node pool security group is unavailable. |
Quick fix is not supported. |
No |
|
Risk inspection |
ALB 4xx error rate is too high (New in Model 3.0) |
If the 4xx error rate of an ALB instance continues to exceed a specified threshold for a period of time, it usually means that many client requests are abnormal, for example because of invalid requests, parameter errors, authentication failures, or excessively frequent access such as DDoS attacks. This not only affects the experience of normal users, but may also expose system interface design defects or security risks. An ALB instance is considered non-compliant if its 4xx error rate is greater than or equal to 80% for at least 8 hours within a certain time range in the past. |
Quick fix is not supported. |
No |
|
Risk inspection |
CEN instance is not configured with VBR health check (New in Model 3.0) |
The health check feature of Cloud Enterprise Network (CEN) detects the connectivity of the Express Connect circuit associated with a virtual border router (VBR) instance. If redundant routes exist between CEN and a data center, the health check enables automatic switching to an available route after an Express Connect circuit failure is detected, which ensures uninterrupted traffic transmission. A CEN instance is considered non-compliant if a VBR associated with it does not have a health check configured. |
Quick fix is not supported. |
No |
|
Risk inspection |
SPF record in DNS domain email resolution is abnormal (New in Model 3.0) |
Sender Policy Framework (SPF) is a DNS-based email validation protocol that defines which mail servers, identified by IP address or domain name, are authorized to send emails on behalf of a domain. When a mail server receives an email, it verifies the sender's IP address against the allowed list in the SPF record to determine whether the email is legitimate. A reasonable and valid SPF value helps prevent email spoofing, reduce the risk of spam, and improve the email delivery rate. The check verifies that, for each MX record in a DNS domain, at least one TXT record exists with a valid SPF value that starts with v=spf1. A DNS domain that does not meet this condition is considered non-compliant. |
Quick fix is not supported. |
No |
|
Risk inspection |
ECS instance has pending O&M events |
Failure to respond to and handle scheduled O&M events for ECS in a timely manner may cause the ECS instance to restart during peak business hours, which affects the stability of the services on the ECS instance. An account is considered at risk if there are pending ECS O&M events with a status of Inquiring, Scheduled, or Executing. |
Quick fix is not supported. |
No |
|
Risk inspection |
OSS bucket is not set with a custom domain name |
Using a custom domain name can enhance brand image and professionalism, and improve stability. A custom domain name can be bound through CNAME to achieve CDN acceleration, which improves access performance. It also supports HTTPS secure access, which enhances data transmission security. An OSS bucket is considered non-compliant if it is not set with a custom domain name. |
Quick fix is not supported. |
No |
|
Risk inspection |
Data replication for RDS for PostgreSQL instance does not use synchronous or semi-synchronous mode (New in Model 3.0) |
RDS for PostgreSQL supports three data replication modes: asynchronous, synchronous, and semi-synchronous. The asynchronous mode has the fastest response speed, but is suitable only for scenarios with low data persistence requirements. Data loss may occur when the database crashes, which poses a persistence risk. An RDS for PostgreSQL instance is considered non-compliant if it uses the asynchronous mode for data replication (synchronous_commit parameter is off). |
This fix changes the data replication mode of the RDS for PostgreSQL database to semi-synchronous or synchronous mode. The synchronous mode provides the maximum protection level and is suitable for scenarios with extremely high data persistence requirements, but the response speed is slow. The semi-synchronous mode provides the highest availability protection level, which balances data persistence and response speed. To change the data replication mode to semi-synchronous, the instance kernel version must be 20220228 or later. The action to modify the parameter is executed within the maintenance window set for the instance. |
No |
|
Risk inspection |
Connection usage of Redis instance is too high (New in Model 3.0) |
If the connection usage of a Redis instance continues to exceed a specified threshold for a period of time, it indicates that the current connection resources are close to or have reached their limit. This may cause new clients to be unable to establish connections, requests to be rejected, or response latency to increase, which affects business performance and stability. This situation may also indicate problems such as connection leaks, unreasonable connection pool configurations, or sudden traffic pressure. A Redis instance is considered non-compliant if its average connection usage is greater than or equal to 50% for at least 8 hours within a certain time range in the past. |
Quick fix is not supported. |
No |
|
Data backup and snapshots |
Log backup is not enabled for AnalyticDB for MySQL instance |
An AnalyticDB for MySQL cluster is considered non-compliant if log backup is not enabled. |
This fix enables log backup for the selected AnalyticDB for MySQL cluster, with a default storage period of 7 days. |
No |
|
Data backup and snapshots |
No available data backup set for AnalyticDB for PostgreSQL instance (New in Model 3.0) |
The data backup check for AnalyticDB for PostgreSQL ensures that the instance has an available backup set, which prevents business interruptions caused by data loss or misoperation. Periodically checking the backup policy and backup status improves data security and recovery capabilities. A running, non-Serverless AnalyticDB for PostgreSQL storage instance is considered non-compliant if it has no available data backup set within a specified number of hours in the past (default: 168 hours, or 7 days). |
After data backup is enabled, the AnalyticDB for PostgreSQL instance backs up data according to its backup configuration and generates an available backup set. AnalyticDB for PostgreSQL can then restore data to a new instance at a historical point in time by using a complete base backup and continuous log backups, which protects the data as of that point in time. |
No |
|
Data backup and snapshots |
Data backup protection for ECS instance is at risk (New in Model 3.0) |
Select snapshot and backup solutions that match each scenario, such as daily data protection, protection during high-risk operations, regional disaster protection, and full machine recovery. Otherwise, data may be unrecoverable, and an incomplete backup solution may also leave the recoverability and recovery efficiency of core files below expectations. An ECS instance is considered non-compliant if neither of the following backup solutions is enabled: 1. Cloud disk snapshots. 2. The "File/Self-managed Database Backup" backup solution. |
Quick fix is not supported. |
No |
|
Data backup and snapshots |
Automatic snapshot policy is not set for ECS disk |
An ECS disk is considered non-compliant if an automatic snapshot policy is not set for it. |
This fix enables the specified snapshot policy for the selected ECS disk instance. Because snapshot policies are independent in each region, if a policy with the same name exists in the region where the selected disk is located, the existing policy is used. Otherwise, a new snapshot policy is created. |
No |
|
Data backup and snapshots |
Log backup is not enabled for MongoDB instance |
A MongoDB instance is considered non-compliant if log backup is not enabled. |
This fix enables log backup for the selected MongoDB cluster, with a default storage period of 7 days. |
No |
|
Data backup and snapshots |
Data backup policy for NAS file system is at risk (New in Model 3.0) |
If the NAS recycle bin and Cloud Backup are not enabled, you cannot promptly recover files that are accidentally deleted or tampered with. If cross-region replication is not enabled for the backup vault, multi-version geo-redundancy cannot be achieved and data cannot be restored in a different location, which seriously affects business continuity. A NAS file system is considered non-compliant if neither of the following backup solutions is enabled: 1. NAS recycle bin. 2. NAS backup. |
This fix enables the NAS recycle bin feature. To avoid business disruption or permanent data loss caused by accidental deletion of files in a General-purpose NAS file system, we recommend that you enable the recycle bin feature. After the recycle bin is enabled, deleted files or directories are temporarily stored in the recycle bin and are permanently deleted after the specified retention period. You can restore these files and their metadata, such as UID, GID, and ACL, during the retention period. |
No |
|
Data backup and snapshots |
Data backup policy for OSS bucket is at risk |
Data at the bucket level should be protected. If versioning is not enabled, previous versions of overwritten or deleted data may not be retained, and if a problem occurs, an object stored in the bucket cannot be restored to a specific point in time. Likewise, if cross-region replication is not enabled, operations under the same or different accounts are not synchronized to another region, so a disaster or failure can seriously harm business continuity. An OSS bucket is considered non-compliant if it does not have at least one of the following backup solutions enabled: 1. OSS versioning. 2. OSS bucket backup. |
This fix enables versioning for the selected OSS bucket. After versioning is enabled, data that is overwritten or deleted is saved as a historical version. If you accidentally overwrite or delete an object, you can restore the object stored in the bucket to any historical version at any time. |
No |
|
Data backup and snapshots |
Level-2 backup is not enabled for PolarDB cluster |
A PolarDB cluster is considered non-compliant if level-2 backup is not enabled or if its level-2 backup retention period is less than 30 days. |
This fix sets the level-2 backup cycle and level-2 backup retention period (default is 30 days) for the selected PolarDB cluster. If level-2 backup is not currently enabled, it is automatically enabled. |
No |
|
Data backup and snapshots |
Log backup is not enabled for RDS instance |
An RDS instance is considered non-compliant if log backup is not enabled. |
This fix enables log backup for the selected RDS instance, with a default storage period of 7 days. |
No |
|
Data backup and snapshots |
Data backup policy for Tablestore instance is at risk (New in Model 3.0) |
If Tablestore backup and cross-region backup are not enabled, important data cannot be quickly restored in a simple, efficient, safe, and reliable way. Once a failure occurs, business continuity is seriously affected. A Tablestore instance is considered non-compliant if a backup solution is not enabled. |
Quick fix is not supported. |
No |
|
Data backup and snapshots |
ECI elastic instance pod has no data volume mounted |
An ECI elastic instance pod is considered non-compliant if it has no data volume mounted. |
Quick fix is not supported. |
No |
|
Data backup and snapshots |
Automatic backup is not enabled for Elasticsearch instance |
An Elasticsearch instance is considered non-compliant if automatic backup is not enabled. |
This fix enables the automatic backup feature for the selected Elasticsearch instance. The system automatically backs up data according to the configured backup cycle and time. If data is accidentally deleted or the application has a logic error, you can use the automatic backup recovery feature to restore the backup data from a specific point in time to the original Elasticsearch instance. Note that automatic backups retain snapshot data for only the last 7 days, and automatic backup data can be restored only to the original cluster. |
No |
|
Data backup and snapshots |
Incremental backup is not set for Redis instance |
A Redis instance (Tair Enterprise Edition) is considered non-compliant if incremental backup is not enabled. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
ACK cluster has a single-zone deployment risk |
Using a regional cluster provides cross-zone disaster recovery capabilities. An ACK cluster is compliant if it is a regional cluster with nodes distributed in 3 or more zones. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
ALB instance has a single-zone deployment risk |
If only one zone is selected, a failure of that zone affects the ALB instance and, in turn, business stability. An ALB instance is compliant if it is a multi-zone instance. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Resources mounted to ALB server group are all in a single zone |
Adding resources from multiple zones to an ALB server group ensures that even if one zone fails, the application can still run in other zones, which provides better fault tolerance. An ALB server group is compliant if the resources mounted to it are distributed across multiple zones. This rule does not apply if the ALB server group has no resources mounted or if the server group is of the IP or Function Compute type. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
API Gateway instance has a single-zone deployment risk |
We recommend using a multi-zone API Gateway instance, which has multi-zone disaster recovery capabilities. A gateway instance is compliant if it is multi-zone. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Distribution of ECS instances within a region is unbalanced across zones (New in Model 3.0) |
Deploying all ECS instances in the same zone poses a single point of failure risk. When that zone fails due to hardware damage, network interruption, or other issues, all ECS instances in that region become unavailable at the same time, which leads to business interruption. An account is considered non-compliant if all ECS instances in the same region are deployed in the same zone. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Flink instance does not use a cross-zone CU type |
We recommend enabling cross-zone for Flink CUs, which provides multi-zone disaster recovery capabilities. A Flink instance is considered non-compliant if it does not use a multi-zone CU. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
GWLB instance has a single-zone deployment risk |
We recommend enabling multi-zone for GWLB instances, which provides multi-zone disaster recovery capabilities. A Gateway Load Balancer instance is considered non-compliant if it is not multi-zone. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Resources mounted to GWLB server group are all in a single zone (New in Model 3.0) |
Mounting resources in a multi-zone server group can improve the system's disaster recovery capabilities and reduce the risk of business interruption. An account is considered non-compliant if a GWLB instance is single-zone, or if a server group used by a listener under a GWLB instance does not have resources from multiple zones added. This rule does not apply when there are no resources in the server group or the resource type is IP. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
MSE related components have a single-zone deployment risk |
We recommend adopting a multi-zone deployment architecture for MSE related components to improve their stability. An MSE related component is considered non-compliant if it is deployed in a single zone. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
MSE gateway has a single-zone deployment risk |
All instance replicas of the current gateway are deployed in the same zone. This deployment does not provide high availability, and your business may be affected in extreme cases. Upgrade to the new version as soon as possible to distribute the gateway instances across multiple zones. An MSE Ingress gateway component is considered non-compliant if it has a single-zone architecture. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
MongoDB instance has a single-zone deployment risk |
Using a MongoDB instance that is not multi-zone is considered non-compliant. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
NLB instance has a single-zone deployment risk |
For Network Load Balancer instances, we strongly recommend configuring multiple zones to provide multi-zone disaster recovery. Using a single-zone Network Load Balancer instance is considered non-compliant. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Resources mounted to NLB server group are all in a single zone |
We recommend adding resources from multiple zones to a Network Load Balancer server group, which provides multi-zone disaster recovery capabilities. A Network Load Balancer server group is compliant if its resources are distributed across multiple zones. This does not apply if there are no resources in the server group or the resource type is IP. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Zone-redundant storage is not enabled for OSS bucket |
An OSS bucket is considered non-compliant if zone-redundant storage is not enabled. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Hot standby storage cluster is not enabled for PolarDB cluster |
A PolarDB cluster is considered non-compliant if a hot standby storage cluster is not enabled and the data is distributed in a single zone. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
PrivateLink endpoint service has a single-zone deployment risk |
Configuring multiple zones for an endpoint service can greatly reduce the risk of service interruption, distribute traffic more evenly to avoid overloading a single zone, and provide nearest access, which reduces network latency and improves access speed. Configuring multiple zones for an endpoint service is compliant. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
RDS instance has a single-zone deployment risk |
Using an RDS instance that is not multi-zone is considered non-compliant. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Redis instance has a single-zone deployment risk |
Using a Redis instance that is not multi-zone is considered non-compliant. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
SLB instance and its listener's server group have a single point of deployment risk |
An account is considered non-compliant if an SLB instance is single-zone, or if a server group used by a listener under an SLB instance does not have resources from multiple zones added. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
ApsaraMQ for Kafka instance has a single-zone deployment risk |
If you use a Professional Edition instance that was deployed in a single zone, you can upgrade the cluster to a multi-zone deployment by editing the secondary zone, which enhances the cluster's disaster recovery capabilities. An ApsaraMQ for Kafka instance is compliant if it is multi-zone. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Transit router has a single-zone deployment risk |
For existing transit routers, we strongly recommend configuring multiple zones to provide multi-zone disaster recovery. A transit router is considered non-compliant if its VPC connection has a vSwitch in only one zone. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
AnalyticDB for PostgreSQL instance has a single-zone deployment risk |
We recommend enabling cross-zone disaster recovery for AnalyticDB for PostgreSQL instances. When the primary zone fails, the system automatically switches the secondary zone node to the primary node to continue providing services and ensure business continuity. An AnalyticDB for PostgreSQL instance is considered non-compliant if it does not have cross-zone disaster recovery enabled. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
ClickHouse cluster has a single-zone deployment risk |
We recommend using a multi-zone ClickHouse cluster instance, which has multi-zone disaster recovery capabilities. A multi-zone ClickHouse cluster instance is compliant. Currently, only the community version is checked for multi-zone architecture. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
HBase instance has a single-zone deployment risk |
We recommend adopting a multi-zone deployment architecture, which has higher disaster recovery capabilities. An HBase instance is considered non-compliant if it does not adopt a multi-zone deployment. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Lindorm instance has a single-zone deployment risk |
We recommend deploying Lindorm instances in multiple zones. Multi-zone instances have higher disaster recovery capabilities. At the same time, Lindorm instances can achieve strong consistency of data between multiple zones, and can also return the fastest result under eventual data consistency, which improves the service quality of online businesses. A Lindorm instance is considered non-compliant if it does not adopt a multi-zone deployment. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
PolarDB-X 2.0 instance has a single-zone deployment risk |
We recommend using a multi-zone PolarDB-X 2.0 instance, which has multi-zone disaster recovery capabilities. A PolarDB-X 2.0 instance with a multi-zone architecture is compliant. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Tablestore instance has a single-zone deployment risk |
A Tablestore instance is considered non-compliant if it does not use a multi-zone deployment. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
ACK cluster inspection finds that CoreDNS has only one replica (New in Model 3.0) |
If CoreDNS in an ACK cluster runs with only a single replica, it has no high availability. When the Pod fails, the DNS service is completely interrupted, which causes service domain name resolution within the cluster to fail and interrupts communication between applications. A single-replica architecture cannot tolerate node failures or maintenance operations: service may be briefly interrupted during upgrades or restarts, and the risk grows the longer the cluster runs this way. Increase the number of replicas immediately to ensure service redundancy and stability. An ACK cluster is considered non-compliant if its CoreDNS has only one replica. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Zone-redundant storage is not enabled for the OSS bucket associated with ACR |
We recommend using an Enterprise Edition ACR instance and associating it with an OSS bucket that has zone-redundant storage enabled. An ACR instance is considered non-compliant if it is associated with an OSS bucket that uses locally redundant storage. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
ACS cluster has a single-zone deployment risk |
We recommend using a regional multi-zone ACS cluster, which has multi-zone disaster recovery capabilities. An ACS cluster is compliant if it is a regional cluster with nodes distributed in 3 or more zones. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Elasticsearch instance has a single-zone deployment risk |
Using an Elasticsearch instance that is not multi-zone is considered non-compliant. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
SLS Project does not use zone-redundant storage |
Simple Log Service provides two storage redundancy types: locally redundant storage and zone-redundant storage. These cover data redundancy mechanisms from a single zone to multiple zones to ensure data persistence and availability. A log project is compliant if it uses zone-redundant storage. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
VPN Gateway has a single-zone deployment risk |
For existing single-tunnel instances, we strongly recommend that you enable AZ high availability in the console and configure dual tunnels to connect to the peer. A VPN gateway is considered non-compliant if it is a single-tunnel instance. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
VPN Gateway does not use dual-tunnel mode |
A dual-tunnel mode IPsec-VPN connection has a primary and a secondary tunnel. If the primary tunnel fails, traffic can be transmitted through the secondary tunnel, which improves the high availability of the IPsec-VPN connection. A dual-tunnel VPN Gateway is compliant if both primary and secondary tunnels are connected to the peer. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
Bastionhost has a single-zone deployment risk |
We recommend using the Enterprise Dual-Engine or SM-compliant Edition of Bastionhost to obtain multi-zone disaster recovery capabilities. Using the Basic Edition of Bastionhost is considered non-compliant. |
Quick fix is not supported. |
No |
|
Multi-zone architecture |
ApsaraMQ for RocketMQ instance does not use the High-availability Cluster Edition |
We recommend using the High-availability Cluster Edition, which has multi-zone disaster recovery capabilities. An ApsaraMQ for RocketMQ 5.0 instance is considered non-compliant if it is not multi-zone. |
Quick fix is not supported. |
No |
|
Cluster architecture |
ESS scaling group is associated with only a single vSwitch |
By associating multiple vSwitches, a scaling group can improve the overall robustness, reliability, and performance of the application, which helps it better meet business requirements. If one vSwitch is inaccessible due to network problems or other conditions, user traffic can still access the application through other vSwitches. A scaling group is compliant if it is associated with at least two vSwitches. |
Quick fix is not supported. |
No |
|
Cluster architecture |
MSE related components have a single point of deployment risk |
For the MSE ZooKeeper and Nacos-ANS components, we recommend scaling out to 3 or more nodes. An MSE related component is considered non-compliant if it is deployed on a single node. |
Quick fix is not supported. |
No |
|
Cluster architecture |
MSE gateway has a single point of deployment risk |
A single-node instance has an architectural risk. A single point of failure causes the service to be unavailable. We recommend scaling out to 2 or more nodes. An MSE Ingress component is considered non-compliant if it is deployed on a single node. |
Quick fix is not supported. |
No |
|
Cluster architecture |
PolarDB instance has a single point of deployment risk |
A PolarDB instance is considered non-compliant if it does not use the Cluster Edition or Multi-master Cluster Edition. |
Quick fix is not supported. |
No |
|
Cluster architecture |
Automatic primary/secondary switchover is not enabled for RDS instance (New in Model 3.0) |
When the primary node of an instance is abnormal and cannot be used, or when there is a potential risk in the instance and an emergency repair has been performed on the secondary node, RDS automatically triggers a primary/secondary switchover, which swaps the primary and secondary nodes. After the switchover, the instance endpoint remains unchanged, and the application automatically connects to the new primary node (the original secondary node), which ensures the high availability of the instance. An RDS instance is considered non-compliant if the automatic primary/secondary switchover feature is not enabled. |
This fix enables the automatic primary/secondary switchover feature for the RDS instance. When the primary node of the instance is abnormal and cannot be used, or when there is a potential risk in the instance and an emergency repair has been performed on the secondary node, RDS automatically triggers a primary/secondary switchover, which swaps the primary and secondary nodes. After the switchover, the instance endpoint remains unchanged, and the application automatically connects to the new primary node (the original secondary node), which ensures the high availability of the instance. |
No |
|
Cluster architecture |
Primary and secondary nodes of RDS cluster are not configured with the same instance size (New in Model 3.0) |
If the primary and secondary nodes of an RDS cluster are not configured with the same instance size, it may cause the secondary node to be unable to take over smoothly when the primary node fails, which leads to performance bottlenecks or service interruptions. In addition, different instance specifications may lead to resource mismatches, which affects data synchronization efficiency and recovery speed, and reduces the high availability and disaster recovery capabilities of the system. Detecting and ensuring that the primary and secondary node instance sizes are consistent helps to improve system stability, enhance failover capabilities, and ensure business continuity. This brings higher reliability and O&M controllability to customers. An RDS cluster is considered non-compliant if its primary and secondary nodes are configured with different instance sizes. |
Quick fix is not supported. |
No |
|
Cluster architecture |
Primary and secondary nodes of RDS cluster are not configured with the same instance type (New in Model 3.0) |
If the primary and secondary nodes of an RDS cluster are not configured with the same instance type, it may cause the secondary node to be unable to take over smoothly when the primary node fails, which leads to performance bottlenecks or service interruptions. In addition, different instance specifications may lead to resource mismatches, which affects data synchronization efficiency and recovery speed, and reduces the high availability and disaster recovery capabilities of the system. Detecting and ensuring that the primary and secondary node instance types are consistent helps to improve system stability, enhance failover capabilities, and ensure business continuity. This brings higher reliability and O&M controllability to customers. An RDS cluster is considered non-compliant if its primary and secondary nodes are configured with different instance types. |
Quick fix is not supported. |
No |
|
Cluster architecture |
High-reliability mode is not used for Express Connect (New in Model 3.0) |
Use the high-reliability mode of Express Connect to create two access points in the same region to achieve network redundancy, ensure the stability and reliability of data transmission, and meet compliance requirements. An Express Connect circuit is considered non-compliant if it has fewer than 2 access points in the same region. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Insufficient remaining storage capacity for RDS instance (New in Model 3.0) |
Insufficient remaining storage capacity for an RDS instance may lead to database write failures, performance degradation, and even service interruptions or data loss. Expand capacity or clean up data in a timely manner to avoid business anomalies and keep the database stable. An RDS instance is considered non-compliant if its remaining storage capacity is below 10% for any 1-hour period within 7 days. |
This fix enables the automatic storage expansion feature for the RDS instance. After this is enabled, the storage space is automatically expanded when it reaches the threshold. The instance does not need to be restarted during the expansion, and there is no impact on the business. |
No |
|
Quota and capacity |
ACK API calls are throttled |
API calls are throttled, which leads to call failures and may affect business stability. An account is considered non-compliant if there have been throttling exceptions for API calls in the last 7 days. |
Quick fix is not supported. |
No |
|
Quota and capacity |
ALB API calls are throttled |
API calls are throttled, which leads to call failures and may affect business stability. An account is considered non-compliant if there have been throttling exceptions for API calls in the last 7 days. |
Quick fix is not supported. |
No |
|
Quota and capacity |
CDN API calls are throttled |
API calls are throttled, which leads to call failures and may affect business stability. An account is considered non-compliant if there have been throttling exceptions for API calls in the last 7 days. |
Quick fix is not supported. |
No |
|
Quota and capacity |
ECS API calls are throttled |
API calls are throttled, which leads to call failures and may affect business stability. An account is considered non-compliant if there have been throttling exceptions for API calls in the last 7 days. |
Quick fix is not supported. |
No |
|
Quota and capacity |
NAS API calls are throttled |
API calls are throttled, which leads to call failures and may affect business stability. An account is considered non-compliant if there have been throttling exceptions for API calls in the last 7 days. |
Quick fix is not supported. |
No |
|
Quota and capacity |
PolarDB API calls are throttled |
API calls are throttled, which leads to call failures and may affect business stability. An account is considered non-compliant if there have been throttling exceptions for API calls in the last 7 days. |
Quick fix is not supported. |
No |
|
Quota and capacity |
RDS API calls are throttled |
API calls are throttled, which leads to call failures and may affect business stability. An account is considered non-compliant if there have been throttling exceptions for API calls in the last 7 days. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Redis API calls are throttled |
API calls are throttled, which leads to call failures and may affect business stability. An account is considered non-compliant if there have been throttling exceptions for API calls in the last 7 days. |
Quick fix is not supported. |
No |
|
Quota and capacity |
RocketMQ API calls are throttled |
API calls are throttled, which leads to call failures and may affect business stability. An account is considered non-compliant if there have been throttling exceptions for API calls in the last 7 days. |
Quick fix is not supported. |
No |
|
Quota and capacity |
SLB API calls are throttled |
API calls are throttled, which leads to call failures and may affect business stability. An account is considered non-compliant if there have been throttling exceptions for API calls in the last 7 days. |
Quick fix is not supported. |
No |
|
Quota and capacity |
VPC API calls are throttled |
API calls are throttled, which leads to call failures and may affect business stability. An account is considered non-compliant if there have been throttling exceptions for API calls in the last 7 days. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Cloud Enterprise Network (CEN) API calls are throttled |
API calls are throttled, which leads to call failures and may affect business stability. An account is considered non-compliant if there have been throttling exceptions for API calls in the last 7 days. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the total number of ACK clusters is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the number of ACK clusters reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the total number of ALB instances is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the total number of ALB instances reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the number of CDN URL refreshes is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the number of CDN URL refreshes reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the number of accelerated domain names supported by CDN is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the number of accelerated domain names supported by CDN reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the number of CDN directory refreshes is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the number of CDN directory refreshes reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the number of CDN prefetch items is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the number of CDN prefetch items reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the total number of EBS cloud disks is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the total number of EBS cloud disks reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for vCPUs of ECS subscription instances is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to vCPUs of ECS subscription instances reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for vCPUs of ECS spot instances is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to vCPUs of ECS spot instances reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for vCPUs of ECS pay-as-you-go instances is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to vCPUs of ECS pay-as-you-go instances reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the total number of EIPs is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the total number of EIPs reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the total number of ESS scaling groups is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the total number of ESS scaling groups reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
MSE-related components are at capacity risk |
Keep resource capacity within a reasonable range, because exceeding a capacity limit can cause stability risks. An MSE instance is considered non-compliant if any of its related metrics exceeds its capacity limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the number of SNAT entries that can be retained in a NAT Gateway is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the number of SNAT entries that can be retained in a NAT Gateway reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the number of EIPs that can be bound to a NAT Gateway is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the number of EIPs that can be bound to a NAT Gateway reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the total number of NLB instances is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the total number of NLB instances reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for RDS on-demand instances is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to RDS on-demand instances reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the total number of ROS stacks is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the total number of ROS stacks reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the number of listeners retained by an SLB instance is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the number of listeners retained by an SLB instance reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the number of servers that can be mounted to the backend of an SLB instance is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the number of servers that can be mounted to the backend of an SLB instance reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the total number of SLB instances is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the total number of SLB instances reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the total number of security groups is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the total number of security groups reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the total number of elastic network interfaces is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the total number of elastic network interfaces (ENIs) reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Quota and capacity |
Quota usage for the total number of deployment sets is approaching the upper limit (New in Model 3.0) |
Insufficient resource quota may restrict the creation, change, or expansion operations of product resources. An account is considered non-compliant when the quota item related to the total number of deployment sets reaches 80% of its upper limit. |
Quick fix is not supported. |
No |
|
Monitoring management |
Monitoring alert rules are not set for cloud product resources |
Full monitoring coverage of resources is the foundation of business continuity, and alert rules are the means of monitoring cloud product resources. An account is considered non-compliant if any cloud product resource is not covered by an alert rule. |
This fix automatically enables alert rules based on best practices for cloud resource types that are not yet configured in Cloud Monitor. By default, notifications are sent to message recipients of the "Alibaba Cloud Account Alert Contact" type. Confirm that these settings are correct. After the rules are enabled, you can view their status or update the alert parameters in the one-click alerting feature of Cloud Monitor. |
No |
|
Monitoring management |
High-priority alert rules are not configured in ARMS |
Effective alert rules ensure that you are notified promptly when the business system deviates from its expected operating conditions, so that you can respond in time. An account is considered non-compliant if no P1-level alert rules are configured for application monitoring or Prometheus monitoring in Alibaba Cloud ARMS, or if no corresponding notification policy is configured. |
Quick fix is not supported. |
No |
|
Monitoring management |
High-priority alerts in ARMS are not handled promptly |
MTTx metrics (mean time to x, such as MTTR, mean time to recovery) are an important measure of alert handling efficiency. Responding promptly to high-priority alerts speeds up recovery from alerts and faults, which improves the service quality of the business system. An account is considered non-compliant if no P1-level alert rules are configured for application monitoring or Prometheus monitoring, or if any alert in Alibaba Cloud ARMS was not resolved within 30 minutes (that is, it is pending claim, in progress, or was resolved after more than 30 minutes). |
Quick fix is not supported. |
No |
|
Monitoring management |
Alert rules with continuous alerts are not handled promptly |
An alert rule that stays in an alert state for a long time needs attention and governance. Either resolve the underlying problem promptly so that the monitored metric returns to a normal level, or adjust the alert rule to match the actual situation, so that excessive alert messages and alert fatigue do not interfere with normal monitoring and O&M work. An account is considered non-compliant if any alert rule in Cloud Monitor has been continuously in an alert state for more than 24 hours. |
Quick fix is not supported. |
No |
|
Monitoring management |
Prometheus monitoring is not configured for ACK cluster |
Connecting an ACK cluster to monitoring can help development and O&M personnel view the running status of the system, including the infrastructure layer, container performance layer, and more. An ACK cluster is considered non-compliant if "Enable Alibaba Cloud Prometheus Monitoring" is not configured. |
Quick fix is not supported. |
No |
|
Monitoring management |
Application monitoring is not configured for ACK cluster |
For distributed and microservice-based applications, you can connect to ARMS Application Monitoring for full-link tracing and code-level real-time performance monitoring to help O&M personnel keep track of application health at all times. An application is considered non-compliant if it is deployed in ACK or ECS but not connected to ARMS Application Monitoring. |
Quick fix is not supported. |
No |
|
Monitoring management |
Recommend unified monitoring of resources across Alibaba Cloud accounts for ARMS |
By creating a global aggregation instance, you can achieve unified monitoring across accounts. An account is considered non-compliant if it is not using ARMS and has not created a GlobalView instance. |
Quick fix is not supported. |
No |
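The quota checks in the table above share one rule: an account is non-compliant when a quota item reaches 80% of its upper limit. The following Python sketch illustrates that rule only. The `QuotaItem` record, its field names, and the sample values are hypothetical and do not correspond to any Alibaba Cloud API, and skipping items with a zero limit is an assumption.

```python
from dataclasses import dataclass


@dataclass
class QuotaItem:
    """Hypothetical record of one quota item; field names are illustrative."""
    name: str
    used: float
    limit: float


def is_quota_noncompliant(item: QuotaItem, threshold: float = 0.80) -> bool:
    """Return True when usage has reached the threshold share of the limit."""
    if item.limit <= 0:
        # Assumption: an item without a positive limit cannot be evaluated.
        return False
    return item.used / item.limit >= threshold


def noncompliant_quotas(items: list[QuotaItem]) -> list[str]:
    """Return the names of all quota items at or above the threshold."""
    return [item.name for item in items if is_quota_noncompliant(item)]


# Illustrative values only.
sample = [
    QuotaItem("ack-clusters", used=4, limit=5),
    QuotaItem("eip-total", used=10, limit=20),
    QuotaItem("security-groups", used=95, limit=100),
]
print(noncompliant_quotas(sample))
```

The comparison uses "greater than or equal to" because the table states that an item is non-compliant when it reaches 80%, not only when it exceeds it.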
Cost
|
Category |
Check item |
Check item description |
Quick fix description |
Supports assisted decision-making |
|
Cost monitoring |
Available credit alert is not enabled for the account (New in Model 3.0) |
If "Available Credit Alert" is not enabled for the account in the User Center, an exhausted account balance can lead to service suspension due to overdue payments, data loss, or business interruptions. The lack of an alert mechanism can also cause cost overruns that affect enterprise budget management and financial compliance. An account is considered non-compliant if "Available Credit Alert" is not enabled in the User Center. |
This fix enables the available credit alert feature. When your account's available credit falls below the alert threshold, the account contact is notified by text message, email, and internal message, with reminders sent for up to 5 consecutive days. |
No |
|
Billing method optimization |
Recommend using subscription billing or adding pay-as-you-go resources to a savings plan for ECS instances |
We recommend the subscription billing method for resources that are used stably over a long period. An ECS instance on subscription billing normally costs less than the same instance on pay-as-you-go billing. A savings plan is a discount plan that offers lower pay-as-you-go rates in exchange for a commitment to a stable amount of resource usage over a certain period. If an ECS instance uses the pay-as-you-go billing method and no savings plan instance has been purchased, it is not considered a best practice. |
Quick fix is not supported. |
No |
|
Billing method optimization |
Recommend using subscription billing for RDS instances |
We recommend the subscription billing method for resources that are used stably over a long period. An RDS instance on subscription billing normally costs less than the same instance on pay-as-you-go billing. If an RDS instance uses the pay-as-you-go billing method, it is considered non-compliant. |
Quick fix is not supported. |
No |
|
Application resource optimization |
ESA site is in an abnormal state |
This check verifies that the site is enabled so that ESA can provide acceleration and protection for it. A site that is not enabled is not considered a best practice for application resource optimization. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Resource usage of ECS instance is low |
Keeping the resource usage of ECS instances at a reasonable level over the long term is an important part of cloud cost management. The cloud platform offers ECS instances in a range of specifications, and enterprises need to select specifications that match the cyclical patterns of their actual workloads to control ECS costs. If the CPU usage and memory usage of an ECS instance are both below 3% for 30 consecutive days, it is not considered a best practice. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Resource usage of ECS disk is low |
Keeping the resource usage of ECS instances at a reasonable level over the long term is an important part of cloud cost management. The cloud platform offers ECS instances in a range of specifications, and enterprises need to select specifications that match the cyclical patterns of their actual workloads to control ECS costs. If the usage of an ECS disk is below 3% for 30 consecutive days, it is not considered a best practice. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Resource usage of RDS instance is low |
Keeping the resource usage of RDS instances at a reasonable level over the long term is an important part of cloud cost management. The cloud platform offers RDS instances in a range of specifications, and enterprises need to select specifications that match the cyclical patterns of their actual workloads to control RDS costs. If the CPU usage, memory usage, and disk usage of an RDS instance are all below 3% for 30 consecutive days, it is not considered a best practice. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Resource usage of RDS disk is low |
Keeping the resource usage of RDS instances at a reasonable level over the long term is an important part of cloud cost management. The cloud platform offers RDS instances in a range of specifications, and enterprises need to select specifications that match the cyclical patterns of their actual workloads to control RDS costs. If the usage of an RDS disk is below 3% for 30 consecutive days, it is not considered a best practice. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Idle ALB instance exists |
If an ALB load balancer has a listener with no backend servers added and was created more than 7 days ago, it is considered non-compliant. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Idle ECS instance exists |
If an ECS instance is in a stopped state and the no-charge mode for stopped instances is not set, it is considered non-compliant. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Idle ECS disk exists |
If a cloud disk is not in use and was created more than 7 days ago, it is considered non-compliant. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Idle EIP instance exists |
If an EIP is not bound to a resource instance and was created more than 7 days ago, it is considered non-compliant. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Idle NAS file system instance exists |
If a NAS file system has no mount target added and was created more than 7 days ago, it is considered non-compliant. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Idle NAT Gateway exists |
If a NAT Gateway is not bound to an EIP, or the bound EIP has no SNAT/DNAT entries set, and the gateway was created more than 7 days ago, it is considered non-compliant. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Idle SLB instance exists |
If an SLB load balancer has no running listener and was created more than 7 days ago, it is considered non-compliant. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Idle VPC NAT Gateway instance exists |
If a VPC NAT Gateway is not bound to an EIP, or the bound EIP has no SNAT/DNAT entries set, and the gateway was created more than 7 days ago, it is considered non-compliant. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Idle VPN Gateway exists |
If a VPN Gateway has no destination-based route policy configured or automatic BGP route propagation is not enabled, and the gateway was created more than 7 days ago, it is considered non-compliant. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Idle shared bandwidth instance exists |
If a shared bandwidth instance is not bound to a resource instance and was created more than 7 days ago, it is considered non-compliant. |
Quick fix is not supported. |
No |
|
Application resource optimization |
Idle container image instance exists |
If a container image instance has no namespace or image repository created, and the instance was created more than 7 days ago, it is considered non-compliant. |
Quick fix is not supported. |
No |
|
Cost policy |
Cost management suite is not enabled for ACK cluster |
Traditional methods lack effective cost insight and cost control measures for cloud-native scenarios. The cost management suite provides features such as resource waste inspection and resource cost prediction. If the cost management suite feature is not enabled for an ACK cluster, it is not considered a best practice. |
Quick fix is not supported. |
No |
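The low-usage checks in the table above apply a fixed rule: every tracked metric stays below 3% for 30 consecutive days. The following Python sketch shows that rule for the ECS instance case (CPU and memory). The function name, the input shape (one usage percentage per day, oldest first), and the choice to treat fewer than 30 days of data as not flagged are assumptions. The table does not state whether daily usage is an average or a peak.

```python
def is_underutilized(daily_cpu, daily_mem, threshold=3.0, days=30):
    """Return True if CPU and memory usage were both below the threshold
    (in percent) on each of the most recent `days` days.

    daily_cpu, daily_mem: lists of daily usage percentages, oldest first.
    """
    if len(daily_cpu) < days or len(daily_mem) < days:
        # Assumption: without a full window of data, do not flag the instance.
        return False
    recent = zip(daily_cpu[-days:], daily_mem[-days:])
    return all(cpu < threshold and mem < threshold for cpu, mem in recent)


# Illustrative values only.
quiet_cpu = [1.2] * 30
quiet_mem = [2.5] * 30
print(is_underutilized(quiet_cpu, quiet_mem))

# A single day at or above the threshold breaks the consecutive run.
spiky_cpu = [1.2] * 29 + [40.0]
print(is_underutilized(spiky_cpu, quiet_mem))
```

Requiring every day in the window to be below the threshold mirrors the "30 consecutive days" wording. Averaging over the window would flag instances that the table's rule would not.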
Efficiency
|
Category |
Check item |
Check item description |
Quick fix description |
Supports assisted decision-making |
|
Resource Management |
Associated resources are divided into different resource groups |
If associated resources are not placed in the same resource group, resource group-based permission, financial, and O&M management cannot cover all target resources. An account is considered non-compliant if it has associated resources that are not in the same custom resource group. |
Quick fix is not supported. |
No |
|
Resource Management |
Custom tags are not used to tag resources |
Custom tags let users identify, sort, and organize resources more flexibly. An account is considered non-compliant if fewer than 75% of its resources have custom tags. |
Quick fix is not supported. |
No |
|
Resource Management |
Custom resource groups are not used to group resources |
Custom resource groups let you control access to and use of resources more flexibly. An account is considered non-compliant if fewer than 75% of its resources belong to custom resource groups. |
Quick fix is not supported. |
No |
|
Resource Management |
Predefined tags are not used |
Predefined tags are tags that are created in advance and apply to all regions. Using predefined tags can facilitate the binding and management of cloud resources during the resource implementation phase. An account is considered non-compliant if the proportion of predefined tags to custom tags is less than 80%. |
Quick fix is not supported. |
No |
|
Resource Management |
Creator tags are not enabled |
As the scale of cloud resources grows, multiple people need to manage them. For cost and security purposes, resource creators must be identifiable to support cost allocation and security traceability and to improve management efficiency. An account is considered non-compliant if creator tags are not enabled. |
A creator tag is a system tag that Alibaba Cloud automatically generates and binds to the corresponding resource to identify the creator of the resource. Creator tags can help you analyze costs and bills and effectively manage your enterprise's cloud costs. This fix enables creator tags for the current account. |
No |
|
Resource Management |
Recommend enabling the multi-account resource search feature |
When a resource directory is used to manage multiple Alibaba Cloud accounts, the management account or a delegated administrator account can view and search for the cloud resources of all members in the resource directory. An account is considered non-compliant if cross-account resource search is not enabled. |
Quick fix is not supported. |
No |
|
Account system |
Account is not managed by a resource directory |
Compared with decentralized management of multiple accounts, unified management of multiple accounts can bring value to the enterprise in terms of permissions, security, and cost. An account is considered non-compliant if it does not belong to any resource directory. |
Quick fix is not supported. |
No |
|
Account system |
Recommend centralized management of multi-account message contacts |
Through the resource directory message contact management feature, you can achieve centralized management of message contacts across accounts. An account is considered non-compliant if no resource directory message contact is detected or the message contact is not bound to a resource directory, resource folder, or member. |
Quick fix is not supported. |
No |
|
Account system |
Recommend setting a delegated administrator account for the resource directory where the account is located |
Using a delegated administrator account separates organization management tasks from business management tasks. The management account performs organization management tasks for the resource directory, and the delegated administrator account performs business management tasks for trusted services. An account is considered non-compliant if no delegated administrator account is set for the trusted services enabled by the resource directory management account (MA). |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
User calls a deprecated ACK API |
Deprecated ACK APIs are no longer maintained, pose stability risks, and do not support new features. An account is considered non-compliant if a deprecated ACK API has been called in the last 30 days. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
User calls a deprecated ALB API |
Deprecated ALB APIs are no longer maintained, pose stability risks, and do not support new features. An account is considered non-compliant if a deprecated ALB API has been called in the last 30 days. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
User calls a deprecated CDN API |
Deprecated CDN APIs are no longer maintained, pose stability risks, and do not support new features. An account is considered non-compliant if a deprecated CDN API has been called in the last 30 days. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
User calls a deprecated CEN API |
Deprecated CEN APIs are no longer maintained, pose stability risks, and do not support new features. An account is considered non-compliant if a deprecated CEN API has been called in the last 30 days. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
User calls a deprecated ECS API |
Deprecated ECS APIs are no longer maintained, pose stability risks, and do not support new features. An account is considered non-compliant if a deprecated ECS API has been called in the last 30 days. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
User calls a deprecated NAS API |
Deprecated NAS APIs are no longer maintained, pose stability risks, and do not support new features. An account is considered non-compliant if a deprecated NAS API has been called in the last 30 days. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
User calls a deprecated PolarDB API |
Deprecated PolarDB APIs are no longer maintained, pose stability risks, and do not support new features. An account is considered non-compliant if a deprecated PolarDB API has been called in the last 30 days. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
User calls a deprecated RDS API |
Deprecated RDS APIs are no longer maintained, pose stability risks, and do not support new features. An account is considered non-compliant if a deprecated RDS API has been called in the last 30 days. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
User calls a deprecated Redis API |
Deprecated Redis APIs are no longer maintained, pose stability risks, and do not support new features. An account is considered non-compliant if a deprecated Redis API has been called in the last 30 days. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
User calls a deprecated RocketMQ API |
Deprecated RocketMQ APIs are no longer maintained, pose stability risks, and do not support new features. An account is considered non-compliant if a deprecated RocketMQ API has been called in the last 30 days. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
User calls a deprecated SLB API |
Deprecated SLB APIs are no longer maintained, pose stability risks, and do not support new features. An account is considered non-compliant if a deprecated SLB API has been called in the last 30 days. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
User calls a deprecated VPC API |
Deprecated VPC APIs are no longer maintained, pose stability risks, and do not support new features. An account is considered non-compliant if a deprecated VPC API has been called in the last 30 days. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
Recommend using automated methods for continuous resource management |
An account is considered non-compliant if, among the OpenAPI calls made for continuous resource management in the last 30 days, the share made through non-console methods has not reached 100%. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
Recommend using automated methods for daily resource provisioning |
An account is considered non-compliant if, over the last year, fewer than 100% of the OpenAPI calls used to create resources were made through non-console methods. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
Recommend using automated methods to manage resources |
An account is considered non-compliant if, over the last 30 days, fewer than 100% of its OpenAPI calls were made through automated means such as SDKs, Terraform, Cloud Control API, CADT, ROS, and Service Catalog. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
Call success rate of resource creation interface has not reached 100% |
An account is considered non-compliant if the success rate of creating infrastructure resources using automated means such as OpenAPI, Cloud Control API, SDK, or Terraform in the last 30 days has not reached 100%. |
Quick fix is not supported. |
No |
|
Resource provisioning and orchestration |
Call success rate of resource change interface has not reached 100% |
An account is considered non-compliant if the success rate of changing infrastructure resources using automated means such as OpenAPI, Cloud Control API, SDK, or Terraform in the last 30 days has not reached 100%. |
Quick fix is not supported. |
No |
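The ratio and success-rate checks in the rows above all reduce to the same arithmetic: count the qualifying calls, divide by the total, and pass only at 100%. The following is a minimal sketch of that arithmetic, not the assessment service's implementation. The `ApiCall` record, the source labels in `AUTOMATED_SOURCES`, and the treatment of an empty call log as "not evaluated" are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical labels for callers treated as "automated". They mirror the tools
# named in the table but are not identifiers from any Alibaba Cloud API.
AUTOMATED_SOURCES = {
    "sdk", "terraform", "cloud_control_api", "cadt", "ros", "service_catalog",
}


@dataclass
class ApiCall:
    source: str      # assumed label, e.g. "console", "sdk", "terraform"
    succeeded: bool  # whether the call completed successfully


def automation_ratio(calls):
    """Share of all calls made through automated sources (None if no calls)."""
    if not calls:
        return None
    automated = sum(1 for c in calls if c.source in AUTOMATED_SOURCES)
    return automated / len(calls)


def success_rate(calls):
    """Share of automated calls that succeeded (None if no automated calls)."""
    automated = [c for c in calls if c.source in AUTOMATED_SOURCES]
    if not automated:
        return None
    return sum(1 for c in automated if c.succeeded) / len(automated)


def is_compliant(value):
    """A check passes only at 100%. None means there was nothing to evaluate."""
    return None if value is None else value >= 1.0
```

For example, a log with one console call and two automated calls yields an automation ratio of two thirds, so the account would be reported as non-compliant under these assumptions.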
Performance
|
Category |
Check item |
Check item description |
Quick fix description |
Supports assisted decision-making |
|
Performance monitoring |
EIP associated with ALB is at risk of high performance load (New in Model 3.0) |
Sustained high outbound bandwidth usage on an EIP associated with an ALB instance can degrade system performance, reduce stability, and even interrupt services, so address it promptly. An ALB instance is considered non-compliant if the maximum outbound bandwidth usage of any associated EIP is greater than or equal to 80% for at least 8 hours in the past 24 hours. |
Quick fix is not supported. |
No |
|
Performance monitoring |
Shared bandwidth associated with ALB is at risk of high performance load (New in Model 3.0) |
Sustained high outbound bandwidth usage on a shared bandwidth instance associated with an ALB instance can degrade system performance, reduce stability, and even interrupt services, so address it promptly. An ALB instance is considered non-compliant if the maximum outbound bandwidth usage of an associated shared bandwidth instance is greater than or equal to 80% for at least 8 hours in the past 24 hours. |
Quick fix is not supported. |
No |
|
Performance monitoring |
EBS cloud disk is at performance risk due to high throughput |
This check helps you prevent performance bottlenecks and evaluate whether storage resources are allocated appropriately and whether expansion is needed to ensure business continuity. An EBS cloud disk is considered non-compliant if its IOPS or BPS usage in the past 24 hours exceeds 90% of the IOPS or BPS of its cloud disk type. |
Quick fix is not supported. |
No |
|
Performance monitoring |
EBS cloud disk is at performance risk due to high space usage |
Excessively high disk space usage may increase the risk of data loss. This check helps you discover potential performance bottlenecks early and take measures to avoid performance degradation. An EBS cloud disk is considered non-compliant if its space usage exceeds 80%. |
Quick fix is not supported. |
No |
|
Performance monitoring |
ECS instance is at performance risk due to high CPU usage |
Keeping ECS CPU usage at a healthy level is fundamental to stable business performance and continuous operation. High load not only slows application responses but may also trigger automatic protection mechanisms, such as an automatic system restart or service degradation. An ECS instance is considered non-compliant if its CPU usage is greater than 85% for a cumulative total of more than 8 hours in the past 24 hours. |
Quick fix is not supported. |
No |
|
Performance monitoring |
ECS instance is at performance risk due to high memory usage |
Keep ECS memory usage at a healthy level to avoid the performance degradation or service interruptions that insufficient memory can cause. An ECS instance is considered non-compliant if its memory usage is greater than 85% for a cumulative total of more than 9 hours in the past 24 hours. |
Quick fix is not supported. |
No |
|
Performance monitoring |
RDS instance is at risk of high performance load (New in Model 3.0) |
Sustained high CPU usage, memory usage, or connection count on an RDS instance can degrade system performance, reduce stability, and even interrupt services, so address it promptly. An RDS instance is considered non-compliant if the average usage of any one of the CPU, memory, connection count, or IOPS metrics is greater than or equal to 80% for at least 8 hours in the last 7 days. |
Quick fix is not supported. |
No |
|
Performance monitoring |
Redis instance is at risk of high performance load (New in Model 3.0) |
Sustained high CPU usage or memory usage on a Redis instance may degrade system performance, reduce stability, and even interrupt services, so address it promptly. A Redis instance is considered non-compliant if its average CPU usage or memory usage is greater than or equal to 80% for a cumulative total of more than 8 hours in the last 7 days. |
Quick fix is not supported. |
No |
|
Performance monitoring |
SLB instance is at risk of high performance load (New in Model 3.0) |
Sustained high usage of maximum connections, new connections, or outbound Internet traffic on an SLB instance can degrade system performance, reduce stability, and even interrupt services, so address it promptly. An SLB instance is considered non-compliant if the average usage of any one of the maximum connections, new connections, or outbound Internet traffic metrics is greater than or equal to 80% for at least 8 hours in the past 7 days. |
Quick fix is not supported. |
No |
|
Performance monitoring |
VPN Gateway is at risk of high performance load (New in Model 3.0) |
Sustained high inbound or outbound bandwidth usage on a VPN Gateway can degrade system performance, reduce stability, and even interrupt services, so address it promptly. A VPN Gateway is considered non-compliant if the maximum value of its inbound or outbound bandwidth usage is greater than or equal to 80% for at least 8 hours in the past 24 hours. |
Quick fix is not supported. |
No |
|
Utilize elastic resources |
ECS scaling group is at risk of being unable to automatically scale for performance |
Core cloud products such as ECS can automatically add or remove resources based on performance load, which keeps capacity in balance with demand during business operation. |
Quick fix is not supported. |
No |
|
Utilize elastic resources |
Auto scaling is not enabled for RDS (New in Model 3.0) |
If auto scaling is not enabled for an RDS instance, the instance may be unable to expand resources in time to handle load growth during peak hours or to release idle resources during off-peak hours. This can cause performance bottlenecks, response delays, and even service interruptions, as well as wasted resources and unnecessary cost. Enabling auto scaling provides elastic scheduling and efficient resource utilization, which optimizes cost while maintaining database stability and high availability, and makes cloud resource management more intelligent. An RDS instance is considered non-compliant if auto scaling is not enabled. |
This fix enables the automatic storage expansion feature for the RDS instance. After this is enabled, the storage space is automatically expanded when it reaches the threshold. The instance does not need to be restarted during the expansion, and there is no impact on the business. |
No |
|
Network design |
Cache rule is not configured for an ESA site |
This check item verifies that the site has a cache rule configured to reduce origin fetch traffic. A site without a cache rule does not follow network optimization best practices. |
Quick fix is not supported. |
No |
|
Network design |
Smart Routing is not enabled for an ESA site in a global region |
This check item verifies that the site has Smart Routing enabled to improve ESA acceleration in global regions. A site without Smart Routing enabled does not follow network optimization best practices. |
This fix enables Smart Routing for the selected site. Smart Routing uses Alibaba Cloud's global edge nodes to detect network conditions in real time, selects optimal routes for data transmission, and applies protocol-stack optimizations to reduce latency and request failures. Smart Routing is billed by request count. For details, see Billing. |
No |
|
Network design |
CDN is not used to accelerate access to OSS resources (New in Model 3.0) |
Using CDN to distribute static resources in OSS, such as images, videos, and documents, can reduce traffic costs and improve loading speed. CDN deploys cache nodes in multiple regions around the world. When a user requests a static resource in OSS, CDN routes the request to the nearest cache node, which returns the cached resource without a long-distance request to OSS or an origin fetch. This process incurs CDN downstream traffic fees, whose unit price is lower than that of OSS outbound Internet traffic. An OSS bucket is considered non-compliant if its public network inbound traffic exceeds 100 B within 24 hours, but CDN is not used to optimize OSS data transmission. |
Quick fix is not supported. |
No |
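Most of the performance monitoring rules above share one shape: a usage metric must stay at or above a threshold for some number of hours within a look-back window. The following is a minimal sketch of that evaluation. It assumes one usage sample per hour, which the table does not state, and the function names and parameters are hypothetical. The two flags capture the wording differences between rules ("greater than or equal to 80% for at least 8 hours" versus "greater than 85% for more than 8 hours").

```python
def hours_over_threshold(hourly_usage, threshold, inclusive=True):
    """Count the hourly samples (in percent) that breach the threshold."""
    if inclusive:
        return sum(1 for u in hourly_usage if u >= threshold)
    return sum(1 for u in hourly_usage if u > threshold)


def is_noncompliant(hourly_usage, threshold, min_hours,
                    inclusive_usage=True, inclusive_hours=True):
    """Apply a 'usage over X% for N hours in the window' rule.

    hourly_usage:    one usage percentage per hour in the look-back window
                     (assumed sampling interval, not stated in the source)
    inclusive_usage: True for ">= threshold", False for "> threshold"
    inclusive_hours: True for "at least N hours", False for "more than N hours"
    """
    hours = hours_over_threshold(hourly_usage, threshold, inclusive_usage)
    return hours >= min_hours if inclusive_hours else hours > min_hours


# ALB EIP rule: usage >= 80% for at least 8 hours in the past 24 hours.
samples = [90] * 8 + [10] * 16
alb_flagged = is_noncompliant(samples, threshold=80, min_hours=8)

# ECS CPU rule: usage > 85% for more than 8 cumulative hours in the past 24 hours.
ecs_flagged = is_noncompliant(samples, threshold=85, min_hours=8,
                              inclusive_usage=False, inclusive_hours=False)
```

With exactly eight busy hours, the sample data trips the "at least 8 hours" rule but not the "more than 8 hours" rule, which is why the boundary conditions are kept as explicit parameters.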
Removed check items
Some check items from Model 2.0 have been merged into Model 3.0. Because Model 3.0 still detects the related risks, the following check items have been removed.
|
Pillar |
Category |
Check item |
Check item description |
|
Security |
Prevent privilege abuse |
Too many RAM identities are granted high-risk permissions for OSS and SLS |
For permission management of RAM identities, we recommend following the principle of least privilege by granting only the necessary permissions. Mismanagement of high-risk permissions can lead to data loss or unauthorized data access. For example, RAM identities with permissions such as `oss:Delete*` or `log:Delete*` can delete data stored in OSS or SLS. RAM identities with permissions such as `oss:PutBucketAcl`, `oss:PutObjectAcl`, or `oss:PutBucketPolicy` can modify the access permissions of files in an OSS bucket, which can expose files to external access. An account with three or fewer RAM identities that have these high-risk permissions is compliant. |
|
Security |
Prevent privilege abuse |
Too many RAM identities are granted high-risk permissions for the resource directory |
A resource directory lets you centrally manage the accounts in an organization, create new accounts in the organization, and remove existing accounts from it. As a best practice, only administrators or cloud management team leaders in the organization should have write permissions for the resource directory, such as enabling or disabling the resource directory; creating, inviting, or deleting accounts; and switching account types. Normally, no more than three people in an enterprise should hold these permissions. Do not grant write permissions for the resource directory to regular users, because misoperations such as deleting a cloud account can damage the business. |
|
Security |
Use granular authorization |
RAM identity exists with a converged scope of access operations |
For permission management of RAM identities, we recommend following the principle of least privilege by granting only the necessary permissions. An account is compliant if a RAM identity is granted only a subset of a cloud service's operation permissions. |
|
Security |
Use granular authorization |
No RAM identity exists with converged access to OSS and SLS |
For permission management of RAM identities, we recommend following the principle of least privilege by granting only the necessary permissions. For access to data products such as OSS and SLS in particular, we recommend fine-grained authorization to reduce the risk of data breaches caused by identity compromise. If a RAM identity is granted operation permissions related to data products, the authorization must be fine-grained. The configuration is compliant if the wildcard character * is not used for batch authorization. |
|
Security |
Authorization efficiency and control |
The effective scope of a custom policy granted to a RAM identity does not specify a resource group |
By default, a custom policy granted to a RAM identity takes effect at the account level. If the policy does not explicitly restrict resources or specify conditions, the RAM identity has the granted permissions on all resources in the account. As a cloud resource management best practice, group resources into resource groups and authorize RAM identities per group. Restricting the effective scope to a resource group limits the permission scope of the RAM identity and achieves fine-grained authorization. It is considered a best practice if the effective scope of a custom policy granted to a RAM identity is a resource group, or if a resource group is specified in the policy conditions. |
|
Security |
Authorization efficiency and control |
No RAM identity authorization with a service-level system policy is converged to a resource group |
For permission management of RAM identities, we recommend following the principle of least privilege by granting only the necessary permissions. By dividing cloud resources into resource groups based on dimensions such as application and environment, you can authorize based on resource groups during authorization, which further narrows the permission scope and avoids risks brought by excessive permissions. An account is compliant if a RAM identity with a service-level system policy such as AliyunECSFullAccess has an authorization scope of a resource group. |
|
Security |
Authorization efficiency and control |
No RAM identity authorization with Admin permissions is converged to a resource group |
For permission management of RAM identities, we recommend following the principle of least privilege by granting only the necessary permissions. By dividing cloud resources into resource groups based on dimensions such as application and environment, you can authorize based on resource groups during authorization, which further narrows the permission scope and avoids risks brought by excessive permissions. An account is compliant if a RAM identity with AdministratorAccess permissions has an authorization scope of a resource group. |
|
Security |
Non-compliant alert response |
Alert rules are not set for risky operation events |
An account is considered non-compliant if no rules related to account security or ActionTrail operation compliance supported in ActionTrail event alerts are enabled. |
|
Security |
Enable automatic correction |
Automated methods are not used to correct non-compliant issues |
An account is considered non-compliant if a user has not enabled automatic correction for any rule. |
|
Security |
Data storage instances should avoid public network access |
PolarDB instance IP whitelist is set to 0.0.0.0/0 |
An IP address whitelist is a list of IPs allowed to access a PolarDB cluster. If the IP address whitelist is set to % or 0.0.0.0/0, it means that any IP address is allowed to access the database cluster. This setting greatly reduces the security of the database and should not be used unless necessary. Best practices recommend following the principle of least privilege and setting an appropriate IP address whitelist to provide a high level of access security protection for the PolarDB cluster. It is not considered a best practice if the cluster IP address whitelist is set to 0.0.0.0/0 or %. |
|
Security |
Data storage instances should avoid public network access |
RDS instance IP whitelist is set to 0.0.0.0/0 |
An IP address whitelist is a list of IPs allowed to access an RDS instance. If the IP address whitelist is set to 0.0.0.0/0, it means that any IP address is allowed to access the database instance. This setting greatly reduces the security of the database and should not be used unless necessary. Best practices recommend following the principle of least privilege and setting an appropriate IP address whitelist to provide a high level of access security protection for the database instance. It is not considered a best practice if the instance IP address whitelist is set to 0.0.0.0/0. |
|
Security |
Data storage instances should avoid public network access |
Redis instance IP whitelist is set to 0.0.0.0/0 |
An IP address whitelist is a list of IPs allowed to access a Redis instance. If the IP address whitelist is set to 0.0.0.0/0, it means that any IP address is allowed to access the database instance. This setting greatly reduces the security of the database and should not be used unless necessary. Best practices recommend following the principle of least privilege and setting an appropriate IP address whitelist to provide a high level of access security protection for the database instance. It is not considered a best practice if the instance IP address whitelist is set to 0.0.0.0/0. |
|
Security |
Data storage instances should avoid public network access |
MongoDB instance IP whitelist is set to 0.0.0.0/0 |
An IP address whitelist is a list of IPs allowed to access a MongoDB instance. If the IP address whitelist is set to 0.0.0.0/0, it means that any IP address is allowed to access the database instance. This setting greatly reduces the security of the database and should not be used unless necessary. Best practices recommend following the principle of least privilege and setting an appropriate IP address whitelist to provide a high level of access security protection for the database instance. It is not considered a best practice if the instance IP address whitelist is set to 0.0.0.0/0. |
|
Security |
Data storage instances should avoid public network access |
Elasticsearch instance IP whitelist is set to 0.0.0.0/0 |
An instance IP address whitelist is a list of IPs allowed to access an Elasticsearch instance. If the IP address whitelist is set to 0.0.0.0/0 or ::/0, it means that any IP address is allowed to access the instance. This setting greatly reduces the security of the instance and should not be used unless necessary. Best practices recommend following the principle of least privilege and setting an appropriate IP address whitelist to provide a high level of access security protection for the instance. It is not considered a best practice if the instance IP address whitelist is set to 0.0.0.0/0 or ::/0. |
|
Stability |
Deletion protection |
Release protection is not enabled for Redis resource |
A Redis instance is considered non-compliant if release protection is not enabled. |
|
Stability |
Deletion protection |
Release protection is not enabled for ECS resource |
An ECS instance is considered non-compliant if release protection is not enabled. |
|
Stability |
Change management |
Maintenance window for Redis resource is unreasonable |
A Redis instance is considered non-compliant if its automatic backup time period is not within the range of 04:00-05:00, 05:00-06:00, or 12:00-13:00. |
|
Stability |
Change management |
Maintenance window for PolarDB resource is unreasonable |
A PolarDB cluster is considered non-compliant if its maintenance window is not within the range of 02:00-04:00 or 06:00-10:00. |
|
Stability |
Change management |
Maintenance window for ADB resource is unreasonable |
An ADB cluster is considered non-compliant if its maintenance window is not within the range of 02:00-04:00, 06:00-08:00, or 12:00-13:00. |
|
Stability |
Change management |
Maintenance window for RDS resource is unreasonable |
An RDS instance is considered non-compliant if its maintenance window is not within the range of 02:00-06:00 or 06:00-10:00. |
|
Stability |
Change management |
Maintenance window for ECS resource is unreasonable |
Creating a snapshot for an ECS instance temporarily reduces the I/O performance of block storage. An automatic snapshot policy is considered non-compliant if the snapshot creation time is not within the range of 1 or 2. |
|
Cost |
Resource cost optimization |
"Best Practices for Idle Resource Detection" compliance package is not enabled |
An account is considered non-compliant if the idle resource detection compliance package is not enabled in Cloud Config. |
|
Efficiency |
Resource grouping and isolation |
Multiple accounts are not used to manage resources in the same organization |
An Alibaba Cloud account serves several purposes. Each cloud account is a fully isolated tenant: by default, its resource access, network deployment, and identity permissions are independent of other accounts. Each cloud account is also associated with its own bill, so you can deploy different services in different accounts to achieve independent accounting and billing. Multi-account management benefits an enterprise in terms of environment isolation, security compliance, and business innovation. The condition is met if there are two or more Alibaba Cloud accounts under the same entity. |
|
Efficiency |
Automation quality |
ECS quota is at risk of saturation |
Product resource creation, change, or product feature use may encounter exceptions. A product is considered at risk if it has a resource quota item with a high quota level in the last 7 days and has encountered a quota_exceed error. |
|
Efficiency |
Automation quality |
VPC quota is at risk of saturation |
Product resource creation, change, or product feature use may encounter exceptions. A product is considered at risk if it has a resource quota item with a high quota level in the last 7 days and has encountered a quota_exceed error. |
|
Efficiency |
Automation quality |
SLB quota is at risk of saturation |
Product resource creation, change, or product feature use may encounter exceptions. A product is considered at risk if it has a resource quota item with a high quota level in the last 7 days and has encountered a quota_exceed error. |
|
Efficiency |
Automation quality |
CEN quota is at risk of saturation |
Product resource creation, change, or product feature use may encounter exceptions. A product is considered at risk if it has a resource quota item with a high quota level in the last 7 days and has encountered a quota_exceed error. |
|
Efficiency |
Automation quality |
ACK quota is at risk of saturation |
Product resource creation, change, or product feature use may encounter exceptions. A product is considered at risk if it has a resource quota item with a high quota level in the last 7 days and has encountered a quota_exceed error. |
|
Efficiency |
Automation quality |
CDN quota is at risk of saturation |
Product resource creation, change, or product feature use may encounter exceptions. A product is considered at risk if it has a resource quota item with a high quota level in the last 7 days and has encountered a quota_exceed error. |