Lens extends the Well-Architected framework to specific industry and technology domains. Select a specialized check model to assess your cloud resources against domain-specific requirements, identify risks, and get targeted optimization recommendations.
Supported lenses
The following lenses are supported:
- Container Build: Checks container protection across deployment, monitoring, and operational risks to ensure security and reliability baselines.
- Machine Learning: Checks infrastructure architecture for AI model training, including whether core resources (ECS, NAS, OSS) match training requirements.
- Network Services: Inspects network resource health, including capacity levels, disaster recovery architecture, and idle resources across multiple network products.
  Note: Before using the Network Services lens, enable Network Intelligence Service.
- Data Protection: Checks audit log retention, sensitive data protection, SQL performance anomalies, and disaster recovery and security controls.
  Note: To use this lens, you must enable Database Autonomy Service (DAS) and Data Transmission Service (DTS).
Supported check items
The following table lists check items for each lens.
| Lens | Check item | Description |
| Container Build | ACK cluster deployed in a single availability zone | Regional ACK clusters provide high availability by distributing nodes across multiple availability zones. A cluster is compliant if its nodes are in three or more availability zones. |
| Container Build | Cost management suite not enabled for an ACK cluster | The cost management suite provides resource waste detection and cost prediction. An ACK cluster is non-compliant if the cost management suite is not enabled. |
| Container Build | ACK cluster not using a stable version | If an ACK cluster is not upgraded to the latest version, the evaluation result is Non-compliant. |
| Container Build | Deletion protection not enabled for an ACK cluster | If an ACK cluster does not have deletion protection enabled, the evaluation result is Non-compliant. |
| Container Build | Secret at-rest encryption not configured for an ACK cluster | Secret at-rest encryption uses a key from Key Management Service (KMS) to encrypt Kubernetes Secrets, which enhances the security of sensitive information. An ACK Pro cluster is non-compliant if it does not use KMS for Secret at-rest encryption. |
| Container Build | ack-ram-authenticator not used for RAM authentication | The ack-ram-authenticator component authenticates API server requests through RAM using Kubernetes Webhook Token authentication. In SSO role mapping scenarios, it enables secure auditing when different users assume the same role. An ACK cluster is non-compliant if ack-ram-authenticator is not enabled. |
| Container Build | Policy governance not used to restrict privileged container configurations | Policy governance helps enterprise security operations teams better apply container security policies. An ACK cluster is non-compliant if no policy management is enabled. |
| Container Build | RRSA for pod-level permission isolation not implemented | RRSA implements pod-level OpenAPI permission isolation, enabling fine-grained cloud resource access control. An ACK cluster is non-compliant if RRSA is not enabled. |
| Container Build | API Server audit logging not enabled | API Server audit logs track operations performed by different users, supporting cluster security and O&M. An ACK cluster is non-compliant if API Server audit logging is not enabled. |
| Container Build | Control plane component logging not enabled | Control plane component logs are collected to Simple Log Service for auditing and troubleshooting. An ACK cluster is non-compliant if control plane component logging is not enabled. |
| Container Build | Container Intelligence Service (CIS) cluster configuration inspection not enabled | Container Intelligence Service (CIS) discovers potential cluster risks such as resource quota margins and key resource usage, and provides recommended fixes. An ACK cluster is non-compliant if CIS cluster configuration inspection is not enabled. |
| Container Build | Cluster security configuration inspection not enabled | Configuration inspection scans for security vulnerabilities in workload configurations and generates reports. An ACK cluster is non-compliant if security configuration inspection is not enabled. |
| Container Build | Container internal operation audit logging not enabled | Container auditing records commands and operations performed by different users inside containers. An ACK cluster is non-compliant if internal operation audit logging is not enabled. |
| Container Build | ACK cluster not using managed node pools | Managed node pools automate node maintenance including high-risk CVE repairs and fault recovery. An ACK cluster is non-compliant if managed node pools are not used. |
| Container Build | Auto Scaling not enabled for a node pool | Auto Scaling provisions pay-as-you-go instances on demand to elastically adjust computing resources. An ACK cluster is non-compliant if node pool Auto Scaling is not enabled. |
| Container Service for Kubernetes (ACK) | ACK managed cluster is a Basic edition | ACK managed clusters are available in Basic and Pro editions. Compared to the Basic edition, the Pro edition provides enhanced reliability, security, and scheduling, making it more suitable for large-scale production workloads. A managed cluster is non-compliant if it is not a Pro edition. |
| Container Service for Kubernetes (ACK) | Zero backend servers for the CoreDNS service | If an ACK cluster has zero backend servers for CoreDNS, service discovery fails completely. This interrupts intra-cluster communication (such as microservice calls and database access) and prevents applications from resolving addresses by service name, directly affecting service availability and cluster stability. An ACK cluster is non-compliant if it has zero backend servers for the CoreDNS service. |
| Container Service for Kubernetes (ACK) | Abnormal backend status for the API Server's CLB instance | An abnormal backend status for the API Server's Classic Load Balancer (CLB) instance can interrupt control plane communication and disable cluster management. This prevents clients like |
| Container Service for Kubernetes (ACK) | Abnormal listener port configuration for the API Server's CLB instance | An abnormal listener port configuration for the CLB instance bound to the API Server will disrupt API service access, preventing clients like |
| Container Service for Kubernetes (ACK) | The CLB instance bound to the API Server does not exist | If an ACK cluster's API Server is not bound to a CLB instance, it lacks a traffic entry point. External clients like |
| Container Service for Kubernetes (ACK) | Abnormal status for the CLB instance bound to the API Server | An abnormal status for the CLB instance bound to the API Server will cause API service traffic forwarding to fail, preventing clients like |
| Container Service for Kubernetes (ACK) | Node Kubelet component version is older than the control plane version | If a node's Kubelet version is older than the control plane version, compatibility issues can arise. The control plane (e.g., API Server) may fail to communicate with the outdated Kubelet after a feature or protocol upgrade, leading to abnormal node status, pod scheduling failures, or nodes being marked as unavailable. An ACK cluster is non-compliant if a node's Kubelet version is older than the control plane version. |
| Container Service for Kubernetes (ACK) | Unavailable node pool scaling configuration | An unavailable node pool scaling configuration prevents the cluster from automatically adjusting its node count. During high-load periods, the inability to scale out can lead to resource exhaustion, pod scheduling failures, or service interruptions. An ACK cluster is non-compliant if a node pool's scaling configuration is unavailable. |
| Container Service for Kubernetes (ACK) | Unavailable node pool scaling group | An unavailable node pool scaling group disables the cluster's Auto Scaling capabilities. During high-load periods, this can lead to resource depletion, pod scheduling failures, or increased service latency. An ACK cluster is non-compliant if a node pool's scaling group is unavailable. |
| Container Service for Kubernetes (ACK) | Unavailable node pool security group | An unavailable security group for a node pool will cause network access rules to fail. Communication between cluster components, such as between Kubelet and the API Server or for service discovery between pods, may be interrupted due to blocked ports or missing rules. An ACK cluster is non-compliant if a node pool's security group is unavailable. |
| Container Service for Kubernetes (ACK) | Unavailable node pool vSwitch | An unavailable vSwitch for a node pool will interrupt network communication between nodes, preventing pods and services from interacting across nodes. This can cause service discovery failures or data transmission stalls. An ACK cluster is non-compliant if a node pool's vSwitch is unavailable. |
| Container Service for Kubernetes (ACK) | Unavailable APIService | An unavailable APIService will cause extended API functions to fail. Custom resources (such as CRDs) will be unable to communicate with the control plane, leading to management anomalies in components that rely on extended APIs, such as operators and service meshes. An ACK cluster is non-compliant if an APIService is unavailable. |
| Container Service for Kubernetes (ACK) | Abnormal CoreDNS pods | Abnormal CoreDNS pods in an ACK cluster can lead to unstable DNS resolution. Communication between services that use domain names may time out or fail, causing application call interruptions. An ACK cluster is non-compliant if it has abnormal CoreDNS pods. |
| Container Service for Kubernetes (ACK) | Abnormal status for an elasticity component | An abnormal status in a cluster's elasticity components can cause auto-scaling and self-healing mechanisms to fail. This can lead to resource bottlenecks, service latency, or interruptions during high-load periods. An ACK cluster is non-compliant if an elasticity component has an abnormal status. |
| Container Service for Kubernetes (ACK) | Inconsistent billing method between a LoadBalancer Service and its instance | A mismatch between the billing method of a LoadBalancer Service and its actual instance can lead to billing anomalies, such as unexpected pay-as-you-go charges for a subscription resource, or unexpected resource releases. An ACK cluster is non-compliant if such an inconsistency exists. |
| Container Service for Kubernetes (ACK) | Inconsistent certificate instance ID between a LoadBalancer Service and its instance | A mismatch between the certificate instance ID of a LoadBalancer Service and the actual bound certificate will cause the TLS configuration to fail. This can lead to connection rejections or security warnings for HTTPS services, interrupting user access. An ACK cluster is non-compliant if such an inconsistency exists. |
| Container Service for Kubernetes (ACK) | Only one CoreDNS replica | Running only a single replica of CoreDNS eliminates high availability. If the pod fails, the DNS service will be completely interrupted, causing DNS resolution failures and blocking communication between applications. An ACK cluster is non-compliant if it has only one CoreDNS replica. |
| Machine Learning | ECS instances not prohibited from binding public addresses exist | ECS instances should not be directly exposed to the public network. Use NAT Gateway or SLB for public access instead. An ECS instance is non-compliant if it has a public IP address bound. |
| Machine Learning | Security group inbound rules set to 0.0.0.0/0 and any port exist | Inbound rules allowing all IPs (0.0.0.0/0) on any port are prohibited. A security group is non-compliant if its inbound rules include 0.0.0.0/0 without restricting to specific ports. |
| Machine Learning | Security groups with high-risk ports (22/3389/...) open to the public network exist | Public access to high-risk ports such as SSH (22) and RDP (3389) is prohibited. A security group is non-compliant if these ports are open to the public network. |
| Machine Learning | ACK clusters not using stable versions exist | If an ACK cluster is not upgraded to the latest version, the evaluation result is Non-compliant. |
| Machine Learning | OSS resources not using multi-zone architecture exist | If an OSS bucket does not have zone-redundant storage enabled, the evaluation result is Non-compliant. |
| Machine Learning | ECS resources without release protection enabled exist | If an ECS instance does not have release protection enabled, the evaluation result is Non-compliant. |
| Machine Learning | OSS buckets without versioning enabled exist | Without versioning, data in an OSS bucket cannot be recovered after it is overwritten or deleted. If an OSS bucket does not have versioning enabled, the evaluation result is Non-compliant. |
| Machine Learning | NAS file systems without backup plans created exist | Use Cloud Backup to regularly back up all directories and files in your General-purpose NAS file system. A NAS file system is compliant if a backup plan is created. |
| Machine Learning | ACK clusters without Secret disk encryption configured exist | Secret at-rest encryption uses a key from Key Management Service (KMS) to encrypt Kubernetes Secrets, which enhances the security of sensitive information. An ACK Pro cluster is non-compliant if it does not use KMS for Secret at-rest encryption. |
| Machine Learning | VPCs without flow logs enabled exist | VPC flow logs record inbound and outbound traffic of ENIs for access control verification, traffic monitoring, and troubleshooting. A VPC is compliant if flow logging is enabled. |
| Machine Learning | OSS buckets without server-side encryption enabled exist | OSS server-side encryption protects data at rest for high security or compliance requirements. An OSS bucket is compliant if KMS or OSS-managed encryption is enabled. |
| Machine Learning | VPC custom CIDR blocks without routes configured exist | You can create custom route tables in a VPC, add custom route entries, and then bind the route table to a vSwitch to control its traffic for more flexible network management. A VPC custom CIDR block is compliant if at least one route entry exists for an IP address within that CIDR block in the associated route table. |
| Machine Learning | ECS instances using images that are not regularly updated and hardened exist | Regularly updated images ensure systems include the latest security patches and perform optimally. An ECS instance is compliant if its image was created within the specified number of days (default: 180). |
| Machine Learning | OSS buckets without secure access configured in permission policies exist | HTTPS provides higher security than HTTP. An OSS bucket is compliant if its bucket policy allows read and write access over HTTPS and denies HTTP access. Buckets without a bucket policy are Not Applicable. |
| Machine Learning | NAS file system access points without RAM policies enabled exist | RAM policies for NAS access points grant mount, read, and write permissions to different RAM users or roles, enabling fine-grained permission management. A NAS file system is compliant if RAM policies are enabled for its access points. |
| Machine Learning | ECS instances without instance RAM roles assigned exist | Instance RAM roles provide STS temporary credentials from within ECS instances, eliminating the need to embed AccessKey pairs. This improves security and enables fine-grained access control. An ECS instance is compliant if a RAM role is assigned. |
| Machine Learning | Running ECS instances without CloudMonitor agents installed exist | CloudMonitor agents collect OS-level metrics and enable real-time monitoring with alert rules. A running ECS instance is compliant if the CloudMonitor agent is installed and running. Non-running instances are not applicable. |
| Machine Learning | NAS file systems without encryption configured exist | Server-side encryption protects data at rest in NAS file systems and automatically decrypts data on access. A NAS file system is compliant if server-side encryption is enabled. |
| Machine Learning | Running ECS instances without Security Center protection enabled exist | Security Center agents provide asset information collection, risk discovery, intrusion detection, and compliance baseline checks to protect ECS instances. A running ECS instance is compliant if a Security Center agent is installed. Non-running instances are not applicable. |
| Machine Learning | OSS buckets without logging enabled exist | OSS generates hourly access logs with predefined naming conventions and stores them in a specified bucket for analysis. An OSS bucket is compliant if logging is enabled. |
| Machine Learning | ACK versions that are not maintained are being used | Kubernetes releases minor versions approximately every 4 months. Clusters running outdated versions miss the latest features, bug fixes, and security patches. An ACK cluster is compliant if its Kubernetes version is still supported. |
| Machine Learning | ACK clusters should not have public endpoints configured | Public API server endpoints increase the attack surface and may violate compliance requirements. An ACK cluster is compliant if no public endpoint is configured for its API server. |
| Network Services | Idle EIP resources exist | If an EIP is not bound to a resource instance and was created more than 7 days ago, the evaluation result is Non-compliant. |
| Network Services | VPN instances not using multi-zone architecture exist | For existing single-tunnel VPN gateway instances, we strongly recommend enabling multi-AZ high availability and configuring dual tunnels for the connection. A VPN instance is non-compliant if it uses a single-tunnel configuration. |
| Network Services | NLB instances not using multi-zone architecture exist | For Network Load Balancer instances, we strongly recommend configuring multiple zones to meet multi-zone disaster recovery requirements. If a Network Load Balancer instance uses a single zone, the evaluation result is Non-compliant. |
| Network Services | EIP resources with abnormal running status exist | Checks whether EIPs are running as expected. If an EIP is in a disabled or inactive state, the evaluation result is Non-compliant. |
| Network Services | NAT gateways with abnormal processing levels exist | This check inspects the processing level of NAT gateways, including concurrent connections, new connection rate, and traffic throughput, to identify network risks. A NAT gateway is non-compliant if alerts for "NAT session limit exceeded," "NAT new session limit exceeded," or "SNAT source port allocation failure" were triggered during the last inspection period, or if the traffic processing rate is too high. |
| Network Services | VPN services with abnormal load levels exist | Checks VPN gateway loads, bandwidth usage risks, and BGP route advertisement overage frequency. A VPN instance is non-compliant if SSL connection count is too high, client network segment addresses are insufficient, BGP route count exceeds limits, or bandwidth exceeds limits during the last inspection interval. |
| Network Services | ALB virtual IPs with abnormal processing levels exist | Checks ALB VIP loads including sessions, connections, QPS, and bandwidth. An ALB instance is non-compliant if session limit, connection failure surge, QPS limit, or bandwidth limit alerts were triggered during the last inspection interval. |
| Network Services | NLB virtual IPs with abnormal processing levels exist | Checks NLB VIP loads including new and concurrent connections. An NLB instance is non-compliant if failed connection surge, new connection drop, new connection limit exceeded, or concurrent connection limit exceeded alerts were triggered during the last inspection interval. |
| Network Services | VBR resources with abnormal BGP connection status exist | Checks the status of BGP connections created over Express Connect circuits and the frequency of Express Connect circuit failures within an inspection cycle. This helps you monitor the quality of leased lines and identify stability risks at the earliest opportunity. If a BGP connection failure alert was triggered during the most recent inspection interval, the evaluation result is Non-compliant. |
| Network Services | CLB instances with abnormal processing levels exist | Checks CLB instance loads including sessions, connections, and bandwidth. A CLB instance is non-compliant if bandwidth limit packet loss, session limit connection loss, or connection failure surge alerts were triggered during the last inspection interval. |
| Network Services | TR route configuration risks exist | Checks whether the number of routes in the route table of a Basic Edition transit router (TR) has reached 80% of the quota limit. When the quota limit is reached, routes can no longer be added to the route table, which may lead to network failures. The risk is reported if the Basic Edition TR route count has reached 80% of the quota. |
| Network Services | VBRs without health checks configured exist | A static route is configured for the VBR to point to on-premises resources, but no health check is configured. If Express Connect circuits fail, automatic switching cannot be performed. If the CEN or VBR upstream does not have health checks configured, the evaluation result is Non-compliant. |
| Network Services | VBRs with missing redundancy exist | This check inspects the integrity of VBR redundancy to identify stability risks. A VBR is non-compliant if redundant connections are not configured for some or all network segments between the VPC and VBR, or between the Transit Router (TR) on Cloud Enterprise Network (CEN) and the VBR. |
| Network Services | Physical Express Connect circuits with port abnormalities exist | Checks the status of Express Connect circuits and the frequency of BGP connection failures within an inspection cycle. This helps you monitor the quality of leased lines and identify stability risks at the earliest opportunity. If an Express Connect circuit port or link failure alert was triggered during the most recent inspection interval, the evaluation result is Non-compliant. |
| Network Services | EIPs with abnormal bandwidth levels exist | Checks EIP bandwidth usage and packet loss frequency. An EIP is non-compliant if bandwidth limit warnings or packet loss alerts were triggered during the last inspection interval. |
| Network Services | Cross-region bandwidth with abnormal levels exists | Checks CEN inter-region bandwidth usage and packet loss frequency. A cross-region connection is non-compliant if bandwidth exceeded limit packet loss alerts were triggered or traffic scheduling queues exceeded bandwidth limits during the last inspection interval. |
| Data Protection | SQL audit logging not enabled for high-spec database instances | Audit logs provide a comprehensive record of database operations, which can be used for diagnosing operational failures and meeting regulatory compliance requirements. This is considered a risk if SQL audit logging is not enabled for a high-specification database instance (defined as an instance with ≥ 4 vCPU/8 GiB, or an instance belonging to an account in the finance industry). |
| Data Protection | Security auditing not enabled for a high-spec database instance | Security auditing detects risks such as data exfiltration, SQL injection, and abnormal access to protect data assets. This is considered a risk if security auditing is not enabled for a high-specification database instance (defined as an instance with ≥ 4 vCPU/8 GiB, or with total SQL logs > 100 GB/day). |
| Data Protection | Unified security collaboration not enabled for a high-spec database instance | Enabling security collaboration helps prevent non-standard operations during database changes from affecting stability. This check applies to high-specification instances (≥ 4 vCPU/8 GiB). |
| Data Protection | Cross-region disaster recovery not enabled for a database instance | Databases may become unavailable during a regional outage or failure. To check if disaster recovery is enabled, verify whether Data Transmission Service (DTS) synchronization is configured for the database instance in the DTS console. |
| Data Protection | Excessive slow SQL queries on a database instance | Analyzing slow SQL queries is an effective method for identifying database performance issues. Slow SQL queries can consume excessive CPU, I/O, or execution time, and can also lock resources needed by other queries, potentially causing service instability. An instance is considered at risk if it has more than 100 slow SQL queries within the last 24 hours. Instances with no account password or with a QPS below 50 are not applicable. |
| Data Protection | Sensitive data protection not enabled for a high-spec database instance | Enabling sensitive data protection helps implement dynamic security for sensitive data, reducing the risk of data leaks and non-compliance. |
| Data Protection | Cross-region backup not enabled for a database instance | Databases may face the risk of data loss during a regional outage or failure. Verify the status of the cross-region backup feature on the Backup and Restoration page of the instance details in the database console. |
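Several check items in the table state explicit thresholds, so their evaluation logic can be sketched as simple predicates. The following Python sketch is illustrative only and is not the service's implementation: the `Eip` record, its field names, and the function names are hypothetical, and the treatment of boundary values (for example, an image that is exactly 180 days old) is an assumption. It covers the idle EIP rule (unbound and created more than 7 days ago), the ECS image age rule (default 180 days), and the slow SQL rule (more than 100 slow queries in 24 hours, with the stated Not Applicable cases).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class Result(Enum):
    COMPLIANT = "Compliant"
    NON_COMPLIANT = "Non-compliant"
    NOT_APPLICABLE = "Not applicable"


@dataclass
class Eip:
    # Hypothetical record; these field names are not an API schema.
    created_at: datetime
    bound_instance_id: Optional[str]


def check_idle_eip(eip: Eip, now: datetime, max_idle_days: int = 7) -> Result:
    """Idle EIP rule: not bound to a resource and created more than 7 days ago."""
    unbound = eip.bound_instance_id is None
    older_than_limit = now - eip.created_at > timedelta(days=max_idle_days)
    if unbound and older_than_limit:
        return Result.NON_COMPLIANT
    return Result.COMPLIANT


def check_image_age(image_created_at: datetime, now: datetime,
                    max_age_days: int = 180) -> Result:
    """ECS image rule: the image was created within the specified number of days."""
    # Assumption: an image exactly max_age_days old still counts as compliant.
    if now - image_created_at <= timedelta(days=max_age_days):
        return Result.COMPLIANT
    return Result.NON_COMPLIANT


def check_slow_sql(slow_queries_24h: int, qps: float,
                   has_account_password: bool) -> Result:
    """Slow SQL rule: more than 100 slow queries within the last 24 hours."""
    # Instances with no account password or QPS below 50 are not applicable.
    if not has_account_password or qps < 50:
        return Result.NOT_APPLICABLE
    # The table calls this state "at risk"; it is mapped to NON_COMPLIANT here.
    if slow_queries_24h > 100:
        return Result.NON_COMPLIANT
    return Result.COMPLIANT


if __name__ == "__main__":
    now = datetime(2024, 1, 31)
    idle = Eip(created_at=datetime(2024, 1, 1), bound_instance_id=None)
    print(check_idle_eip(idle, now).value)
    print(check_image_age(datetime(2023, 1, 1), now).value)
    print(check_slow_sql(slow_queries_24h=500, qps=10, has_account_password=True).value)
```

Returning a three-valued result rather than a boolean mirrors the table, where some resources (non-running ECS instances, buckets without a bucket policy, low-QPS database instances) are excluded from evaluation instead of being marked compliant.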
Lens check results
Agentic Cloud Governance Center runs daily checks across all lenses. View results and follow the remediation guidance to address risks.
- Log in to the Agentic Cloud Governance Center console.
- In the left-side navigation pane, choose .
- In the top navigation bar, switch to any lens to view its check results.
  The following example uses the Machine Learning lens.
  The results page displays a summary of total check items, high-risk items, medium-risk items, and recommendations, along with a donut chart. You can switch to the standard view, filter by category (Security, Stability, or All), and add filter tags such as risk status. The main section lists check items with their name, scenario, risk level, affected resources, compliance rate, and actions.
  Note: Click Re-detect to manually obtain new check data for the lens.
- Click a risk item to view check details and remediation guidance in the detection details panel.