Network Intelligence Service (NIS) provides a rich library of cloud network diagnostics that cover stability, security, performance, cost optimization, and operational excellence. The network inspection feature offers observability into your cloud network architecture, helping you accurately detect anomalies and receive actionable optimization suggestions.
Use cases
When you build and maintain large-scale networks, you may not be familiar with all cloud product best practices, which can lead to suboptimal configurations. As your network grows, manually verifying the configuration and usage of every resource becomes impractical. The network inspection feature automates this process by examining your entire network architecture and providing actionable optimization suggestions.
Inspection items
|
Resource type |
Inspection category |
Inspection item |
Description |
Risk description |
Risk level |
Optimization suggestion |
|
EIP |
Network stability |
EIP bandwidth usage check |
Checks the bandwidth usage of an Elastic IP Address (EIP) during the inspection period. It counts the frequency of high bandwidth utilization or packet loss due to bandwidth limits. This helps you assess whether your current bandwidth meets business requirements and identify network risks caused by insufficient bandwidth. |
An alert for imminent public bandwidth overage was triggered during the last inspection period. |
Medium |
Increase the bandwidth of the EIP. For more information, see Modify the bandwidth of a subscription EIP or Modify the bandwidth of a pay-as-you-go EIP. |
|
A packet loss alert was triggered during the last inspection period because the public bandwidth limit was exceeded. |
High |
Increase the bandwidth of the EIP. For more information, see Modify the bandwidth of a subscription EIP or Modify the bandwidth of a pay-as-you-go EIP. |
||||
|
EIP running status check |
Checks whether any EIPs are in an abnormal state. |
The EIP is in a disabled or inactive state. |
Low |
Check whether the EIP instance is in a transitional or other abnormal state. |
||
|
Network cost optimization |
Idle EIP check |
Checks for idle EIPs. |
The EIP is not associated with any resource. |
Low |
This EIP is not associated with any resource but still incurs charges. To save costs, release the EIP instance if it is no longer needed. |
|
|
NAT |
Network stability |
NAT Gateway processing usage check |
Checks the processing usage of the NAT Gateway during the inspection period. It identifies overuse of concurrent connections, new connections, traffic processing rates, and SNAT source ports. This helps you assess whether the current resource configuration meets your business needs and identify network risks caused by insufficient capacity. |
The NAT Gateway dropped connections during the last inspection period because its session limit was exceeded. |
Medium |
Upgrade the NAT Gateway instance or change its billing method to pay-as-you-go. For more information, see the following topics: |
|
The NAT Gateway exceeded its new connection limit, triggering an alert for dropped connections during the last inspection period. |
High |
Reallocate the traffic that flows through the NAT Gateway instance or change its billing method to pay-as-you-go to increase traffic processing capacity. For more information, see the following topics: |
||||
|
Failed SNAT source port allocation triggered an alert during the last inspection period. |
High |
Add more EIPs to the SNAT IP address pool. For more information, see Internet NAT gateway. |
||||
|
CEN |
Network stability |
Inter-region bandwidth usage check |
Checks the bandwidth usage of the Cloud Enterprise Network (CEN) inter-region connection during the inspection period. It counts the frequency of high bandwidth utilization or packet loss due to bandwidth limits. This helps you assess whether the current resource bandwidth meets your business requirements and identify network risks caused by insufficient bandwidth. |
The inter-region connection exceeded its bandwidth limit, triggering a packet loss alert during the last inspection period. |
High |
Increase the inter-region connection bandwidth. |
|
Rate limiting in the traffic scheduling queue of the inter-region connection dropped packets. |
High |
Increase the inter-region connection bandwidth or adjust the traffic scheduling settings for the inter-region connection. |
||||
|
Transit router connection high availability check |
Checks whether network instances connected to a transit router (TR) lack redundancy, which can cause service interruptions during a failure. As a best practice, configure redundant links for the transit router after you connect network instances to it. |
A VPC connects to a transit router using resources in only a single availability zone. A failure in this availability zone can cause service interruptions because traffic cannot fail over. |
High |
To ensure high availability, configure redundant links after connecting a VPC to a transit router. When creating a VPC connection, specify a vSwitch in each availability zone that is supported by the Enterprise Edition transit router. This provides availability zone-level disaster recovery for the VPC connection and reduces traffic detours. |
||
|
Transit router route configuration risk check |
Checks for risks in the current transit router routing configuration and provides optimization suggestions. |
The number of route entries in the route table of a Basic Edition transit router has reached 80% of its quota. If the quota is exceeded, you cannot add new routes to the route table, which may cause network disconnections. |
Medium |
Upgrade to an Enterprise Edition transit router. An Enterprise Edition transit router supports a quota of 40,000 route entries and provides a rich set of features, such as custom route tables and flow logs. |
||
|
VPC-to-TR connection route risk check |
Checks for route conflicts and risks when a VPC is connected to a transit router and provides configuration optimization suggestions. |
The private CIDR blocks of VPCs connected to the same CEN instance overlap, which may cause route conflicts in the CEN instance. |
Medium |
Plan your VPC CIDR blocks to ensure that VPCs and vSwitches attached to the same CEN instance use non-overlapping CIDR blocks. |
||
|
VPC connection bandwidth usage check |
Checks the bandwidth usage of the CEN VPC connection during the inspection period. It counts the frequency of packet loss due to bandwidth limits. This helps you assess whether the current resource bandwidth meets your business requirements and identify network risks caused by insufficient bandwidth. |
The VPC connection exceeded its bandwidth limit, triggering a packet loss alert during the last inspection period. |
High |
Enable the flow log feature for your VPC connection and use flow logs to analyze whether service traffic distribution meets expectations. |
||
|
VPN |
Network stability |
VPN Gateway usage limit check |
Checks the usage of the VPN Gateway service during the inspection period. It counts how often the bandwidth limit and the BGP dynamic route propagation limit are exceeded. This helps you assess the health of the VPN Gateway service and identify network risks caused by insufficient resource configurations. |
The BGP dynamic route quota was exceeded, triggering a risk alert during the last inspection period. |
High |
Monitor the BGP route count. If the quota is exceeded, perform route aggregation on the peer VPN device based on your network plan. |
|
The VPN Gateway bandwidth limit was exceeded, triggering a risk alert during the last inspection period. |
Medium |
Check whether the bandwidth of the instance on this link meets your business requirements. If the bandwidth is insufficient, upgrade the VPN Gateway bandwidth or purchase a new instance to expand the bandwidth. Otherwise, you can ignore this alert. |
||||
|
VPN Gateway redundancy check |
Checks the VPN Gateway redundancy configuration. |
One of the two tunnels of the IPsec-VPN connection failed to establish, making the availability zone-level high availability feature ineffective. |
High |
Establish an IPsec-VPN connection between all tunnels of the instance and the peer device to restore availability zone-level high availability. For more information, see IPsec-VPN connection (associated with a VPN gateway). |
||
|
The VPN Gateway instance is deployed in a single availability zone, which poses a significant disaster recovery risk. |
High |
Enable dual-tunnel mode for the VPN Gateway instance and ensure both tunnels are active. |
||||
|
ALB |
Network stability |
Application Load Balancer VIP processing usage check |
Checks the load on the Application Load Balancer (ALB) virtual IP (VIP) address during the inspection period. This includes the load on sessions, connections, Queries Per Second (QPS), and bandwidth. This helps you assess whether the current resource configuration meets your business requirements and identify network risks caused by insufficient resources. |
The ALB instance exceeded its session limit, triggering an alert for dropped new connections during the last inspection period. |
High |
A single VIP address resolved from an ALB domain name has a new connection limit. We recommend that you use ALB instances by configuring CNAME records instead of using VIP addresses. For more information, see Configure a CNAME record for an ALB instance. |
|
The ALB instance exceeded its QPS limit, triggering an alert during the last inspection period. |
High |
A single VIP address resolved from an ALB domain name has a QPS limit. We recommend that you use ALB instances by configuring CNAME records instead of using VIP addresses. For more information, see Configure a CNAME record for an ALB instance. |
||||
|
The ALB instance exceeded its private network bandwidth limit, triggering a packet loss alert during the last inspection period. |
High |
A single VIP address resolved from an ALB domain name has a bandwidth limit. We recommend that you use ALB instances by configuring CNAME records instead of using VIP addresses. For more information, see Configure a CNAME record for an ALB instance. |
||||
|
ALB high availability deployment check |
Checks whether the backend servers of an ALB listener are deployed in multiple availability zones to ensure the high availability of the listener service. |
The backend servers for an ALB listener's default server group are in a single availability zone. |
Medium |
The ALB listener's current architecture has an availability zone-level risk: the service will become unavailable if that zone fails. Deploy the backend servers for the listener and forwarding rules across at least two availability zones to reduce the blast radius of failures. If you need to migrate servers across availability zones, see the migration guide. |
||
|
NLB |
Network stability |
Network Load Balancer VIP processing usage check |
Checks the load on the Network Load Balancer (NLB) VIP during the inspection period. This includes the load on new connections and concurrent connections. This helps you assess whether the current resource configuration meets your business requirements and identify network risks caused by insufficient resources. |
A sudden increase in failed NLB connections triggered an alert during the last inspection period. |
High |
Possible causes:
|
|
An alert for dropped new NLB connections was triggered during the last inspection period. |
High |
Possible causes:
|
||||
|
The NLB instance exceeded its new connection limit, triggering an alert during the last inspection period. |
High |
The auto-scaling limit of a single NLB VIP has been exceeded, causing new connection requests to be continuously dropped. Distribute the workload across multiple NLB instances or contact your account manager to request a quota increase. |
||||
|
The NLB instance exceeded its concurrent connection limit, triggering an alert during the last inspection period. |
High |
The auto-scaling limit of a single NLB VIP has been exceeded, causing new connection requests to be continuously dropped. Distribute the workload across multiple NLB instances or contact your account manager to request a quota increase. |
||||
|
NLB high availability deployment check |
Checks whether the backend servers of an NLB listener are deployed in multiple availability zones to ensure the high availability of the listener service. |
The backend servers of an NLB listener are in a single availability zone. |
Medium |
The NLB listener's current architecture has an availability zone-level risk: the service will become unavailable if that zone fails. Deploy the backend servers for the listener across at least two availability zones to reduce the blast radius of failures. If you need to migrate servers across availability zones, see the migration guide. |
||
|
CLB |
Network stability |
Classic Load Balancer processing usage check |
Checks the load on the Classic Load Balancer (CLB) instance during the inspection period. This includes the load on sessions, connections, and bandwidth. This helps you assess whether the current resource configuration meets your business requirements and identify network risks caused by insufficient resources. |
The CLB instance exceeded its bandwidth limit, triggering a packet loss alert during the last inspection period. |
High |
Increase the bandwidth of the CLB instance. For more information, see Change the configurations of a pay-as-you-go CLB instance or Change the configurations of a subscription CLB instance. |
|
The CLB instance exceeded its session limit, triggering an alert for dropped new connections during the last inspection period. |
High |
Upgrade the CLB instance or migrate it to an ALB or NLB instance. For more information, see the following topics:
|
||||
|
A sudden increase in failed CLB connections triggered an alert during the last inspection period. |
High |
This issue is often caused by backend servers that have reached their specification limits, high server load, or service exceptions. Check the status of your backend services. |
||||
|
VBR |
Network stability |
BGP connection status check |
Checks the running status of the BGP connection for the Express Connect circuit during the inspection period. It counts the frequency of BGP connection anomalies. This helps you monitor the quality of the provider's Express Connect circuit and promptly detect stability risks. |
A BGP connection failure occurred during the last inspection period. |
High |
Contact the connectivity provider to check the Express Connect circuit for anomalies. |
|
Express Connect circuit port check |
Checks the running status of the Express Connect circuit port during the inspection period. It counts the frequency of port anomalies. This helps you monitor the quality of the provider's Express Connect circuit and promptly detect stability risks. |
An alert for an Express Connect circuit port or link failure was triggered during the last inspection period. |
High |
Contact the connectivity provider to check the Express Connect circuit for anomalies. |
||
|
VBR static route health configuration check |
Checks whether a health check is configured for the Virtual Border Router (VBR) connection. |
A static route that points to a VBR is configured in CEN, but no corresponding health check is configured. |
High |
After connecting a VBR to a CEN instance, use the health check feature of CEN to monitor the connectivity of the associated Express Connect circuit. In scenarios with redundant routes between CEN and your on-premises data center, traffic automatically fails over to an available route when a health check detects that an Express Connect circuit has failed. This ensures uninterrupted traffic. |
||
|
No health check is configured for the VBR uplink. |
High |
When you connect an on-premises data center to a VPC by using redundant Express Connect circuits, we recommend that you configure health checks on both the on-premises data center and Alibaba Cloud sides to monitor the connectivity of the circuits. If one circuit fails, traffic is automatically routed over the other healthy circuit. |
||||
|
VBR redundancy check |
Checks the completeness of the VBR redundancy configuration to identify stability risks in Express Connect scenarios. |
No redundant connections are configured between the VPC and the VBR. |
Low |
Select a redundancy solution based on your business requirements. For more information, see Connect an on-premises data center to the cloud through a VBR. |
||
|
Redundant connections are not configured for some CIDR blocks between the VPC and the VBR. |
Low |
Check whether there is service traffic in the route CIDR blocks that lack redundancy. If so, configure redundant connections. You can select a redundancy solution based on your business requirements. For more information, see Connect an on-premises data center to the cloud through a VBR. |
||||
|
Redundant connections are not configured for some CIDR blocks between the transit router and the VBR. |
Low |
Check whether there is service traffic in the route CIDR blocks that lack redundancy. If so, configure redundant connections. You can select a redundancy solution based on your business requirements. For more information, see Connect an on-premises data center to the cloud by using an Express Connect Router (ECR). |
||||
|
No redundant connections are configured between the transit router and the VBR. |
Low |
Select a redundancy solution based on your business requirements. For more information, see Connect an on-premises data center to the cloud by using an Express Connect Router (ECR). |
||||
|
PrivateLink |
Network stability |
PrivateLink endpoint high availability deployment check |
PrivateLink provides secure, stable, and private access to services in other VPCs from your VPCs and on-premises data centers.
Note: This check inspects only the availability zone-level high availability risk of the connection between the endpoint and the endpoint service. It cannot determine the availability zone-level risk of the service that is accessed through the endpoint. |
An interface endpoint instance exists in a single availability zone. |
High |
Add a new availability zone to the interface endpoint instance to ensure multi-zone disaster recovery. For more information, see Create and manage endpoint ENIs.
Note: An interface endpoint instance that includes one endpoint ENI in one availability zone is a billable instance. Adding an availability zone increases costs. |
|
PrivateLink endpoint service high availability deployment check |
An endpoint service instance exists in a single availability zone. |
High |
Add service resources to the endpoint service so that it provides services in multiple availability zones. |
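Two of the route-related checks in the table above, overlapping VPC CIDR blocks and route table usage reaching 80% of its quota, can be approximated locally before you attach new VPCs. The following is a minimal sketch and not how NIS implements these checks: the VPC names, CIDR blocks, and the quota value of 100 are hypothetical, and the function names are made up for illustration.

```python
import ipaddress
from itertools import combinations


def find_overlapping_cidrs(vpc_cidrs):
    """Return pairs of VPC names whose CIDR blocks overlap.

    vpc_cidrs maps a (hypothetical) VPC name to its CIDR block string.
    """
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in vpc_cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


def route_quota_at_risk(route_count, quota, threshold=0.8):
    """Return True when route usage has reached the given share of the quota."""
    return route_count >= quota * threshold


# Hypothetical VPCs planned for the same CEN instance.
vpcs = {
    "vpc-app": "10.0.0.0/16",
    "vpc-data": "10.0.128.0/17",  # lies inside vpc-app's range
    "vpc-ops": "192.168.0.0/24",
}
print(find_overlapping_cidrs(vpcs))

# The quota of 100 is an assumed value used only for this example.
print(route_quota_at_risk(route_count=85, quota=100))
```

Running a check like this during CIDR planning can catch conflicts earlier than the weekly inspection, but the inspection report remains the reference for what NIS actually flags.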
View network inspection reports
- By default, Network Intelligence Service provides a free basic network inspection task. This task performs a comprehensive network inspection once a week and generates a report. You cannot create custom network inspection tasks with this basic service.
- Network inspection reports are retained for one year.

- Log on to the NIS console.
- In the left-side navigation pane, click Network Inspection.
- On the Network Inspection page, find the default network inspection task and perform the following operations:
  - View the details of the latest inspection report
    - In the Newest Inspection Report column, click View report for optimization suggestions.
    - The report details page shows the Basic Information, Inspection Summary, and Inspection Details of the report.
      The Inspection Details section shows the abnormal results, optimization suggestions, and affected resources.
  - View the details of historical reports
    - In the Newest Inspection Report column, click View history.
    - On the Historical Reports page, find the target inspection report in the Historical Inspection Reports section. Then, click the report ID or click View Report in the Actions column.
    - The report details page shows the Basic Information, Inspection Summary, and Inspection Details of the report.
      The Inspection Details section shows the abnormal results, optimization suggestions, and affected resources.
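Once you have read the abnormal results in a report, it can help to work through them in order of risk level. The sketch below shows one way to do that, assuming you have copied the findings into your own records. The `Finding` structure, its field names, and the resource IDs are hypothetical and do not reflect any NIS export format.

```python
from dataclasses import dataclass

# Risk levels used in the inspection items table, most severe first.
RISK_ORDER = {"High": 0, "Medium": 1, "Low": 2}


@dataclass
class Finding:
    resource_id: str  # hypothetical resource identifier
    item: str         # inspection item name from the report
    risk_level: str   # "High", "Medium", or "Low"
    suggestion: str   # optimization suggestion from the report


def triage(findings):
    """Sort findings so that the highest-risk ones come first."""
    return sorted(findings, key=lambda f: (RISK_ORDER[f.risk_level], f.resource_id))


findings = [
    Finding("eip-example-1", "Idle EIP check", "Low",
            "Release the EIP instance if it is no longer needed."),
    Finding("nat-example-1", "NAT Gateway processing usage check", "High",
            "Add more EIPs to the SNAT IP address pool."),
    Finding("alb-example-1", "ALB high availability deployment check", "Medium",
            "Deploy backend servers across at least two availability zones."),
]

for finding in triage(findings):
    print(f"[{finding.risk_level}] {finding.resource_id}: {finding.suggestion}")
```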
Manage network inspection tasks
- Rerun a network inspection task
  If your resources have changed, you can rerun the network inspection task to check the current status of your resources. Before you start, make sure the inspection task is in the Enabled state.
  On the Network Inspection page, find the target network inspection task. In the Newest Inspection Report column, click View report for optimization suggestions. On the report details page, click Re-start Inspection in the upper-right corner.
- Disable or enable a network inspection task
  On the Network Inspection page, find the default network inspection task and click Stop Inspection or Start Inspection in the Actions column.
- Delete a network inspection task
  You must disable an inspection task before you can delete it. Deleting a network inspection task also deletes all of its reports.
  On the Network Inspection page, find the default network inspection task and click Delete in the Actions column.
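The ordering rules for these operations (a task must be enabled before it can be rerun, and disabled before it can be deleted, at which point its reports are removed as well) can be summarized as a small state model. This is only an illustrative sketch: the `InspectionTask` class and its methods are hypothetical and are not part of any NIS API or SDK.

```python
class InspectionTask:
    """Hypothetical model of an inspection task's lifecycle rules."""

    def __init__(self):
        self.enabled = True   # assumed initial state for this sketch
        self.deleted = False
        self.reports = []

    def rerun(self):
        # A task must be in the Enabled state before it can be rerun.
        if self.deleted:
            raise RuntimeError("The task has been deleted.")
        if not self.enabled:
            raise RuntimeError("Enable the task before rerunning it.")
        self.reports.append(f"report-{len(self.reports) + 1}")
        return self.reports[-1]

    def stop(self):
        self.enabled = False

    def start(self):
        self.enabled = True

    def delete(self):
        # A task must be disabled before it can be deleted.
        if self.enabled:
            raise RuntimeError("Disable the task before deleting it.")
        self.reports.clear()  # deleting a task also deletes its reports
        self.deleted = True


task = InspectionTask()
print(task.rerun())   # allowed: the task is enabled
task.stop()
task.delete()         # allowed: the task was disabled first
print(task.reports)   # the reports are gone together with the task
```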