We recommend enabling ALB access logs to quickly troubleshoot HTTP errors from ALB. First, compare the ALB status code (in the status field) with the backend status code (in the upstream_status field) in the access log. If the values are the same, ALB likely passed through the status code from the backend server. In this case, you should prioritize troubleshooting the backend service.
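Once the logs are exported, this comparison can be scripted. The following is a minimal sketch, assuming each log entry has already been parsed into a dictionary keyed by the field names above. The helper name `origin_of_status` and the sample entries are hypothetical and are not part of any ALB tooling.

```python
def origin_of_status(entry):
    """Guess where an error status came from, based on one access log entry."""
    status = str(entry.get("status", "")).strip()
    upstream = str(entry.get("upstream_status", "")).strip()
    if upstream in ("", "-"):
        # No backend response was recorded for this request.
        return "no backend response"
    if status == upstream:
        # Same code on both sides: ALB most likely passed it through.
        return "backend"
    return "mismatch"


# Hypothetical sample entries, for illustration only.
entries = [
    {"status": "502", "upstream_status": "502"},
    {"status": "502", "upstream_status": "-"},
    {"status": "502", "upstream_status": "504"},
]
for e in entries:
    print(e["status"], e["upstream_status"], origin_of_status(e))
```

Entries labeled `backend` point to the backend service; the other two labels call for the status-specific checks in the sections that follow.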
Five quick checks
Before troubleshooting by status code, perform these five checks first. They address the most common root causes of ALB failures.
-
Run the ALB instance diagnostic tool. On the Instances page, find the target instance. In the Instance Diagnostics column, click Diagnose to run a one-click check for common issues with instance configurations, listeners, and backend services.
-
Verify that the Health Check Status for the listener is Healthy. On the Listener Details page, check the Health Check Status of the backend server group. If the status is Unhealthy, see Troubleshoot ALB health check failures.
-
Ensure the backend ECS does not block the CIDR block of the VSwitch where the ALB instance resides. ALB communicates with backend servers using a Local IP assigned by its VSwitch. If iptables rules or third-party security software on a backend ECS instance blocks the VSwitch's CIDR block, ALB cannot reach the backend, which can trigger errors such as 502 or 504.
-
Confirm that the port configured for a backend server in the server group matches the port that the backend service actually listens on. The port configured for each backend server in an ALB server group must match the port that the application process listens on. For example, if the port is configured as 8080 in the server group but the backend service is listening on port 80, the connection will fail. You can run ss -tlnp | grep ':<port> ' or netstat -tlnp | grep ':<port> ' on the backend ECS to verify.
-
For HTTPS listeners, verify that the certificate has not expired and that its domain matches the access domain. On the Listener Details page, check the bound certificate and its expiration date. An expired certificate or a domain mismatch can cause SSL handshake failures or return an error status code.
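The port and certificate checks above can also be scripted. The sketch below is one possible approach and not an official tool: `port_accepts_connections` attempts a plain TCP connection, and `days_until_expiry` converts a certificate's notAfter string (in the format printed by OpenSSL-style tools) into remaining days. Both function names are hypothetical. A connection test run from your own machine does not prove that ALB can reach the backend, because ALB connects from its VSwitch addresses.

```python
import socket
import ssl
import time


def port_accepts_connections(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def days_until_expiry(not_after, now=None):
    """Days left before a notAfter date such as 'Jan  5 09:34:43 2018 GMT'."""
    expires = ssl.cert_time_to_seconds(not_after)
    now = time.time() if now is None else now
    return (expires - now) / 86400
```

For example, `port_accepts_connections("127.0.0.1", 8080)` run on the backend ECS shows whether anything is accepting connections on the port configured in the server group, and a negative result from `days_until_expiry` means the certificate has already expired.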
502 Bad Gateway
This error occurs when an HTTP or HTTPS listener receives a client request, but ALB fails to forward the request to a backend server or receive a response from it.
Troubleshooting approach: First, check the value of the upstream_status field in the access log to determine the next steps.
-
If upstream_status = 502: ALB passed through the 502 status code from the backend server. The issue lies with the backend service itself. Investigate your backend service. For example, check whether a backend Nginx or gateway layer is attempting to reverse proxy to an unreachable upstream.
-
If upstream_status is another value (such as 504, 444, or 500): The status that ALB returns to the client differs from upstream_status, which means ALB changed the status code. Investigate why the backend service is returning that specific status code by checking the backend Nginx, gateway, or application logs.
-
If upstream_status is - or empty: ALB did not receive any response from the backend. This means the request either never reached the backend, or the backend connection was abnormally terminated before a response was sent. Check the following causes in order:
-
TCP communication between ALB and the backend server is failing. Verify that the backend service is running, the service port is listening correctly, and no iptables rules or third-party security software on the backend ECS are blocking the CIDR block of the VSwitch where the ALB instance is located. ALB communicates with backend servers by using a Local IP assigned by the VSwitch. You can capture packets to check if the TCP handshake is successful.
-
The backend server's backlog is full. This causes the server to drop new connection requests. Run netstat -s | grep -i listen on the backend server and check whether a drop counter is present and increasing.
-
The backend server failed to process the request in time. Check the backend server's logs and review CPU and memory usage to identify any performance bottlenecks.
-
The packet size of the client request exceeds the MTU of the backend server. This can cause short packets (such as health checks) to succeed while long packets fail. Capture packets on the backend server to analyze whether the packet length is within the required limits.
-
The backend server's response has an invalid format or contains invalid HTTP headers. Capture packets on the backend server to analyze if the response format is standard-compliant.
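For the backlog check in the list above, the counters reported by netstat -s are cumulative since boot, so a single reading says little; what matters is whether they grow between two samples. The sketch below parses two samples and compares them. It assumes the counter wording used by common Linux net-tools builds ("times the listen queue of a socket overflowed", "SYNs to LISTEN sockets dropped"), which may differ on your system; the function names are hypothetical.

```python
import re

# Wording assumed from common net-tools output; adjust if your system differs.
PATTERNS = {
    "listen_overflows": re.compile(r"(\d+) times the listen queue of a socket overflowed"),
    "listen_drops": re.compile(r"(\d+) SYNs to LISTEN sockets dropped"),
}


def listen_counters(netstat_output):
    """Extract the listen-queue counters from `netstat -s` text (0 if absent)."""
    counters = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(netstat_output)
        counters[name] = int(match.group(1)) if match else 0
    return counters


def backlog_is_dropping(before, after):
    """True if any counter increased between two samples."""
    return any(after[name] > before[name] for name in before)
```

Take one sample, wait while the 502 errors are occurring, take a second sample, and pass both parsed results to `backlog_is_dropping`.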
400 Bad Request
The request format is invalid.
-
The backend returns 400 directly: Check the access log. If upstream_status is 400, ALB likely passed through the status code from the backend. Investigate the backend service.
-
An HTTP request is sent to an HTTPS listener: An ALB HTTPS listener rejects non-HTTPS requests and returns a 400 status code. Check if the client is incorrectly sending an HTTP request to an HTTPS port.
-
The request header size exceeds the limit: ALB requires each HTTP request header to be no larger than 32 KB. If this limit is exceeded, ALB returns a 400 status code. Reduce the request header size.
-
The client did not send the full request: The client closed the connection before sending the complete HTTP request. Capture packets on the client to identify the cause.
-
The request header format is invalid: For example, the value of Content-Length does not match the actual length of the request body. Capture packets on the client, analyze the format of the HTTP request, and compare it with a valid request.
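When analyzing a captured request, the Content-Length comparison mentioned above can be automated. The following is a minimal sketch, assuming you have the raw bytes of a single reassembled HTTP/1.1 request from a packet capture (no chunked transfer encoding). The function name `content_length_mismatch` is hypothetical.

```python
def content_length_mismatch(raw_request: bytes):
    """Return (declared, actual) if Content-Length disagrees with the body, else None."""
    head, separator, body = raw_request.partition(b"\r\n\r\n")
    if not separator:
        # The blank line that ends the headers never arrived.
        raise ValueError("incomplete request: header terminator not found")
    declared = None
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            declared = int(value.strip())
    if declared is None or declared == len(body):
        return None
    return declared, len(body)
```

A `ValueError` here corresponds to the case where the client did not send the full request, while a returned tuple shows the declared and actual body lengths to compare against a known-good request.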
405 Method Not Allowed
The request method is not supported.
-
ALB restriction: ALB does not support the TRACE request method. Use a different method.
-
Backend service restriction: Except for TRACE, whether other request methods are supported depends on the backend server. To verify, run curl -X METHOD http://<backend_service_IP>:<service_port>, where METHOD is the request method used by the client.
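If you need to test several methods against the backend, the curl check above can be wrapped in a small script. This is a sketch using only the Python standard library; `probe_method` is a hypothetical helper, and the host and port you pass in are your own backend values.

```python
import http.client


def probe_method(host, port, method, path="/", timeout=5.0):
    """Send one request with the given method directly to the backend; return the status code."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request(method, path)
        return conn.getresponse().status
    finally:
        conn.close()
```

For example, `probe_method("<backend_service_IP>", 8080, "DELETE")` returning 405 while the same method fails through ALB suggests the restriction is on the backend rather than on ALB.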
408 Request Timeout
The request timed out, and ALB closed the connection.
-
Slow client data transmission: Within the client-to-ALB request timeout period (default: 60s), the client sent only partial data, such as the HTTP header but not the HTTP body. Capture packets on the client to check for performance bottlenecks or other issues.
-
Poor network quality between the client and ALB: The TCP Round Trip Time (RTT) is high, or other network issues such as packet loss exist. We recommend that you check the request_time and tcpinfo_rtt fields in the access log, or run network diagnostics on the client.
-
ALB instance bandwidth throttling: High traffic to the ALB instance has triggered bandwidth throttling and packet loss. Check the outbound bandwidth and Dropped Connections metrics in Cloud Monitor.
414 URI Too Long
The length of the request URI exceeds the limit, and ALB or the backend server has rejected the request.
-
ALB restriction: ALB requires that a request URI be no longer than 32 KB. Otherwise, a 414 status code is returned. Shorten the URI. To transmit large amounts of data, you can use the POST method and place the data in the request body. ALB supports a POST request body of up to 50 GB.
-
Backend service restriction: If the URI length does not exceed the ALB limit but the backend service has a stricter limit, ALB passes through the 414 status code returned by the backend. Investigate the backend service.
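On the client side, the advice to move data from the URI into a POST body can be sketched as follows. This is illustrative only: it assumes the limit applies to the request target (path plus query string), and it only works if the backend accepts the same parameters as a form-encoded POST body. The function name `split_long_request` is hypothetical.

```python
from urllib.parse import urlsplit

ALB_MAX_URI_BYTES = 32 * 1024  # 32 KB limit stated above


def split_long_request(url):
    """Return (method, url, body); move the query string into a POST body if the URI is too long."""
    parts = urlsplit(url)
    uri = parts.path + ("?" + parts.query if parts.query else "")
    if len(uri.encode("utf-8")) <= ALB_MAX_URI_BYTES:
        return "GET", url, None
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    # Send the body with Content-Type: application/x-www-form-urlencoded.
    return "POST", base, parts.query.encode("utf-8")
```

If the backend enforces a stricter URI limit than ALB, lower `ALB_MAX_URI_BYTES` to the backend's value.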
463
The 463 status code is returned only when the listener is associated with an IP-type server group.
A loop exists in the request path. When a request passes through ALB, the system appends an ALICLOUD-ALB-TRACE field to the HTTP Header. The field value is a 16-character hash generated from the rule ID. If duplicate rule IDs are detected, or if the number of ALICLOUD-ALB-TRACE fields exceeds 16, ALB identifies a loop. ALB then stops forwarding the request to prevent a network storm and returns a 463 status code.
-
Backend service misconfiguration: The backend service is misconfigured, causing it to send requests back to ALB in a loop. Check your ALB's backend service configuration.
-
Network architecture flaw: For example, multiple load balancing instances exist in the forwarding path of a single request. We recommend that you optimize the network architecture.
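To check a captured request for signs of a loop, the rule described above can be expressed in a few lines. This sketch mirrors the documented rule only and is not ALB's actual implementation; `looks_like_loop` is a hypothetical helper, and `headers` is assumed to be a list of (name, value) pairs taken from a request captured on the backend.

```python
TRACE_HEADER = "alicloud-alb-trace"
MAX_TRACE_FIELDS = 16


def looks_like_loop(headers):
    """headers: list of (name, value) pairs in the order they appear in the request."""
    traces = [value.strip() for name, value in headers
              if name.strip().lower() == TRACE_HEADER]
    has_duplicate = len(set(traces)) != len(traces)
    return has_duplicate or len(traces) > MAX_TRACE_FIELDS
```

A request that already carries several ALICLOUD-ALB-TRACE fields when it reaches the backend, even if it does not yet trip this rule, is a hint that it is passing through more load balancing hops than intended.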
499 Client Closed Request
The client actively closed the connection.
-
Poor network quality between the client and ALB: The TCP RTT is high, or other network issues such as packet loss exist. We recommend that you check the request_time and tcpinfo_rtt fields in the access log, or run network diagnostics on the client.
-
ALB instance bandwidth throttling: High traffic to the ALB instance has triggered bandwidth throttling and packet loss. Check the outbound bandwidth and Dropped Connections metrics in Cloud Monitor.
-
Long backend processing time: The backend processing time exceeded the client's timeout period. Check the upstream_response_time field in the access log, which indicates the backend processing time. If this value is consistently high, investigate the backend service for performance bottlenecks.
-
The client request timeout is too short: The client closed the connection due to a timeout before it finished sending the request. Check the request_time field in the access log, which indicates the total request time. Use this value as a reference to set a more appropriate client-side request timeout.
-
The client encountered an unknown issue: The client closed the connection before the request was completed. Investigate the client for behavior that could cause premature connection closure.
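When many 499 entries need sorting, a rough first pass over the log fields named above can help decide where to look. The sketch below is a heuristic, not a diagnosis: the thresholds are arbitrary placeholders, the units (seconds for upstream_response_time, microseconds for tcpinfo_rtt) are assumptions to verify against your log field reference, and `hint_for_499` is a hypothetical helper.

```python
def to_float(value):
    """Parse a numeric log field; '-' or a missing value means 'not recorded'."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def hint_for_499(entry, slow_backend_s=3.0, high_rtt_us=200_000):
    """Suggest which cause to investigate first for one 499 log entry."""
    upstream = to_float(entry.get("upstream_response_time"))
    rtt = to_float(entry.get("tcpinfo_rtt"))
    if upstream is not None and upstream >= slow_backend_s:
        return "backend slow"
    if rtt is not None and rtt >= high_rtt_us:
        return "high client RTT"
    return "inspect client"
```

Set `slow_backend_s` close to the timeout that your clients actually use, so that entries flagged as `backend slow` are those where the backend plausibly outlasted the client.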
500 Internal Server Error
The backend server encountered an internal error and could not process the request.
-
The backend returns 500 directly: Check the access log. If upstream_status is 500, ALB likely passed through the status code from the backend. Investigate the backend service.
-
The backend server closed the connection unexpectedly: The backend server closed the connection before sending a complete response. Capture packets on the backend server to identify the cause of the unexpected connection closure.
503 Service Temporarily Unavailable
The server is temporarily unavailable, typically due to traffic exceeding limits or an unavailable backend service.
-
The backend returns 503 directly: Check the access log. If upstream_status is 503, ALB likely passed through the status code from the backend. Investigate the backend service.
-
The client request triggers ALB throttling:
-
In Cloud Monitor, check the Requests per second metric.
-
Cloud Monitor displays minute-level data and may not reflect second-level spikes. Check the access log. If the upstream_status field is -, the request did not reach the backend server.
-
Check the response packet header. If it contains the ALB-QPS-Limited: Limited field, the request triggered ALB throttling.
-
Direct IP access or abnormal DNS resolution: This can concentrate traffic on only a few IP addresses and trigger throttling. Access ALB through its domain name (see Configure a CNAME for an ALB instance) and verify that DNS resolution works as expected.
-
The listener has no configured backend servers, or the configured backend servers have a weight of 0.
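The 503 checks above can be combined into one small triage step. This is a sketch under stated assumptions: `upstream_status` comes from the access log entry, `response_headers` is a dictionary of the headers the client received, and `triage_503` is a hypothetical helper rather than part of any ALB SDK.

```python
def triage_503(upstream_status, response_headers):
    """Classify one 503 using the access log field and the response headers."""
    headers = {k.lower(): str(v).strip().lower() for k, v in response_headers.items()}
    if headers.get("alb-qps-limited") == "limited":
        return "throttled by ALB"
    upstream = str(upstream_status).strip()
    if upstream == "503":
        return "returned by backend"
    if upstream in ("", "-"):
        # Never reached a backend: check throttling, server group membership, and weights.
        return "not forwarded to backend"
    return "other"
```

A result of `not forwarded to backend` without the throttling header is a cue to check whether the server group is empty or all of its servers have a weight of 0.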
504 Gateway Timeout
ALB timed out while waiting for a response from the backend server.
-
The backend returns 504 directly: Check the access log. If upstream_status is 504, ALB likely passed through the status code from the backend. Investigate the backend service.
-
The connection attempt from ALB to the backend server times out: This timeout is 5 seconds by default and cannot be changed. Capture packets to identify why the backend server is not responding in time.
-
Backend response timeout: The connection request timeout, which limits how long ALB waits for the backend server to respond, is 60 seconds by default. You can check the UpstreamResponseTime metric in Cloud Monitor and the upstream_response_time field in the access log to determine if the backend server's response timed out.
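To see how many requests come close to the response timeout, you can scan exported log entries for large upstream_response_time values. The sketch below assumes the field is recorded in seconds and that entries are already parsed into dictionaries; `near_timeout` and its `share` parameter are hypothetical.

```python
def near_timeout(entries, timeout_s=60.0, share=0.9):
    """Return the entries whose backend time reached at least `share` of the timeout."""
    threshold = timeout_s * share
    flagged = []
    for entry in entries:
        try:
            elapsed = float(entry.get("upstream_response_time"))
        except (TypeError, ValueError):
            continue  # '-' or missing: no backend timing was recorded
        if elapsed >= threshold:
            flagged.append(entry)
    return flagged
```

If the listener uses a timeout other than the default, pass that value as `timeout_s`. A steady stream of flagged entries suggests that the backend is regularly close to timing out, even for requests that did not return 504.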