Troubleshoot service issues - Container Service for Kubernetes (ACK) - Alibaba Cloud Help Center

Prerequisites

The Cloud Controller Manager (CCM) component is V1.9.3.276-g372aa98-aliyun or later. See the CCM upgrade instructions and release notes.

Diagnostic process

Identify the source of a LoadBalancer service issue.

Identify the service associated with the CLB instance. Replace {XXX.XXX.XXX.XXX}, including the braces, with the load balancer IP address.
```
kubectl get svc -A | grep -i LoadBalancer | grep {XXX.XXX.XXX.XXX}
```
A healthy service shows output similar to:
```
default   my-svc   LoadBalancer   10.x.x.x   XXX.XXX.XXX.XXX   80:32xxx/TCP   5d
```
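Because `grep` matches substrings and treats `.` as a wildcard, the lookup above can also return services whose address merely contains the one you searched for. The following is a minimal sketch of a stricter lookup; the function name `find_lb_service` is a hypothetical helper, and it assumes the default `kubectl get svc -A` column order (NAMESPACE, NAME, TYPE, CLUSTER-IP, EXTERNAL-IP) and a single address in the EXTERNAL-IP column.

```shell
# Hypothetical helper: print namespace/name of every LoadBalancer service
# whose EXTERNAL-IP column is exactly the given address.
find_lb_service() {
  ip="$1"
  kubectl get svc -A --no-headers |
    awk -v ip="$ip" '$3 == "LoadBalancer" && $5 == ip { print $1 "/" $2 }'
}

# Usage: find_lb_service XXX.XXX.XXX.XXX
```

A service that lists several comma-separated external addresses would not match this exact comparison, so fall back to the `grep` form in that case.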
Run the following command to check whether the service has error events.
```
kubectl -n {your-namespace} describe svc {your-svc-name}
```
Check the Events section at the bottom. Error output example:
```
Events:
  Type     Reason                  Age   From                Message
  ----     ------                  ---   ----                -------
  Warning  SyncLoadBalancerFailed  2m    service-controller  <error message here>
```
- If error events exist, match the error message in Service error events and solutions.
- If no error events exist, use the symptom-based guide in Troubleshooting methods.
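If the `describe` output is long, it may be easier to list only the warning events for the service. The sketch below assumes the standard event field selectors (`involvedObject.kind`, `involvedObject.name`, `type`); `svc_warning_events` is a hypothetical helper name, not part of kubectl or CCM.

```shell
# Hypothetical helper: list only Warning events recorded for one service,
# oldest first.
svc_warning_events() {
  ns="$1"
  name="$2"
  kubectl -n "$ns" get events \
    --field-selector "involvedObject.kind=Service,involvedObject.name=${name},type=Warning" \
    --sort-by=.lastTimestamp
}

# Usage: svc_warning_events {your-namespace} {your-svc-name}
```

Events expire after a retention period, so an empty result does not prove that the service never reported an error.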

Service error events and solutions

Run kubectl -n {your-namespace} describe svc {your-svc-name} and match the error message in the Events section to the table below.

Error message	Cause	Solution
`The backend server number has reached to the quota limit of this load balancers`	The CLB instance has reached the 200-backend-server quota limit.	Do one of the following: <br>1. Request a quota increase on the SLB Quota Management page. <br>2. Set `externalTrafficPolicy: Local` to reduce the backend count. In Cluster mode, add the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-backend-label` annotation to limit the backend nodes. <br>3. Create a new CLB instance.
`The loadbalancer does not support backend servers of eni type`	Shared CLB instances do not support Elastic Network Interface (ENI) backends.	Add the annotation `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-spec: "slb.s1.small"` to use a high-performance CLB instance. Verify CCM version compatibility. See Use annotations to configure a Classic Load Balancer (CLB) instance.
`There are no available nodes for LoadBalancer`	The CLB instance has no backend servers.	Check the pod status: <br>- If no pod matches the service, add one. <br>- If the pod is unhealthy, resolve the issue. See Troubleshoot pod issues. <br>- If the pod runs but is not a backend, check if it is on a master node and move it to a worker node.
`alicloud: not able to find loadbalancer named [%s] in openapi, but it's defined in service.loaderbalancer.ingress...` or `alicloud: can not find loadbalancer, but it's defined in service`	The CLB instance referenced by the service cannot be found.	Search for the CLB instance in the Server Load Balancer console using the service's `EXTERNAL-IP`. <br>- If the CLB no longer exists and the service is unneeded, delete it. <br>- If the CLB exists and was created manually, add the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id` annotation. See Use annotations to configure a Classic Load Balancer (CLB) instance. <br>- If the CLB was created by CCM, add the `kubernetes.do.not.delete` label to the CLB instance. See How do I rename an SLB instance if I am using an earlier version of CCM?.
`ORDER.ARREARAGE Message: The account is arrearage.`	The account has an overdue payment.	Settle the overdue payment.
`PAY.INSUFFICIENT_BALANCE Message: Your account does not have enough balance.`	The account balance is insufficient (less than CNY 100).	Top up the account balance.
`Status Code: 400 Code: Throttlingxxx`	The CLB OpenAPI is being throttled.	1. Check your CLB quota on the SLB Quota Management page. <br>2. Check for service errors and resolve them: `kubectl -n {your-namespace} describe svc {your-svc-name}`.
`Status Code: 400 Code: RspoolVipExist Message: there are vips associating with this vServer group.`	The listener linked to the vServer group cannot be deleted.	1. Check whether the service annotation contains a CLB ID: `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id: {your-clb-id}`. If present, the CLB is being reused. <br>2. In the CLB console, delete the listener for the port defined in the service. See Configure listener forwarding rules.
`Status Code: 400 Code: NetworkConflict`	The internal-facing CLB instance is in a different Virtual Private Cloud (VPC) than the cluster.	Move the CLB instance to the same VPC as the cluster, or create a new CLB instance in the correct VPC.
`Status Code: 400 Code: VSwitchAvailableIpNotExist Message: The specified VSwitch has no available ip.`	The vSwitch has no available IP addresses.	Add the annotation `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-vswitch-id: "${YOUR_VSWITCH_ID}"` to specify a different vSwitch in the same VPC.
`Message: The specified VSwitch does not exist.`	The specified vSwitch does not exist.	- If the service uses the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-vswitch-id` annotation, verify that the specified vSwitch exists. <br>- If the service does not use the annotation, verify that the cluster's default vSwitch exists. You can view the Node vSwitch on the Basic Information tab of the default node pool (default-nodepool). See Create and manage node pools. If the default vSwitch does not exist, use the annotation to specify a different vSwitch.
`The specified Port must be between 1 and 65535.`	ENI mode does not support string values for `targetPort`.	Change `targetPort` to an integer in the service YAML, or upgrade CCM. See Upgrade the CCM component.
`Status Code: 400 Code: ShareSlbHaltSales Message: The share instance has been discontinued.`	Older CCM versions create shared CLB instances by default, which are now discontinued.	Upgrade the CCM component.
`can not change ResourceGroupId once created`	The CLB resource group cannot be changed after instance creation.	Remove the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-resource-group-id: "rg-xxxx"` annotation from the service.
`can not find eniid for ip x.x.x.x in vpc vpc-xxxx`	ENI IP not found in the VPC. `service.beta.kubernetes.io/backend-type: eni` is set but the cluster uses Flannel, which does not support ENI mode.	Remove the `service.beta.kubernetes.io/backend-type: eni` annotation from the service.
`The operation is not allowed because the instanceChargeType of loadbalancer is PayByCLCU.` or `User does not have permission modify InstanceChargeType to spec.`	The CLB billing method cannot change from pay-as-you-go (PayByCLCU) to pay-by-specification.	Do one of the following: <br>- Remove the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-spec` annotation. <br>- If the service has the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-instance-charge-type` annotation, set its value to `PayByCLCU`.
`SyncLoadBalancerFailed the loadbalancer xxx can not be reused, can not reuse loadbalancer created by kubernetes.`	The CLB instance was created by CCM and cannot be reused via the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id` annotation.	1. Find the CLB ID in the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id` annotation of the service YAML. <br>2. Resolve based on service status: <br>  - Service is pending: Replace the CLB ID with one created manually in the Classic Load Balancer (CLB) console. <br>  - Service is not pending, CLB IP matches the service EXTERNAL-IP: Delete the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-id` annotation. <br>  - Service is not pending, CLB IP does not match: Find the CLB matching the service EXTERNAL-IP in the console and update the annotation. If no match, use a manually created CLB ID and recreate the service.
`alicloud: can not change LoadBalancer AddressType once created. delete and retry`	The CLB instance type cannot be changed after creation.	Delete the service and recreate it.
`the loadbalancer lb-xxxxx can not be reused, service has been associated with ip [xxx.xxx.xxx.xxx], cannot be bound to ip [xxx.xxx.xxx.xxx]`	The service is bound to a CLB instance and cannot be rebound by changing the annotation.	Delete the service and recreate it with the correct CLB instance ID.
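
When many services report errors, a small script can sort event messages into the groups used in the table above before you look up the exact row. This is only an illustrative sketch: `classify_clb_error` and its category names are hypothetical, and the patterns are plain substrings taken from the messages listed in the table, covering only some of the rows.

```shell
# Hypothetical triage helper: map an event message to a coarse category.
# Each pattern is a substring of a message listed in the table above.
classify_clb_error() {
  case "$1" in
    *"quota limit"*)                  echo "backend-quota" ;;
    *"ORDER.ARREARAGE"*)              echo "billing" ;;
    *"PAY.INSUFFICIENT_BALANCE"*)     echo "billing" ;;
    *"Throttling"*)                   echo "api-throttling" ;;
    *"VSwitch"*)                      echo "vswitch" ;;
    *"can not be reused"*)            echo "clb-reuse" ;;
    *"There are no available nodes"*) echo "no-backends" ;;
    *)                                echo "unknown" ;;
  esac
}

# Usage: classify_clb_error "The specified VSwitch does not exist."
```

An `unknown` result means only that the message is outside this sketch; the table remains the reference for the cause and solution.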

Troubleshooting methods

For issues that do not produce error events, use the following symptom-based guide.

Issue	Symptom	Solution
CLB access issues	Uneven load distribution across backends	Uneven load distribution across CLB backends
	503 error during application updates	503 error during application updates
	CLB inaccessible from within the cluster	CLB inaccessible from within the cluster
	CLB inaccessible from outside the cluster	CLB inaccessible from outside the cluster
	"The plain HTTP request was sent to HTTPS port" error	Cannot connect to the backend HTTPS service
CLB configuration issues	Service annotations do not take effect	What do I do if service annotations do not take effect?
	CLB configuration is unexpectedly modified	Why is the configuration of my CLB instance modified?
	Reusing an existing CLB instance does not take effect	Service FAQ
	No listener configured when reusing an existing CLB instance	Why is no listener configured when I reuse an existing CLB instance?
	Inconsistent CLB backends	What do I do if the SLB vServer group is not updated?
CLB deletion issues	CLB instance is unexpectedly deleted	When is an SLB instance automatically deleted?
	CLB instance is not deleted after the service is deleted	When is an SLB instance automatically deleted?

Uneven load distribution across CLB backends

Cause: CLB scheduling algorithm is not suited to the traffic pattern.

Symptom: Uneven request distribution across backend servers.

Solution:

For services with `externalTrafficPolicy: Local`, add the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-scheduler: "wrr"` annotation to use weighted round-robin scheduling.
For services that use persistent connections, add the `service.beta.kubernetes.io/alibaba-cloud-loadbalancer-scheduler: "wlc"` annotation to use weighted least-connections scheduling. This prevents backends that hold long-lived connections from receiving a disproportionate share of new traffic.
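
The annotation can also be applied to an existing service from the command line. The following is a minimal sketch using `kubectl annotate`; `set_clb_scheduler` is a hypothetical helper that accepts only the two values named in this section.

```shell
# Hypothetical helper: set the CLB scheduling algorithm on a service.
# Only the two values discussed in this section are accepted here.
set_clb_scheduler() {
  ns="$1"
  name="$2"
  algo="$3"
  case "$algo" in
    wrr|wlc) ;;
    *) echo "unsupported scheduler: $algo" >&2; return 1 ;;
  esac
  kubectl -n "$ns" annotate svc "$name" --overwrite \
    "service.beta.kubernetes.io/alibaba-cloud-loadbalancer-scheduler=${algo}"
}

# Usage: set_clb_scheduler {your-namespace} {your-svc-name} wlc
```

After the change, check the service events again to confirm that CCM synchronized the listener without errors.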

To capture container network packets for load distribution analysis, see this Alibaba Cloud Developer community article.

503 error during application updates

Cause: Connection draining or pod graceful termination is not configured. During rolling updates, CLB may route traffic to terminating pods.

Symptom: 503 error when accessing the CLB during an application update.

Solution:

Add the service.beta.kubernetes.io/alibaba-cloud-loadbalancer-connection-drain annotation to enable connection draining. See Common operations to manage listeners.

Configure readinessProbe and preStop on the pod:

readinessProbe: Pods join the CLB backends only after passing the probe. Set the probe frequency, delay, and failure threshold to match your application's startup time. Overly short timeouts make the probes fail repeatedly, which keeps pods out of the backends and, for liveness probes, restarts the containers.
preStop and terminationGracePeriodSeconds: Set preStop to the time your application needs to drain in-flight requests. Set terminationGracePeriodSeconds to at least 30 seconds longer than preStop.

```
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  namespace: default
spec:
  containers:
  - name: nginx
    image: nginx
    # Liveness probe
    livenessProbe:
      failureThreshold: 3
      initialDelaySeconds: 30
      periodSeconds: 30
      successThreshold: 1
      tcpSocket:
        port: 5084
      timeoutSeconds: 1
    # Readiness probe
    readinessProbe:
      failureThreshold: 3
      initialDelaySeconds: 30
      periodSeconds: 30
      successThreshold: 1
      tcpSocket:
        port: 5084
      timeoutSeconds: 1
    # Graceful termination
    lifecycle:
      preStop:
        exec:
          command:
          - sleep
          # Command arguments must be strings, so the value is quoted.
          - "30"
  terminationGracePeriodSeconds: 60
```
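
The relationship between the two timing values can be checked with simple arithmetic before you deploy. The sketch below encodes the rule stated above (terminationGracePeriodSeconds at least 30 seconds longer than preStop); `grace_period_ok` is a hypothetical helper and both arguments are whole seconds.

```shell
# Hypothetical check: succeed only if terminationGracePeriodSeconds is at
# least 30 seconds longer than the preStop duration.
grace_period_ok() {
  prestop="$1"
  grace="$2"
  [ "$grace" -ge $((prestop + 30)) ]
}

# Usage: grace_period_ok 30 60 && echo "grace period is long enough"
```

With the values from the example pod (a 30-second preStop sleep and a 60-second grace period), the check passes exactly at the minimum.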

CLB inaccessible from within the cluster

Cause: externalTrafficPolicy: Local is set on the service. kube-proxy only forwards traffic to pods on the same node as the request origin. If the node has no backend pod for the service, the connection fails. This affects in-cluster traffic routed to the CLB address. See kube-proxy adds external-lb address to node-local iptables rule.

Symptom: CLB is accessible from outside the cluster but connections fail from within.

Solution: Use one of the following approaches:

Access via ClusterIP or service name (recommended for in-cluster access): Use the service's ClusterIP or DNS name instead of the CLB address. For Ingress, the service name is nginx-ingress-lb.kube-system.
Switch to `externalTrafficPolicy: Cluster`: In-cluster traffic reaches the service regardless of pod placement, but client source IP is not preserved. With an Ingress CLB that uses the Local policy, pods can only access Ingress/CLB-exposed services from the node running the Ingress pod. To modify the Ingress service:
```
kubectl edit svc nginx-ingress-lb -n kube-system
```
Use `externalTrafficPolicy: Cluster` with ENI pass-through (Terway only): If your cluster uses Terway with ENIs or multiple IPs per ENI, set externalTrafficPolicy: Cluster and add the service.beta.kubernetes.io/backend-type: "eni" annotation. This preserves source IP and enables in-cluster access. See Use annotations to configure a Classic Load Balancer (CLB) instance.
```
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/backend-type: eni
  labels:
    app: nginx-ingress-lb
  name: nginx-ingress-lb
  namespace: kube-system
spec:
  externalTrafficPolicy: Cluster
```
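
As a non-interactive alternative to `kubectl edit`, the policy can be read and changed with `kubectl get -o jsonpath` and `kubectl patch`. This is a sketch only; the two function names are hypothetical helpers, and switching to Cluster has the source-IP trade-off described above unless ENI pass-through is used.

```shell
# Hypothetical helper: print the current externalTrafficPolicy of a service.
current_traffic_policy() {
  kubectl -n "$1" get svc "$2" -o jsonpath='{.spec.externalTrafficPolicy}'
}

# Hypothetical helper: switch a service to externalTrafficPolicy: Cluster.
set_traffic_policy_cluster() {
  kubectl -n "$1" patch svc "$2" -p '{"spec":{"externalTrafficPolicy":"Cluster"}}'
}

# Usage:
#   current_traffic_policy kube-system nginx-ingress-lb
#   set_traffic_policy_cluster kube-system nginx-ingress-lb
```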

CLB inaccessible from outside the cluster

Cause: An ACL blocks the client IP, the CLB vServer group has no backends, or the health check is failing.

Symptom: The CLB instance cannot be reached from outside the cluster.

Solution:

Check for service error events and resolve them. See Service error events and solutions.
```
kubectl -n {your-namespace} describe svc {your-svc-name}
```
Check whether an ACL is configured on the CLB instance. If so, verify it allows inbound traffic from the client IP. See Resource Access Management.
Check whether the CLB vServer group is empty. If empty, verify a pod is associated with the service and running. If unhealthy, resolve the pod issue first. See Troubleshoot pod issues.
Check whether the CLB listener health check passes. If failing, verify the pod responds correctly. See CLB health check FAQ.
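
An empty vServer group usually corresponds to a service with no ready endpoints, which you can confirm from the cluster side. The following sketch reads the Endpoints object for the service; both function names are hypothetical helpers, and only ready pod addresses appear under `.subsets[*].addresses`.

```shell
# Hypothetical helper: print the ready pod IPs behind a service.
svc_endpoint_ips() {
  kubectl -n "$1" get endpoints "$2" -o jsonpath='{.subsets[*].addresses[*].ip}'
}

# Hypothetical helper: succeed only if the service has a ready endpoint.
svc_has_ready_endpoints() {
  [ -n "$(svc_endpoint_ips "$1" "$2")" ]
}

# Usage:
#   svc_has_ready_endpoints {your-namespace} {your-svc-name} || echo "no ready pods"
```

If no addresses are returned, check the service selector and the pod readiness before looking at the CLB configuration.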

Cannot connect to the backend HTTPS service

Cause: With a certificate on the CLB listener, CLB terminates TLS and forwards HTTP to backends. If targetPort points to an HTTPS port (e.g., 443), the pod rejects the plaintext request with "The plain HTTP request was sent to HTTPS port."

Symptom: Backend connections fail after configuring HTTPS on the CLB listener.

Solution: Set targetPort to the pod's HTTP port. For example, if Nginx serves HTTPS on 443, set targetPort to 80.

```
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-protocol-port: "https:443"
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-cert-id: "${YOUR_CERT_ID}"
  name: nginx
  namespace: default
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  - name: https
    port: 443
    protocol: TCP
    targetPort: 80
  selector:
    run: nginx
  type: LoadBalancer
```
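
To confirm which backend port a listener port maps to on a deployed service, you can query the service spec with a JSONPath filter. This is a minimal sketch; `svc_target_port` is a hypothetical helper, and it assumes the service port is given as a number.

```shell
# Hypothetical helper: print the targetPort configured for one service port.
svc_target_port() {
  kubectl -n "$1" get svc "$2" \
    -o jsonpath="{.spec.ports[?(@.port==$3)].targetPort}"
}

# Usage: svc_target_port default nginx 443
```

For the service above, a result other than the pod's HTTP port for port 443 would indicate the misconfiguration described in this section.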
