This page covers common storage issues in Container Service for Kubernetes (ACK): volume mount failures, how to collect storage logs, and orphaned pod mount targets.
Volume fails to mount
Check whether the required plug-ins are installed and running.
Check whether FlexVolume is installed
Run the following command and verify that one FlexVolume pod is running per cluster node:
kubectl get pod -n kube-system | grep flexvolume
Expected output:
flexvolume-4wh8s 1/1 Running 0 8d
flexvolume-65z49 1/1 Running 0 8d
flexvolume-bpc6s 1/1 Running 0 8d
flexvolume-l8pml 1/1 Running 0 8d
flexvolume-mzkpv 1/1 Running 0 8d
flexvolume-wbfhv 1/1 Running 0 8d
flexvolume-xf5cs 1/1 Running 0 8d
If any pods are not in the Running state, check the FlexVolume plug-in logs to identify the cause.
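On a cluster with many nodes, scanning this listing by eye is error-prone. The following is a minimal sketch of a filter that prints only the pods that are not Running. The function name not_running is hypothetical, and the sketch assumes the default kubectl get pod column order (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# not_running PREFIX
# Reads `kubectl get pod` output on stdin and prints the name of every pod
# whose name starts with PREFIX and whose STATUS column (3rd) is not "Running".
# Assumption: default column order NAME READY STATUS RESTARTS AGE.
not_running() {
  awk -v prefix="$1" 'index($1, prefix) == 1 && $3 != "Running" { print $1 }'
}

# Example against a live cluster (requires a configured kubectl):
#   kubectl get pod -n kube-system --no-headers | not_running flexvolume
```

Empty output means every matching pod is Running. The same filter can be reused for other plug-ins by changing the prefix, for example alicloud-disk.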
Check whether the dynamic provisioning plug-in is installed
Mounting dynamically provisioned disk volumes requires the disk provisioner plug-in. Run the following command to confirm it is running:
kubectl get pod -n kube-system | grep alicloud-disk
Expected output:
alicloud-disk-controller-8679c9fc76-lq6zb 1/1 Running 0 7d
If the pod is not in the Running state, check the plug-in logs to identify the cause.
View storage logs
ACK storage involves three log sources: the FlexVolume plug-in, the disk provisioner plug-in, and kubelet. Collect all three when diagnosing a mount failure.
FlexVolume logs on Master Node 1
Find the pod on which the error occurs:
kubectl get pod -n kube-system | grep flexvolume
Print the logs for that pod:
kubectl logs flexvolume-4wh8s -n kube-system
kubectl describe pod flexvolume-4wh8s -n kube-system
The last few entries in the describe output show the pod's recent events and are the most useful for diagnosing errors.
To view per-driver logs (disk, NAS, and OSS) on the node itself, find the node's IP address:
kubectl describe pod nginx-97dc96f7b-xbx8t | grep Node
Expected output:
Node: cn-hangzhou.i-bp19myla3uvnt6zi****/192.168.XX.XX
Node-Selectors: <none>
SSH to the node and list the driver log files:
ssh 192.168.XX.XX
ls /var/log/alicloud/flexvolume*
Expected output:
flexvolume_disk.log flexvolume_nas.log flexvolume_oss.log
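When the node lookup has to be repeated for several pods, the IP address can be pulled out of the Node line instead of being copied by hand. This is a minimal sketch, assuming the Node line has the form shown above (node name, a slash, then the IP address). The function name node_ip is hypothetical.

```shell
# node_ip
# Reads `kubectl describe pod` output on stdin and prints the IP address from
# the "Node:" line, which is assumed to look like "Node: <node-name>/<ip>".
# The "Node-Selectors:" line is ignored because its first field differs.
node_ip() {
  awk '$1 == "Node:" { n = split($2, parts, "/"); print parts[n]; exit }'
}

# Example against a live cluster (requires a configured kubectl):
#   kubectl describe pod nginx-97dc96f7b-xbx8t | node_ip
```

An alternative that avoids text parsing is kubectl get pod nginx-97dc96f7b-xbx8t -o jsonpath='{.status.hostIP}', which reads the host IP directly from the pod status.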
Disk provisioner plug-in logs on Master Node 1
Find the pod on which the error occurs:
kubectl get pod -n kube-system | grep alicloud-disk
Print the logs for that pod:
kubectl logs alicloud-disk-controller-8679c9fc76-lq6zb -n kube-system
kubectl describe pod alicloud-disk-controller-8679c9fc76-lq6zb -n kube-system
The last few entries in the describe output show the pod's recent events and are the most useful for diagnosing errors.
kubelet logs
Find the IP address of the node that hosts the pod:
kubectl describe pod nginx-97dc96f7b-xbx8t | grep Node
SSH to the node and export the kubelet logs:
ssh 192.168.XX.XX
journalctl -u kubelet -r -n 1000 &> kubelet.log
The -n flag controls how many log entries to export. Increase this value if you need to look further back in time.
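An export of 1000 entries contains a lot that is unrelated to storage. The sketch below narrows an exported kubelet.log to lines that mention volume operations. The function name volume_events is hypothetical, and the search patterns are assumptions about typical kubelet message text, so adjust them if your kubelet version words its messages differently.

```shell
# volume_events FILE
# Prints the lines of an exported kubelet log that mention volume mount or
# unmount operations or orphaned pods.
# Assumption: the patterns below match the message text of your kubelet version.
volume_events() {
  grep -E 'MountVolume|UnmountVolume|Orphaned pod' "$1"
}

# Example: volume_events kubelet.log
```

Because journalctl was run with -r, the newest entries appear first in the filtered output as well.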
If the problem persists after reviewing these logs, contact Alibaba Cloud technical support and include the log files.
kubelet logs contain entries for a pod that is not managed by ACK
When a pod exits abnormally, its mount target is not removed when the system unmounts the PV. As a result, the pod cannot be deleted, and kubelet cannot finish garbage-collecting the pod's volumes.
Run the following script on the affected node to remove invalid mount targets:
wget https://raw.githubusercontent.com/AliyunContainerService/kubernetes-issues-solution/master/kubelet/kubelet.sh
sh kubelet.sh
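To see which mount targets are still present for a given pod, before or after running the script, the node's mount table can be filtered by pod UID. This is a read-only sketch and does not reproduce what kubelet.sh does. It assumes kubelet uses its default root directory, /var/lib/kubelet, and that you already know the pod UID. The function name pod_mounts is hypothetical.

```shell
# pod_mounts POD_UID
# Reads mount-table lines in /proc/mounts format on stdin and prints the mount
# points (2nd field) that live under the kubelet directory of the given pod.
# Assumption: kubelet uses its default root directory, /var/lib/kubelet.
pod_mounts() {
  awk -v dir="/var/lib/kubelet/pods/$1/" 'index($2, dir) == 1 { print $2 }'
}

# Example on the affected node:
#   pod_mounts <pod-uid> < /proc/mounts
```

Empty output means no mount point remains under that pod's directory.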