FAQ about FlexVolume storage

This page covers common storage issues in Container Service for Kubernetes (ACK): volume mount failures, how to collect storage logs, and orphaned pod mount targets.

| Category | Questions |
| --- | --- |
| Storage | Volume fails to mount · View storage logs · kubelet has a pod log not managed by ACK |
| Disk volumes | Timeout error when mounting a disk to a node · Zone error when mounting a disk to an ECS instance · Input/output error after a system upgrade · The specified disk is not a portable disk · Volume node affinity conflict · Can't find disk · Disk size is not supported |
| NAS volumes | Long mount time · Timeout error when mounting · chown: option not permitted · NAS volume fails to mount · alicloud-nas-controller task queue full |
| OSS volumes | OSS volume fails to mount · Directory unavailable after cluster upgrade · Long mount time |

Volume fails to mount

Check whether the required plug-ins are installed and running.

Check whether FlexVolume is installed

Run the following command and verify that one FlexVolume pod is running per cluster node:

kubectl get pod -n kube-system | grep flexvolume

Expected output:

flexvolume-4wh8s            1/1       Running   0          8d
flexvolume-65z49            1/1       Running   0          8d
flexvolume-bpc6s            1/1       Running   0          8d
flexvolume-l8pml            1/1       Running   0          8d
flexvolume-mzkpv            1/1       Running   0          8d
flexvolume-wbfhv            1/1       Running   0          8d
flexvolume-xf5cs            1/1       Running   0          8d

If any pods are not in the Running state, check the FlexVolume plug-in logs to identify the cause.
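
On a cluster with many nodes, scanning the pod list by eye is error-prone. The following is a minimal sketch, not part of ACK, that prints only the pods that are not in the Running state. The function name `pods_not_running` is hypothetical, and the sketch assumes the default `kubectl get pod` column order (NAME, READY, STATUS).

```shell
# pods_not_running PREFIX
# Reads `kubectl get pod` output on stdin and prints the names of pods
# whose name starts with PREFIX and whose STATUS column is not "Running".
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
pods_not_running() {
  awk -v prefix="$1" 'index($1, prefix) == 1 && $3 != "Running" { print $1 }'
}

# Example against a live cluster (prints nothing when every pod is Running):
#   kubectl get pod -n kube-system --no-headers | pods_not_running flexvolume
```

Because the function reads standard input, it can also be tried on saved command output before it is used against a cluster.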

Check whether the dynamic provisioning plug-in is installed

Mounting dynamically provisioned disk volumes requires the disk provisioner plug-in. Run the following command to confirm it is running:

kubectl get pod -n kube-system | grep alicloud-disk

Expected output:

alicloud-disk-controller-8679c9fc76-lq6zb     1/1 Running   0   7d

If the pod is not in the Running state, check the plug-in logs to identify the cause.
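
A pod can report Running while one of its containers is not yet ready, which the READY column (ready/total) shows. As a hedged sketch, the hypothetical helper `pods_not_ready` below flags such pods. It assumes the default `kubectl get pod` column layout shown in the expected output above.

```shell
# pods_not_ready
# Reads `kubectl get pod` output on stdin and prints NAME, READY, and STATUS
# for every pod whose READY column shows fewer ready containers than total.
# Lines whose second column is not of the form N/M (such as a header) are skipped.
pods_not_ready() {
  awk '{ n = split($2, r, "/"); if (n == 2 && r[1] + 0 < r[2] + 0) print $1, $2, $3 }'
}

# Example against a live cluster:
#   kubectl get pod -n kube-system | grep alicloud-disk | pods_not_ready
```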

View storage logs

ACK storage involves three log sources: the FlexVolume plug-in, the disk provisioner plug-in, and kubelet. Collect all three when diagnosing a mount failure.

FlexVolume logs on Master Node 1

  1. Find the pod on which the error occurs:

    kubectl get pod -n kube-system | grep flexvolume
  2. Print the logs for that pod:

    kubectl logs flexvolume-4wh8s -n kube-system
    kubectl describe pod flexvolume-4wh8s -n kube-system

    The Events section at the end of the describe output lists the pod's recent events and is the most useful part for diagnosing errors.
  3. To view the per-driver logs (disk, NAS, and OSS) on the node itself, first find the IP address of the node that hosts the affected pod (nginx-97dc96f7b-xbx8t in this example):

    kubectl describe pod nginx-97dc96f7b-xbx8t | grep Node

    Expected output:

    Node: cn-hangzhou.i-bp19myla3uvnt6zi****/192.168.XX.XX
    Node-Selectors:  <none>
  4. SSH to the node and list the driver log files:

    ssh 192.168.XX.XX
    ls /var/log/alicloud/flexvolume*

    Expected output:

    flexvolume_disk.log  flexvolume_nas.log  flexvolume_oss.log
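
Steps 3 and 4 can be scripted so that the node IP does not have to be copied by hand. The sketch below is an illustration under assumptions: `node_ip_from_describe` is a hypothetical helper name, and it relies on the `Node: <name>/<ip>` layout shown in the expected output above.

```shell
# node_ip_from_describe
# Reads `kubectl describe pod` output on stdin and prints the node IP.
# Matches only the "Node:" field, so "Node-Selectors:" is ignored.
node_ip_from_describe() {
  awk '$1 == "Node:" { n = split($2, parts, "/"); print parts[n]; exit }'
}

# Example against a live cluster:
#   ip=$(kubectl describe pod nginx-97dc96f7b-xbx8t | node_ip_from_describe)
#   ssh "$ip" 'ls /var/log/alicloud/flexvolume*'
```

kubectl can also print the same address directly with `kubectl get pod <pod-name> -o jsonpath='{.status.hostIP}'`, which avoids parsing text.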

Disk provisioner plug-in logs on Master Node 1

  1. Find the pod on which the error occurs:

    kubectl get pod -n kube-system | grep alicloud-disk
  2. Print the logs for that pod:

    kubectl logs alicloud-disk-controller-8679c9fc76-lq6zb -n kube-system
    kubectl describe pod alicloud-disk-controller-8679c9fc76-lq6zb -n kube-system

    The Events section at the end of the describe output lists the pod's recent events and is the most useful part for diagnosing errors.

kubelet logs

  1. Find the IP address of the node that hosts the pod:

    kubectl describe pod nginx-97dc96f7b-xbx8t | grep Node
  2. SSH to the node and export the kubelet logs:

    ssh 192.168.XX.XX
    journalctl -u kubelet -r -n 1000 &> kubelet.log

    The -n flag controls how many log entries to export. Increase this value if you need to look further back in time.
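
An export of 1000 entries contains many lines unrelated to storage. The following sketch narrows the exported file to lines that mention mounts or volumes. The helper name `mount_related_lines` and the default search pattern are assumptions for illustration. kubelet does not guarantee any particular wording, so adjust the pattern to the messages you actually see.

```shell
# mount_related_lines FILE [PATTERN]
# Prints the lines of FILE that match PATTERN (an extended regular
# expression, matched case-insensitively). PATTERN defaults to "mount|volume".
mount_related_lines() {
  grep -iE "${2:-mount|volume}" "$1"
}

# Example on the node, after exporting the logs as described above:
#   mount_related_lines kubelet.log
#   mount_related_lines kubelet.log 'nginx-97dc96f7b-xbx8t'
#
# To bound the export by time rather than by entry count, journalctl also
# accepts --since, for example:
#   journalctl -u kubelet --since "1 hour ago" > kubelet.log 2>&1
```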

If the problem persists after reviewing these logs, contact Alibaba Cloud technical support and include the log files.
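
Packing the collected logs into one archive makes them easier to attach to a support ticket. This is a minimal sketch, not an ACK tool: `bundle_logs` and the archive name are hypothetical, and paths that contain spaces are not handled.

```shell
# bundle_logs ARCHIVE FILE...
# Packs the listed files that exist into a gzip-compressed tar archive.
# Missing files are reported on stderr and skipped instead of aborting.
bundle_logs() {
  archive=$1
  shift
  found=""
  for f in "$@"; do
    if [ -f "$f" ]; then
      found="$found $f"
    else
      echo "skipping missing file: $f" >&2
    fi
  done
  if [ -z "$found" ]; then
    echo "no log files found" >&2
    return 1
  fi
  # Word splitting on $found is intentional (no spaces in paths supported).
  tar -czf "$archive" $found
}

# Example on the node:
#   bundle_logs storage-logs.tar.gz kubelet.log \
#     /var/log/alicloud/flexvolume_disk.log \
#     /var/log/alicloud/flexvolume_nas.log \
#     /var/log/alicloud/flexvolume_oss.log
```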

kubelet has a pod log not managed by ACK

When a pod exits abnormally, its mount target is not removed when the system unmounts the persistent volume (PV). As a result, the system cannot delete the pod, and kubelet cannot garbage-collect the leftover volume directories.

Run the following script on the affected node to remove invalid mount targets:

wget https://raw.githubusercontent.com/AliyunContainerService/kubernetes-issues-solution/master/kubelet/kubelet.sh
sh kubelet.sh
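
Before running the cleanup, it can help to see which pods kubelet is complaining about. The sketch below is an assumption-laden illustration: it supposes that the kubelet log reports such pods on lines containing the words "orphaned pod" together with the pod UID, which this page does not show and which may differ between kubelet versions. The helper name `orphaned_pod_uids` is hypothetical.

```shell
# orphaned_pod_uids
# Reads kubelet log lines on stdin and prints the unique pod UIDs found on
# lines that contain "orphaned pod" (case-insensitive).
# ASSUMPTION: the log line includes the UID in the usual 8-4-4-4-12 hex form.
orphaned_pod_uids() {
  grep -i 'orphaned pod' \
    | grep -oE '[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}' \
    | sort -u
}

# Example on the affected node:
#   journalctl -u kubelet -n 1000 | orphaned_pod_uids
```

An empty result only means that no line matched the assumed wording. It does not prove that the node has no leftover mount targets.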