The system log reports the following error:
    Apr 19 08:19:58 vpsz-dce-mgt03 kubelet: E0419 08:19:58.636618 76840 nestedpendingoperations.go:267] Operation for "\"kubernetes.io/secret/13b322de-35d5-11eb-a480-0242ac120004-default-token-thcng\" (\"13b322de-35d5-11eb-a480-0242ac120004\")" failed. No retries permitted until 2021-04-19 08:22:00.636581297 +0800 CST m=+26338789.158428442 (durationBeforeRetry 2m2s). Error: "UnmountVolume.TearDown failed for volume \"default-token-thcng\" (UniqueName: \"kubernetes.io/secret/13b322de-35d5-11eb-a480-0242ac120004-default-token-thcng\") pod \"13b322de-35d5-11eb-a480-0242ac120004\" (UID: \"13b322de-35d5-11eb-a480-0242ac120004\") : remove /var/lib/kubelet/pods/13b322de-35d5-11eb-a480-0242ac120004/volumes/kubernetes.io~secret/default-token-thcng: device or resource busy"
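The path that follows `remove` in the error is the one to investigate. When the log contains many such entries, a small filter can collect the distinct paths. The following is only a sketch: the function name `extract_busy_paths` is hypothetical, and it assumes the message format shown above (`remove <path>: device or resource busy`) and that the path itself contains no colon.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubelet): read log lines on stdin and print
# each distinct path that kubelet failed to remove with "device or resource busy".
# Assumes the message format shown above and a path without ':' characters.
extract_busy_paths() {
    sed -n 's|.*remove \(/[^:]*\): device or resource busy.*|\1|p' | sort -u
}
```

For example, assuming kubelet logs to the system log file `/var/log/messages` (the location is an assumption and varies by distribution), `grep kubelet /var/log/messages | extract_busy_paths` would list the affected paths.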
Solution:
Use a script to find out which process is holding the directory:
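    #!/bin/bash
    # Usage: bash test.sh <mount-point-path>
    # List every process whose mount table still contains the given path.
    declare -A map
    for i in $(find /proc/[0-9]*/mounts -exec grep -- "$1" {} + 2>/dev/null | awk '{print $1"#"$2}')
    do
        # $i looks like /proc/<pid>/mounts:<device>#<mount point>
        pid=$(echo "$i" | awk -F "[/]" '{print $3}')
        point=$(echo "$i" | awk -F "[#]" '{print $2}')
        # Mount namespace of the process, e.g. mnt:[4026532308]
        mnt=$(readlink "/proc/$pid/ns/mnt" 2>/dev/null)
        [[ -z "$mnt" ]] && continue   # the process exited in the meantime
        map["$mnt"]="exist"
        # Arguments in cmdline are NUL-separated, so they print run together
        cmd=$(tr -d '\0' < "/proc/$pid/cmdline")
        echo -e "$pid\t$mnt\t$cmd\t$point"
    done

    # Print the mount namespaces found above that belong to a docker-containerd-shim process.
    for i in $(ps aux | grep docker-containerd-shim | grep -v "grep" | awk '{print $2}')
    do
        mnt=$(readlink "/proc/$i/ns/mnt" 2>/dev/null)
        if [[ -n "$mnt" && "${map[$mnt]}" == "exist" ]]; then
            echo "$mnt"
        fi
    done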
    bash test.sh /var/lib/kubelet/pods/13b322de-35d5-11eb-a480-0242ac120004/volumes/kubernetes.io~secret/default-token-thcng
The output shows that the mount point is still in use:
    82263 mnt:[4026532308] /bin/node_exporter--path.procfs=/host/proc--path.sysfs=/host/sys--path.rootfs=/host/root--web.listen-address=0.0.0.0:9100--collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$--collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/) /host/root/var/lib/kubelet/pods/13b322de-35d5-11eb-a480-0242ac120004/volumes/kubernetes.io~secret/default-token-thcng
Find the parent process of 82263:
    [root@vpsz-dce-mgt03 ~]# ps -ef |grep 82263
    nfsnobo+ 82263 82237 0 2020 ? 08:51:33 /bin/node_exporter --path.procfs=/host/proc --path.sysfs=/host/sys --path.rootfs=/host/root --web.listen-address=0.0.0.0:9100 --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$ --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
    root 107711 84220 0 08:34 pts/1 00:00:00 grep --color=auto 82263
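In `ps -ef` output the third column is the parent PID, so 82263 was started by 82237. As an alternative to grepping `ps` one level at a time, the parent chain can be read from `/proc`. This is a minimal sketch, not the author's method: `parent_pid`, `ancestry`, and the `PROC_ROOT` override are hypothetical names introduced here, and it assumes a Linux `/proc` layout with a `PPid:` line in `/proc/<pid>/status`.

```shell
#!/bin/sh
# Sketch: walk a process's ancestors via /proc.
# PROC_ROOT is a hypothetical override so the functions can be tried on a copy
# of a proc tree; it defaults to the real /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Print the parent PID of the given PID (taken from the PPid: line of status).
parent_pid() {
    awk '/^PPid:/ {print $2}' "$PROC_ROOT/$1/status" 2>/dev/null
}

# Print "PID<TAB>command line" for the given PID and each of its ancestors.
# Runs in a subshell, so the loop variable does not leak into the caller.
ancestry() (
    pid="$1"
    # Stop at PID 0 or as soon as a process in the chain no longer exists.
    while [ -n "$pid" ] && [ "$pid" != "0" ] && [ -r "$PROC_ROOT/$pid/status" ]; do
        # cmdline is NUL-separated; turn NULs into spaces and trim the end.
        cmd=$(tr '\0' ' ' < "$PROC_ROOT/$pid/cmdline" | sed 's/ *$//')
        printf '%s\t%s\n' "$pid" "$cmd"
        pid=$(parent_pid "$pid")
    done
)
```

Calling `ancestry 82263` on the affected node would print 82263 first, followed by each parent in turn, which makes the owning shim process visible in one step.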
Next, look up process 82237, the parent of 82263:
    [root@vpsz-dce-mgt03 g1703269]# ps -ef |grep 82237
    root 82237 3452 0 2020 ? 00:00:08 docker-containerd-shim 125ebce164eb4705ab348fed5642ed649437dd0f8ad0aa2092208c55727d4e1d /var/run/docker/libcontainerd/125ebce164eb4705ab348fed5642ed649437dd0f8ad0aa2092208c55727d4e1d docker-runc
    nfsnobo+ 82263 82237 0 2020 ? 08:51:33 /bin/node_exporter --path.procfs=/host/proc --path.sysfs=/host/sys --path.rootfs=/host/root --web.listen-address=0.0.0.0:9100 --collector.filesystem.ignored-fs-types=^(autofs|binfmt_misc|cgroup|configfs|debugfs|devpts|devtmpfs|fusectl|hugetlbfs|mqueue|overlay|proc|procfs|pstore|rpc_pipefs|securityfs|sysfs|tracefs)$ --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
    root 114427 84220 0 08:35 pts/1 00:00:00 grep --color=auto 82237
This shows that the mount is being held by the Docker container with ID 125ebce164eb4705ab348fed5642ed649437dd0f8ad0aa2092208c55727d4e1d.
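The container ID can also be pulled out of the `ps` output mechanically instead of being copied by hand. The sketch below assumes, as in the output above, that the ID is the first argument after `docker-containerd-shim`; the function name `shim_container_id` is hypothetical, and other Docker versions may lay out the shim command line differently.

```shell
#!/bin/sh
# Sketch: read `ps -ef` style lines on stdin and print the container ID of the
# first docker-containerd-shim process found.
# Assumption: the ID is the argument right after the shim binary name.
shim_container_id() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i ~ /(^|\/)docker-containerd-shim$/) { print $(i + 1); exit }
        }
    }'
}
```

For example, `ps -ef | grep 82237 | shim_container_id` would print the ID above, and `docker inspect --format '{{.Name}}' <id>` can then map it to a container name.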