Adding a node to a Kubernetes cluster: flannel fails to start because the NIC name differs

Problem

After joining a new node to the k8s cluster, the kube-flannel-ds daemon pod on the new node failed to start.

Check the pod status

[root@master ~]# kubectl get pods -n kube-system
NAME                             READY   STATUS             RESTARTS   AGE
coredns-58cc8c89f4-5g6q7         1/1     Running            9          137d
coredns-58cc8c89f4-9f6xf         1/1     Running            8          137d
etcd-master                      1/1     Running            8          137d
kube-apiserver-master            1/1     Running            10         137d
kube-controller-manager-master   1/1     Running            13         137d
kube-flannel-ds-amd64-7gxkk      1/1     Running            4          137d
kube-flannel-ds-amd64-7q2xg      1/1     Running            6          137d
kube-flannel-ds-amd64-jqkxc      1/1     Running            6          51d
kube-flannel-ds-amd64-mlcp7      1/1     Running            8          137d
kube-flannel-ds-amd64-mmcj5      0/1     CrashLoopBackOff   7          16m
kube-proxy-4276m                 1/1     Running            6          137d
kube-proxy-8td8l                 1/1     Running            0          21m
kube-proxy-chv82                 1/1     Running            4          51d
kube-proxy-jv9vg                 1/1     Running            11         137d
kube-proxy-v82vj                 1/1     Running            5          137d
kube-scheduler-master            1/1     Running            11         137d
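
On a busier cluster the failing pod is easy to miss in a listing like this. A minimal sketch for filtering it out follows; the helper name `not_running` is hypothetical, and it assumes the default column order of `kubectl get pods` (NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints
# the name and status of every pod whose STATUS column is not "Running".
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
not_running() {
  # NR > 1 skips the header row; $3 is the STATUS column.
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Usage (on the master):
#   kubectl get pods -n kube-system | not_running
```

Adding `-o wide` to `kubectl get pods` also lists the node each pod runs on, which helps tie the failing pod to the newly added node.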

Check the detailed pod status

[root@master ~]# kubectl describe pods kube-flannel-ds-amd64-mmcj5 -n kube-system
...
...
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled <unknown> default-scheduler Successfully assigned kube-system/kube-flannel-ds-amd64-mmcj5 to k8s-master
Normal Pulling 17m kubelet, k8s-master Pulling image "quay.io/coreos/flannel:v0.11.0-amd64"
Normal Pulled 15m kubelet, k8s-master Successfully pulled image "quay.io/coreos/flannel:v0.11.0-amd64"
Normal Created 14m kubelet, k8s-master Created container install-cni
Normal Started 14m kubelet, k8s-master Started container install-cni
Normal Pulled 13m (x4 over 14m) kubelet, k8s-master Container image "quay.io/coreos/flannel:v0.11.0-amd64" already present on machine
Normal Created 13m (x4 over 14m) kubelet, k8s-master Created container kube-flannel
Normal Started 13m (x4 over 14m) kubelet, k8s-master Started container kube-flannel
Warning BackOff 2m8s (x61 over 14m) kubelet, k8s-master Back-off restarting failed container

Check the pod logs

[root@master ~]# kubectl logs -f kube-flannel-ds-amd64-mmcj5 -n kube-system
I0330 01:50:24.544864 1 main.go:210] Could not find valid interface matching ens192: error looking up interface ens192: route ip+net: no such network interface
E0330 01:50:24.545148 1 main.go:234] Failed to find interface to use that matches the interfaces and/or regexes provided
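
The interface name flanneld is looking for can be pulled straight out of this log. The sketch below assumes the message wording shown above ("Could not find valid interface matching ..."); the helper name `missing_iface` is hypothetical.

```shell
# Hypothetical helper: reads flanneld log lines on stdin and prints the
# interface name that flanneld reported it could not find.
# Assumes the log message format shown in the output above.
missing_iface() {
  # Capture the text between "matching " and the next colon.
  sed -n 's/.*Could not find valid interface matching \([^:]*\):.*/\1/p'
}

# Usage (on the master):
#   kubectl logs kube-flannel-ds-amd64-mmcj5 -n kube-system | missing_iface
```

For a container stuck in CrashLoopBackOff, `kubectl logs --previous` prints the log of the last terminated instance, which is useful when the current instance has not yet written anything.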

Inspect the pod spec

[root@master ~]# kubectl edit pods kube-flannel-ds-amd64-mmcj5 -n kube-system
...
...
...
containers:
- args:
  - --ip-masq
  - --kube-subnet-mgr
  - --iface=ens192
  command:
  - /opt/bin/flanneld
...
...
...

Cause and solution

Check the network interfaces on the node being added

[root@k8s-master ~]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: enp1s0f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:15:17:a1:e7:aa brd ff:ff:ff:ff:ff:ff
    inet 192.168.100.65/24 brd 192.168.100.255 scope global enp1s0f0
       valid_lft forever preferred_lft forever
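
On a node with many interfaces it may be quicker to ask which one carries the default route, since that is the interface flannel falls back to (according to its documentation) when no `--iface` is given. A minimal sketch, with the helper name `default_iface` being hypothetical:

```shell
# Hypothetical helper: reads `ip route` output on stdin and prints the
# device name of the first default route it sees.
default_iface() {
  awk '$1 == "default" {
         # The device name is the field right after the "dev" keyword.
         for (i = 2; i < NF; i++) if ($i == "dev") { print $(i + 1); exit }
       }'
}

# Usage (on the node being added):
#   ip -4 route show default | default_iface
```

If the name printed here differs from the one passed to flanneld via `--iface`, the node will hit the error shown in the pod log.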

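The NIC on the new node is named enp1s0f0, not the ens192 that the DaemonSet passes to flanneld via --iface. Edit the DaemonSet (ds) YAML and add a second --iface argument for the new name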

[root@master ~]# kubectl edit ds kube-flannel-ds-amd64 -n kube-system
...
...
...
containers:
- args:
  - --ip-masq
  - --kube-subnet-mgr
  - --iface=ens192
  - --iface=enp1s0f0
  #- --iface-regex=eth*|ens*
  command:
  - /opt/bin/flanneld
...
...
...
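
The same change can be applied without an interactive editor by sending a JSON Patch. This is only a sketch under stated assumptions: the helper name `iface_patch` is hypothetical, and it assumes kube-flannel is the first entry in the pod template's `containers` list (with install-cni being an init container), so check the index against your own manifest first.

```shell
# Hypothetical helper: prints a JSON Patch that appends one --iface argument
# to the args of the first container in a DaemonSet's pod template.
# Assumption: kube-flannel is containers[0]; verify this in your manifest.
iface_patch() {
  # The trailing "/-" in the path appends to the end of the args array.
  printf '[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--iface=%s"}]\n' "$1"
}

# Usage (on the master):
#   kubectl patch ds kube-flannel-ds-amd64 -n kube-system --type=json -p "$(iface_patch enp1s0f0)"
#   kubectl get pods -n kube-system -o wide
```

After the change, the flannel pods need to be recreated to pick up the new arguments. Whether that happens automatically depends on the DaemonSet's update strategy; with OnDelete the failing pod has to be deleted by hand.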

References

https://coreos.com/flannel/docs/latest/flannel-config.html