Practicing k8s with 炎炎盐: Problems Encountered Deploying a Kubernetes 1.16.10 Binary High-Availability Cluster

1. During the installation of Kubernetes, the following error appears:
    failed to create kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup driver: "systemd"

Cause: kubelet's cgroup driver defaults to cgroupfs, while the Docker we installed uses the systemd driver; the mismatch keeps containers from starting.
Check:

docker info
...
Cgroup Driver: systemd
...

There are two ways to fix this: modify Docker, or modify kubelet.

Modify Docker:
Add the following to /etc/docker/daemon.json:
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
Restart Docker:
systemctl restart docker
systemctl status docker
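
After the restart, it is worth confirming that Docker is actually using the intended driver; docker info can print just that field:

# Print only the active cgroup driver; expect "systemd" here
docker info --format '{{.CgroupDriver}}'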
Or:
# vim /lib/systemd/system/docker.service
# Change --exec-opt native.cgroupdriver=systemd to:
#   --exec-opt native.cgroupdriver=cgroupfs
# systemctl daemon-reload
# systemctl restart docker.service
# kubelet then starts normally
Modify kubelet:
Add the following to /etc/systemd/system/kubelet.service.d/10-kubeadm.conf:
--cgroup-driver=systemd
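For reference, a rough sketch of how the flag typically sits in the drop-in (the environment variable name is an assumption; it varies between kubeadm versions and binary installs):

# /etc/systemd/system/kubelet.service.d/10-kubeadm.conf (sketch)
[Service]
Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=systemd"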
# Start
$ systemctl daemon-reload
$ systemctl enable kubelet && systemctl restart kubelet
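
Whichever side you change, the goal is that both drivers end up identical. A quick sanity check (the config paths are assumptions and depend on how kubelet was installed):

# Confirm the flag is in place and kubelet came back up
grep -r "cgroup-driver" /etc/systemd/system/kubelet.service.d/ 2>/dev/null
systemctl status kubelet | grep -i active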

2. flannel fails to start because it cannot find the network interface
Cause: the interface name configured in the /etc/systemd/system/flanneld.service unit file is wrong.

Fix the option so it names the node's actual NIC:
--iface=eth0
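
If you are not sure what the NIC is called on a given node, the ip tool will tell you; a small sketch:

# List interfaces with their IPv4 addresses
ip -o -4 addr show
# Or ask the kernel which interface routes outbound traffic
ip route get 8.8.8.8

Newer flanneld versions also accept --iface-regex when interface names differ across nodes.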

3. Failing to exec into a pod

# Enter the pod with sh
kubectl exec -it nginx-deployment-d55b94fd-xcrtg sh

Cause: a permissions problem, i.e. kubelet's configuration. Fix it by editing the kubelet.config file on the worker nodes.

On each node:
vi /opt/kubernetes/cfg/kubelet.config
------------------ append at the end of the file (authentication settings)
authentication:
  anonymous:
    enabled: true
----------------
# Then restart kubelet
systemctl restart kubelet
# On the master, add the authenticated user. The quick way is the command below (this grant is very dangerous):
kubectl create clusterrolebinding system:anonymous --clusterrole=cluster-admin --user=system:anonymous
# Better, change it to:
kubectl create clusterrolebinding kube-apiserver:kubelet-apis --clusterrole=system:kubelet-api-admin --user=system:anonymous
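
The binding can be checked before retrying the exec; kubectl auth can-i impersonates the user (exec traffic is authorized by the kubelet as a request on nodes/proxy):

# Should print "yes" once the clusterrolebinding exists
kubectl auth can-i create nodes/proxy --as=system:anonymous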

4. Force-deleting k8s containers stuck in an abnormal state

1. Force-delete a specific pod:
# kubectl delete pods cloudagile-mariadb-0 -n intelligence-data-lab --grace-period=0 --force
2. Delete all Failed pods in the cluster:
# kubectl get pods --field-selector=status.phase=Failed --all-namespaces | awk '{ system("kubectl delete pod " $2 " -n " $1) }'
3. Force-delete pods stuck in Terminating:
# kubectl get pods --all-namespaces | grep Terminating | grep -w "0/1" | awk '{ system("kubectl delete pod " $2 " -n " $1 " --grace-period=0 --force") }'
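
If a pod survives even --grace-period=0 --force, it usually still carries finalizers; clearing them lets the API server drop the object (pod name and namespace below are placeholders):

# Remove finalizers so the stuck object can actually be deleted
kubectl patch pod <pod-name> -n <namespace> -p '{"metadata":{"finalizers":null}}'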

5. kubectl: Error from server: error dialing backend: remote error: tls: internal error

Cause: kubectl logs reported this TLS error; the kubelet log showed the same message, and kubectl get csr revealed many certificate signing requests stuck in Pending.

Solution: approve the pending CSRs with kubectl certificate approve, e.g. all of them at once:
kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve
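
Afterwards the requests should move out of Pending and kubectl logs should work again; a quick check:

# The CONDITION column should now read Approved,Issued
kubectl get csr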

6. ingress-nginx reports a Failed to list *v1beta1.Ingress: ingresses.networking.k8s.io is forbidden error at creation time

**Error description**
Failed to list *v1beta1.Ingress: ingresses.networking.k8s.io is forbidden: User "system:serviceaccount:ingress-nginx:nginx-ingress-serviceaccount" cannot list resource "ingresses" in API group "networking.k8s.io" at the cluster scope
Meanwhile the pod nginx-ingress-controller-xxx stays in CrashLoopBackOff.
Solution:
Environment: ingress-nginx 0.25.0
Edit the mandatory.yaml of the ingress deployment, add the rule below to the ClusterRole, then re-run kubectl apply -f:
- apiGroups:
  - "extensions"
  - "networking.k8s.io"
  resources:
  - ingresses
  verbs:
  - list
  - watch
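
Whether the rule took effect can be verified without waiting for the controller to crash-loop again; kubectl auth can-i also impersonates service accounts:

# Should print "yes" after the updated ClusterRole is applied
kubectl auth can-i list ingresses.networking.k8s.io --as=system:serviceaccount:ingress-nginx:nginx-ingress-serviceaccount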

7. Deployment spec.selector.matchLabels
Description: creating a Deployment without spec.selector.matchLabels fails immediately with an error about the missing required field selector. A working manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  selector:
    matchLabels:
      app: my-nginx
  replicas: 2
  template:
    metadata:
      labels:
        app: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.14
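
In apps/v1 the selector is required and immutable, and spec.selector.matchLabels must match spec.template.metadata.labels, otherwise the API server rejects the Deployment. Applying the manifest (my-nginx.yaml is an assumed file name for the example above):

kubectl apply -f my-nginx.yaml
kubectl get deployment my-nginx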