Problem Description
The cluster is unreachable. Checking the kubelet log with journalctl -f -u kubelet.service shows the following error:
11月 24 14:26:59 k8s-master1 kubelet[1768]: I1124 14:26:59.608288 1768 server.go:408] Version: v1.12.1
11月 24 14:26:59 k8s-master1 kubelet[1768]: I1124 14:26:59.608812 1768 plugins.go:99] No cloud provider specified.
11月 24 14:26:59 k8s-master1 kubelet[1768]: E1124 14:26:59.616261 1768 bootstrap.go:205] Part of the existing bootstrap client certificate is expired: 2019-11-23 12:18:53 +0000 UTC
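To pull just the expiry timestamp out of a noisy kubelet journal, a small filter like the one below may help. It is a minimal sketch: the function name extract_cert_expiry is hypothetical, and it assumes the message wording matches the log line shown above.

```shell
# Hypothetical helper: reads kubelet log lines on stdin and prints the
# timestamp that follows "certificate is expired: " on each matching line.
# Assumes the message format shown in the log excerpt above.
extract_cert_expiry() {
  sed -n 's/.*certificate is expired: \(.*\)$/\1/p'
}

# Usage on a node (not executed here):
#   journalctl -u kubelet.service --no-pager | extract_cert_expiry | sort -u
```

Lines that do not contain the expiry message produce no output, so an empty result suggests the kubelet failure has a different cause.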
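Cause Analysis
In a Kubernetes cluster, the certificates created when the cluster is initialized expire after one year. Once they expire, the cluster components can no longer communicate with each other, and the certificates must be renewed to restore access.
The expiration dates can be checked with kubeadm: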
# kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'

CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Feb 21, 2023 03:37 UTC   335d                                    no
apiserver                  Feb 21, 2023 03:37 UTC   335d            ca                      no
apiserver-etcd-client      Feb 21, 2023 03:37 UTC   335d            etcd-ca                 no
apiserver-kubelet-client   Feb 21, 2023 03:37 UTC   335d            ca                      no
controller-manager.conf    Feb 21, 2023 03:37 UTC   335d                                    no
etcd-healthcheck-client    Feb 21, 2023 03:37 UTC   335d            etcd-ca                 no
etcd-peer                  Feb 21, 2023 03:37 UTC   335d            etcd-ca                 no
etcd-server                Feb 21, 2023 03:37 UTC   335d            etcd-ca                 no
front-proxy-client         Feb 21, 2023 03:37 UTC   335d            front-proxy-ca          no
scheduler.conf             Feb 21, 2023 03:37 UTC   335d                                    no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Feb 19, 2032 03:25 UTC   9y              no
etcd-ca                 Feb 19, 2032 03:25 UTC   9y              no
front-proxy-ca          Feb 19, 2032 03:25 UTC   9y              no
Check certificate expiration with OpenSSL:
openssl x509 -in /etc/kubernetes/pki/apiserver.crt -noout -text | grep ' Not '
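The single-file OpenSSL check can be extended to every certificate in the PKI directory. The following is a sketch under assumptions: the function name report_cert_expiry and the 30-day default window are illustrative, and it relies on the -enddate and -checkend options of openssl x509.

```shell
# Hypothetical helper: prints the notAfter date of every *.crt under a
# directory and marks certificates that expire within a time window.
#   $1  directory to scan (assumed default: /etc/kubernetes/pki)
#   $2  window in seconds (illustrative default: 30 days)
report_cert_expiry() {
  pki_dir="${1:-/etc/kubernetes/pki}"
  window="${2:-2592000}"
  find "$pki_dir" -type f -name '*.crt' | sort | while IFS= read -r crt; do
    # -enddate prints "notAfter=<date>"; keep only the date part.
    end=$(openssl x509 -in "$crt" -noout -enddate | cut -d= -f2)
    # -checkend exits 0 if the certificate is still valid after $window seconds.
    if openssl x509 -in "$crt" -noout -checkend "$window" >/dev/null; then
      printf '%-8s %s  %s\n' OK "$end" "$crt"
    else
      printf '%-8s %s  %s\n' EXPIRING "$end" "$crt"
    fi
  done
}

# Usage on a master (not executed here):
#   report_cert_expiry /etc/kubernetes/pki 604800   # flag certs expiring within 7 days
```

Because the scan is recursive, the etcd certificates in the etcd subdirectory are reported as well.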
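Solution
Regenerate the certificates … Cluster root certificate:
/etc/kubernetes/pki/ca.key
Certificates issued by the cluster root certificate:
/etc/kubernetes/pki/apiserver.key
/etc/kubernetes/pki/apiserver-kubelet-client.key
All of these certificates need to be renewed.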
Environment
Operating system: CentOS Linux release 7.4.1708 (Core)
Cluster: Kubernetes Cluster v1.18.20, 3 masters, n workers
Back Up Configuration Files (Master)
Back up the original configuration files:
rsync -avz --delete /etc/kubernetes/ /etc/kubernetes.backup/
rsync -avz --delete ~/.kube/ ~/.kube.backup/
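The rsync commands above mirror into a fixed backup directory, so a second run overwrites the first backup. If keeping several snapshots is preferable, a timestamped copy is one option. This is only a sketch: backup_dir is a hypothetical helper, and it uses cp -a rather than the rsync invocation from the original steps.

```shell
# Hypothetical helper: copies a directory to "<dest-prefix>.<timestamp>"
# and prints the path of the new snapshot.
#   $1  source directory, e.g. /etc/kubernetes
#   $2  destination prefix, e.g. /etc/kubernetes.backup
backup_dir() {
  src="$1"
  dest="$2.$(date +%Y%m%d-%H%M%S)"
  if [ ! -d "$src" ]; then
    echo "missing source directory: $src" >&2
    return 1
  fi
  # -a preserves permissions, ownership, and timestamps of keys and configs.
  cp -a "$src" "$dest" && printf '%s\n' "$dest"
}

# Usage on a master (not executed here):
#   backup_dir /etc/kubernetes /etc/kubernetes.backup
#   backup_dir ~/.kube ~/.kube.backup
```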
Extend Certificate Validity (Master)
# Run the following command to renew all certificates:
kubeadm -v 10 alpha certs renew all

# On CentOS 7.4 with kubeadm 1.12.1, the command above fails with a stack overflow error,
# so the certificate of each component has to be renewed manually:
# kubeadm -v 10 alpha phase certs renew apiserver --config /etc/kubernetes/kubeadm-config.yaml
# kubeadm -v 10 alpha phase certs renew apiserver-etcd-client --config /etc/kubernetes/kubeadm-config.yaml
# kubeadm -v 10 alpha phase certs renew apiserver-kubelet-client --config /etc/kubernetes/kubeadm-config.yaml
# kubeadm -v 10 alpha phase certs renew etcd-healthcheck-client --config /etc/kubernetes/kubeadm-config.yaml
# kubeadm -v 10 alpha phase certs renew etcd-peer --config /etc/kubernetes/kubeadm-config.yaml
# kubeadm -v 10 alpha phase certs renew etcd-server --config /etc/kubernetes/kubeadm-config.yaml
# kubeadm -v 10 alpha phase certs renew front-proxy-client --config /etc/kubernetes/kubeadm-config.yaml
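The per-component commands differ only in the certificate name, so they can be generated from a list. The sketch below prints the commands instead of running them, so they can be reviewed first. The function name renew_commands is hypothetical; the subcommand, flags, and config path are the ones used in the kubeadm 1.12.1 workaround above.

```shell
# Hypothetical helper: prints one "kubeadm alpha phase certs renew" command
# per component certificate (dry run; nothing is executed).
#   $1  kubeadm config file (default taken from the commands above)
renew_commands() {
  config="${1:-/etc/kubernetes/kubeadm-config.yaml}"
  for name in apiserver apiserver-etcd-client apiserver-kubelet-client \
              etcd-healthcheck-client etcd-peer etcd-server front-proxy-client; do
    printf 'kubeadm -v 10 alpha phase certs renew %s --config %s\n' "$name" "$config"
  done
}

# Usage on a master (not executed here):
#   renew_commands            # review the generated commands
#   renew_commands | sh -e    # run them, stopping at the first failure
```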
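Update Configuration Files (Master)
Restart the static Pods (per the official documentation):
1) Move the manifests in /etc/kubernetes/manifests/ to another directory (do not delete them);
2) Wait 20s (the kubelet file scan interval, set by fileCheckFrequency);
3) Move the static Pod manifests back into /etc/kubernetes/manifests/.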
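The three steps above can be sketched as a single function. The name bounce_static_pods and the holding directory /etc/kubernetes/manifests.hold are assumptions made for illustration; the 20-second default is the wait time stated in step 2. The holding directory must be outside the manifests directory, otherwise the kubelet would still see the files.

```shell
# Hypothetical helper: moves static Pod manifests out of the manifests
# directory, waits for the kubelet to notice, then moves them back.
#   $1  manifests directory (default: /etc/kubernetes/manifests)
#   $2  holding directory (assumed name, must be outside $1)
#   $3  seconds to wait (default: 20, per the steps above)
bounce_static_pods() {
  manifests="${1:-/etc/kubernetes/manifests}"
  hold="${2:-/etc/kubernetes/manifests.hold}"
  wait_s="${3:-20}"
  mkdir -p "$hold" || return 1
  # Step 1: move the manifests away (they are moved, never deleted).
  for f in "$manifests"/*.yaml; do
    [ -e "$f" ] || continue
    mv "$f" "$hold"/ || return 1
  done
  # Step 2: give the kubelet time to rescan the directory.
  sleep "$wait_s"
  # Step 3: move the manifests back so the Pods are recreated.
  for f in "$hold"/*.yaml; do
    [ -e "$f" ] || continue
    mv "$f" "$manifests"/ || return 1
  done
}

# Usage on a master (not executed here):
#   bounce_static_pods
```

Running this on one master at a time keeps the other API servers available while the local control plane restarts.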
Alternatively, delete the static Pod instances with the following commands:
# 1) Delete the static Pods on the current node
kubectl delete pods -n kube-system \
  $(kubectl get pod --all-namespaces -o=jsonpath='{.items[?(@.metadata.ownerReferences[].name=="'${HOSTNAME}'")].metadata.name}')
# 2) Delete the static Pods on all nodes
kubectl delete pods -n kube-system \
  $(kubectl get pod --all-namespaces -o=jsonpath='{.items[?(@.metadata.ownerReferences[].kind=="Node")].metadata.name}')
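Restart the Service (All Nodes)
systemctl restart kubelet.service
This step is not actually necessary:
1) Although kubelet.conf contains certificate information, kubeadm already configures rotating certificate renewal for the kubelet when it deploys it, and the related files are stored in /var/lib/kubelet/pki;
2) Moreover, for the manual certificate renewal scenario, the official documentation does not mention doing anything with the kubelet service.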
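Whether a given node actually relies on the rotated certificate can be checked by looking at what kubelet.conf points to. The sketch below assumes the kubeadm default layout, in which kubelet.conf references /var/lib/kubelet/pki/kubelet-client-current.pem; the function name kubelet_uses_rotated_cert is hypothetical. If the file embeds client-certificate-data instead, the kubelet credential is not covered by rotation and may need separate handling.

```shell
# Hypothetical helper: succeeds if the kubeconfig references the rotating
# kubelet client certificate under /var/lib/kubelet/pki (assumed kubeadm
# default path), and fails otherwise.
#   $1  kubeconfig to inspect (default: /etc/kubernetes/kubelet.conf)
kubelet_uses_rotated_cert() {
  conf="${1:-/etc/kubernetes/kubelet.conf}"
  grep -Eq '^[[:space:]]*client-certificate:[[:space:]]*/var/lib/kubelet/pki/kubelet-client-current\.pem' "$conf"
}

# Usage on any node (not executed here):
#   if kubelet_uses_rotated_cert; then echo "rotation in use"; else echo "embedded or custom cert"; fi
```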
References
Certificate Management with kubeadm | Kubernetes
Does JsonPath support the AND (&&) operator? – Stack Overflow
how to renew the certificate when apiserver cert expired? #581
JsonPath nested condition and multiple conditions are not working · Issue #20352 · kubernetes/kubernetes
JSONPath Support | Kubernetes
k8s踩坑(三)、kubeadm证书/etcd证书过期处理
k8s采坑记 – 证书过期之kubeadm重新生成证书
Kubeadm fails – kubelet fails to find /etc/kubernetes/bootstrap-kubelet.conf #3769
Kubelet fails to authenticate to apiserver due to expired certificate #65991
kubernetes – Filter kubectl get based on anotation – Stack Overflow
kubernetes – How to identify static pods via kubectl command? – Stack Overflow
Renew kubernetes pki after expired