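Problem Description
This note records how we upgrade a Rook Ceph cluster and how we resolve the problems we run into along the way.
Solution
Environment information: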
Rook Ceph: 1.10.2 => 1.10.10
Kubernetes Cluster: 1.22.15
Step 1: Cluster health check
Rook Ceph Documentation/Health Verification
# ---------------------------------------------------------
# Pods all Running
# Make sure the following command prints nothing:
kubectl -n rook-ceph get pods --no-headers | grep -v -E '\sRunning\s|\sCompleted\s'

# ---------------------------------------------------------
# Status Output
# Check for HEALTH_OK and confirm that mon, mgr, mds, osd, rgw, and pg are all in a normal state:
TOOLS_POD=$(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[*].metadata.name}')
kubectl -n rook-ceph exec -it $TOOLS_POD -- ceph status

# ---------------------------------------------------------
# Container Versions
# Check the container images and confirm that all components currently run the same version:
kubectl -n rook-ceph get pod -o jsonpath='{range .items[*]}{.metadata.name}{"\n\t"}{.status.phase}{"\t\t"}{.spec.containers[0].image}{"\t"}{.spec.initContainers[0].image}{"\n"}{end}'

# ---------------------------------------------------------
# Rook Volume Health
# Check that the pods using Rook Ceph volumes are healthy.
# Application pods should keep running normally throughout the upgrade.
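The first check above can be wrapped in a small function so that a pre-upgrade script stops as soon as any pod is unhealthy. This is only a minimal sketch and not part of the Rook documentation: the names `unhealthy_pods` and `assert_pods_healthy` are hypothetical, and both read a `kubectl get pods --no-headers` listing from standard input, so the filter can be tried without a cluster.

```shell
#!/bin/sh
# Hypothetical helpers (not provided by Rook or kubectl).
# Both read a `kubectl get pods --no-headers` listing from stdin.

# Print only the pods whose STATUS is neither Running nor Completed.
unhealthy_pods() {
    # `|| true` keeps the function from failing when grep selects no lines.
    grep -v -E '[[:space:]](Running|Completed)[[:space:]]' || true
}

# Return 0 when every pod is Running or Completed, 1 otherwise.
assert_pods_healthy() {
    bad=$(unhealthy_pods)
    if [ -n "$bad" ]; then
        printf 'Pods not ready:\n%s\n' "$bad" >&2
        return 1
    fi
    return 0
}
```

A possible way to use it before starting the upgrade: `kubectl -n rook-ceph get pods --no-headers | assert_pods_healthy || exit 1`.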
Step 2: Upgrade the Operator
Rook Ceph Documentation/Rook Upgrades
We deployed Rook through its Helm chart, so running helm upgrade with the new chart version is enough:
helm pull rook-release/rook-ceph --version 1.10.10

vim rook-ceph.helm-values.yaml
# ... (1) update the image addresses

helm upgrade --namespace rook-ceph rook-ceph \
    ./rook-ceph-v1.10.10.tgz -f rook-ceph.helm-values.yaml
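Once the operator has been upgraded, it can be useful to confirm that every Rook-managed deployment reports the new Rook version before moving on. The sketch below assumes the deployments carry a `rook-version` label and a `rook_cluster=rook-ceph` label, which should be verified against your own cluster; the function name `all_on_rook_version` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if every line on stdin is exactly
# `rook-version=<expected>`, i.e. all deployments report the same version.
all_on_rook_version() {
    expected=$1
    # Collapse the input to its distinct lines; exactly one line must remain.
    versions=$(sort -u)
    [ "$versions" = "rook-version=$expected" ]
}

# Possible usage (assumes the labels mentioned above exist in your cluster):
#   kubectl -n rook-ceph get deployments -l rook_cluster=rook-ceph \
#     -o jsonpath='{range .items[*]}{"rook-version="}{.metadata.labels.rook-version}{"\n"}{end}' \
#     | all_on_rook_version v1.10.10 && echo "operator upgrade finished"
```

While the operator is still reconciling, the listing contains both the old and the new version, so the check fails until the rollout has finished.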
Step 3: Upgrade the Cluster
Rook Ceph Documentation/Ceph Upgrades
helm pull rook-release/rook-ceph-cluster --version 1.10.10

vim rook-ceph-cluster.helm-values.yaml
# ... (1) update the image addresses

helm upgrade --namespace rook-ceph rook-ceph-cluster \
    ./rook-ceph-cluster-v1.10.10.tgz -f rook-ceph-cluster.helm-values.yaml
Check the Cluster status. The cluster components are upgraded one by one to the specified version:
# kubectl --namespace rook-ceph describe cephcluster
...
    Versions:
      Mds:
        ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable):  2
      Mgr:
        ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable):  2
      Mon:
        ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable):  3
      Osd:
        ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable):  5
      Overall:
        ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable):  13
      Rgw:
        ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable):  1
...
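Rather than reading the Versions section by eye, the distinct Ceph versions can be extracted from it with a short filter. This is a rough sketch that assumes the output contains lines shaped like `ceph version 17.2.5 (...)`, as in the sample above; the function names `distinct_ceph_versions` and `all_daemons_on` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers that read `kubectl describe cephcluster` output
# (or any text containing "ceph version X.Y.Z" lines) from stdin.

# Print each distinct Ceph version number found in the input, one per line.
distinct_ceph_versions() {
    grep -o -E 'ceph version [0-9]+(\.[0-9]+)*' | awk '{ print $3 }' | sort -u
}

# Succeed only if the input mentions exactly one Ceph version and it is $1.
all_daemons_on() {
    expected=$1
    [ "$(distinct_ceph_versions)" = "$expected" ]
}
```

For example, `kubectl --namespace rook-ceph describe cephcluster | all_daemons_on 17.2.5` fails while daemons on an older version are still listed and succeeds once only 17.2.5 remains.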
References
Rook Ceph Documentation/Health Verification