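Problem Description
This note records how we upgrade a Rook Ceph cluster and how we resolve the problems encountered along the way.
Solution
Environment information: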
Rook Ceph: 1.10.2 => 1.10.10
Kubernetes Cluster: 1.22.15
Step 1: Cluster health check
Rook Ceph Documentation/Health Verification
# --------------------------------------------------------- # Pods all Running
# Make sure the following command prints nothing:
kubectl -n rook-ceph get pods --no-headers | grep -v -E '\sRunning\s|\sCompleted\s'
# --------------------------------------------------------- # Status Output
# Check for HEALTH_OK and the mon, mgr, mds, osd, rgw, and pg status; make sure all of them are healthy:
TOOLS_POD=$(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[*].metadata.name}')
kubectl -n rook-ceph exec -it $TOOLS_POD -- ceph status
# --------------------------------------------------------- # Container Versions
# Check the container versions; make sure all components currently run the same version:
kubectl -n rook-ceph get pod -o jsonpath='{range .items[*]}{.metadata.name}{"\n\t"}{.status.phase}{"\t\t"}{.spec.containers[0].image}{"\t"}{.spec.initContainers[0].image}{"\n"}{end}'
# --------------------------------------------------------- # Rook Volume Health
# Check that the containers using Rook Ceph volumes are healthy;
# workload containers should keep running normally throughout the upgrade;
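The checks above can be folded into a single pre-upgrade gate. The following is only a minimal sketch: the function names `unhealthy_pods`, `is_health_ok`, and `pre_upgrade_check` are hypothetical helpers, and it assumes the toolbox pod carries the `app=rook-ceph-tools` label used above and that STATUS is the third column of `kubectl get pods --no-headers`.

```shell
# Hypothetical helpers; the names are illustrative and not part of Rook.

# Print the pod lines (as produced by `kubectl get pods --no-headers`)
# whose STATUS column is neither Running nor Completed. Reads stdin.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed"'
}

# Succeed only if stdin (the output of `ceph health`) starts with HEALTH_OK.
is_health_ok() {
  grep -q '^HEALTH_OK'
}

# Run both checks against a namespace (default: rook-ceph).
pre_upgrade_check() {
  ns="${1:-rook-ceph}"
  bad=$(kubectl -n "$ns" get pods --no-headers | unhealthy_pods)
  if [ -n "$bad" ]; then
    echo "pods not Running/Completed:" >&2
    echo "$bad" >&2
    return 1
  fi
  tools_pod=$(kubectl -n "$ns" get pod -l "app=rook-ceph-tools" \
    -o jsonpath='{.items[0].metadata.name}')
  kubectl -n "$ns" exec "$tools_pod" -- ceph health | is_health_ok
}
```

A possible use is `pre_upgrade_check rook-ceph && echo "safe to upgrade"`. The gate does not cover the volume-health item, which still needs a manual look at the workloads.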
Step 2: Upgrade the Operator
Rook Ceph Documentation/Rook Upgrades
We deployed with the Helm Chart, so running helm upgrade with the new Chart version is enough:
helm pull rook-release/rook-ceph --version 1.10.10
vim rook-ceph.helm-values.yaml
... (1) change the image address;
helm upgrade --namespace rook-ceph rook-ceph \
./rook-ceph-v1.10.10.tgz -f rook-ceph.helm-values.yaml
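After the helm upgrade it may be worth confirming that the operator actually rolled out with the new image before touching the cluster chart. A minimal sketch, assuming the operator Deployment is named `rook-ceph-operator` (the chart default) and that the modified image address keeps the upstream tag `v1.10.10`; `image_tag` and `verify_operator` are hypothetical helper names.

```shell
# Hypothetical helper: print the tag of an image reference such as
# "rook/ceph:v1.10.10" or "registry.example.com:5000/rook/ceph:v1.10.10".
# Prints an empty line when the reference has no tag. Digests are not handled.
image_tag() {
  last="${1##*/}"              # drop registry host (and port) and repository path
  case "$last" in
    *:*) echo "${last##*:}" ;; # keep what follows the last colon
    *)   echo "" ;;
  esac
}

# Wait for the operator rollout, then compare the running image tag
# with the expected one (default: v1.10.10).
verify_operator() {
  ns="${1:-rook-ceph}"
  want="${2:-v1.10.10}"
  kubectl -n "$ns" rollout status deployment/rook-ceph-operator --timeout=300s || return 1
  image=$(kubectl -n "$ns" get deployment rook-ceph-operator \
    -o jsonpath='{.spec.template.spec.containers[0].image}')
  [ "$(image_tag "$image")" = "$want" ]
}
```

For example, `verify_operator rook-ceph v1.10.10 || echo "operator not on the expected version"`.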
Step 3: Upgrade the Cluster version
Rook Ceph Documentation/Ceph Upgrades
helm pull rook-release/rook-ceph-cluster --version 1.10.10
vim rook-ceph-cluster.helm-values.yaml
... (1) change the image address;
helm upgrade --namespace rook-ceph rook-ceph-cluster \
./rook-ceph-cluster-v1.10.10.tgz -f rook-ceph-cluster.helm-values.yaml
Check the Cluster status; the cluster components are upgraded to the target version one by one:
# kubectl --namespace rook-ceph describe cephcluster
...
Versions:
Mds:
ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable): 2
Mgr:
ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable): 2
Mon:
ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable): 3
Osd:
ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable): 5
Overall:
ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable): 13
Rgw:
ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable): 1
...
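Instead of reading the describe output repeatedly, the rollout can also be judged by whether every Rook-managed Deployment carries the same `rook-version` label. A minimal sketch, assuming the cluster Deployments are labelled `rook_cluster=<namespace>` (true when the cluster name equals the namespace, as in a default install); `distinct_count` and `upgrade_finished` are hypothetical helper names.

```shell
# Hypothetical helper: count the distinct non-empty lines on stdin.
distinct_count() {
  sort -u | grep -c . || true   # grep exits 1 on zero matches; still print 0
}

# Succeed once all Deployments of the cluster report a single rook-version.
upgrade_finished() {
  ns="${1:-rook-ceph}"
  n=$(kubectl -n "$ns" get deployment -l "rook_cluster=$ns" \
    -o jsonpath='{range .items[*]}{.metadata.labels.rook-version}{"\n"}{end}' \
    | distinct_count)
  [ "$n" -eq 1 ]
}
```

A simple polling loop would be `until upgrade_finished rook-ceph; do sleep 30; done`. A single label value only shows that the versions agree, so it is still worth confirming that the value is the target version.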
References
Rook Ceph Documentation/Health Verification