「ROOK-CEPH」- Installing the Dashboard

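Problem Description

This note records how to deploy the Ceph Dashboard so that the cluster can be managed and monitored.

Solution

Platforms such as PVE and Rook Ceph all run Ceph, so we discuss how to enable the Dashboard in each environment separately.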

for Proxmox VE

[TUTORIAL] – [Nautilus] activating CEPH DASHBOARD
Ceph Dashboard — Ceph Documentation

apt-get install ceph-mgr-dashboard                          # must be run on every node

# Configure TLS
ceph config set mgr mgr/dashboard/ssl false                 # we choose to disable it

# Create the administrator user
ceph dashboard ac-user-create admin administrator -i /tmp/password

# Show the access URL
ceph mgr services

# Open in a browser: 8080 for HTTP, 8443 for HTTPS
...
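
The `ac-user-create` command above reads the password from the file passed with `-i`, but the note does not show how that file is created. A minimal sketch of one way to prepare it follows. The password value is a placeholder and the use of `mktemp` instead of the fixed path `/tmp/password` is an assumption, not part of the original procedure.

```shell
#!/bin/sh
# Sketch: prepare the password file read by `ceph dashboard ac-user-create -i`.
# The password below is a placeholder (assumption); replace it with your own.
set -u

umask 077                                   # files created from here on are readable by the owner only
password_file="$(mktemp)"                   # random file name instead of a predictable /tmp/password
printf '%s' 'Example-Passw0rd!' > "$password_file"

if command -v ceph >/dev/null 2>&1; then
    # Same command as in the note, pointed at the temporary file
    ceph dashboard ac-user-create admin administrator -i "$password_file"
else
    echo "ceph CLI not found; skipping user creation"
fi

rm -f "$password_file"                      # do not leave the plain-text password on disk
```

Removing the file afterwards keeps the plain-text password from lingering in a temporary directory.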

for Rook Ceph

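See the Rook Ceph Documentation/Ceph Dashboard page for a detailed description.

Enabling the Service

We deploy through the Rook Ceph Helm Chart, so the configuration described in this part is specific to that deployment method.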

rook-ceph-cluster.helm-values.yaml

...
cephClusterSpec:
  dashboard:
    enabled: true
...
ingress:
  dashboard:
    annotations:
      cert-manager.io/cluster-issuer: letsencrypt
      nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
      nginx.ingress.kubernetes.io/server-snippet: |
        proxy_ssl_verify off;
    host:
      name: ceph-dashboard.example.com
    tls:
    - hosts:
      - ceph-dashboard.example.com
      secretName: ceph-dashboard
    ingressClassName: nginx
...
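
The values above only take effect once the Helm release is installed or upgraded with them. A hedged sketch of that step follows. The release name `rook-ceph-cluster` and the repository alias `rook-release` are assumptions that follow the naming in the upstream Rook Helm instructions, and the namespace `rook-ceph` is taken from the commands later in this note. Adjust all three to match your installation.

```shell
#!/bin/sh
# Sketch: apply the cluster values file with Helm.
# release and chart are assumptions; namespace matches the kubectl commands in this note.
release="rook-ceph-cluster"
chart="rook-release/rook-ceph-cluster"
namespace="rook-ceph"
values_file="rook-ceph-cluster.helm-values.yaml"

if command -v helm >/dev/null 2>&1 && [ -f "$values_file" ]; then
    # Installs the release if it is missing, upgrades it otherwise
    if helm upgrade --install "$release" "$chart" \
            --namespace "$namespace" \
            --values "$values_file"; then
        apply_status="applied"
    else
        apply_status="failed"
    fi
else
    echo "helm or $values_file not available; nothing applied"
    apply_status="skipped"
fi

echo "status: $apply_status"
```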

User Login

# Username:
admin

# Password:
kubectl -n rook-ceph get secret rook-ceph-dashboard-password \
    -o jsonpath="{['data']['password']}" | base64 --decode && echo
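
Kubernetes stores Secret values base64-encoded, which is why the command above pipes the `jsonpath` output through `base64 --decode`. The following sketch reproduces only that decoding step on a made-up value, so it can be tried without a cluster. The string `example-password` is a placeholder, not a real credential.

```shell
#!/bin/sh
# Sketch: the decode step of the password command, run on a placeholder value.
encoded="$(printf '%s' 'example-password' | base64)"     # what the Secret's data field would hold
decoded="$(printf '%s' "$encoded" | base64 --decode)"    # what the pipeline prints

echo "encoded: $encoded"
echo "decoded: $decoded"
```

The trailing `&& echo` in the original command only adds a newline, because the decoded password itself does not end with one.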

[WIP] Viewing Dashboard / Physical Disks information (has issues)

By default, Dashboard / Physical Disks shows nothing; the Rook Manager Module must be enabled for it to work.

rook-ceph-cluster.helm-values.yaml:

...
cephClusterSpec:
  mgr:
    modules:
      - name: rook
        enabled: true
...

rook-ceph.helm-values.yaml:

# ROOK_ENABLE_DISCOVERY_DAEMON: true
enableDiscoveryDaemon: true

# ROOK_DISCOVER_DEVICES_INTERVAL: 60m                       // we did not find a corresponding setting in the Helm Chart

After deployment, opening the Dashboard produces the following error:

[b'{"status": "500 Internal Server Error", "detail": "The server encountered an unexpected condition which prevented it from fulfilling the request.", "request_id": "e45ae619-2303-41e4-b59f-fdcd166bfc50"}']

// The MGR log shows the following errors:

...
[rook ERROR rook.rook_cluster] No storage class exists matching configured Rook orchestrator storage class which currently is <local>. This storage class can be set in ceph config (mgr/rook/storage_class)
[rook ERROR orchestrator._interface] No storage class exists matching name provided in ceph config at mgr/rook/storage_class

// We tried creating a storage class named local
// https://github.com/rook/rook/blob/master/tests/integration/ceph_mgr_test.go

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: local
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
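
After applying a manifest like the one above, it can help to confirm that the StorageClass exists and to see which name the rook module is configured to look for. The sketch below is one way to check both. The config key `mgr/rook/storage_class` comes from the MGR log message. Running `ceph` directly is an assumption: in a Rook cluster the `ceph` CLI is usually only available inside a toolbox pod, so the second check may need to be run there.

```shell
#!/bin/sh
# Sketch: diagnostic checks only; nothing here changes the cluster.
storage_class="local"                       # name reported in the MGR log

if command -v kubectl >/dev/null 2>&1; then
    # Does a StorageClass with that name exist?
    kubectl get storageclass "$storage_class" \
        || echo "StorageClass $storage_class not found"
else
    echo "kubectl not found; skipping StorageClass check"
fi

if command -v ceph >/dev/null 2>&1; then
    # Which storage class is the rook module configured to use?
    ceph config get mgr mgr/rook/storage_class
else
    echo "ceph CLI not found; run the config check where it is available"
fi
```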

// However, the following errors still occur

MGR_MODULE_ERROR: Module 'rook' has failed: ({'type': 'ERROR', 'object': {'api_version': 'v1', 'kind': 'Status', 'metadata': {'annotations': None, 'cluster_name': None, 'creation_timestamp': None, 'deletion_grace_period_seconds': None, 'deletion_timestamp': None, 'finalizers': None, 'generate_name': None, 'generation': None, 'initializers': None, 'labels': None, 'managed_fields': None, 'name': None, 'namespace': None, 'owner_references': None, 'resource_version': None, 'self_link': None, 'uid': None}, 'spec': None, 'status': {'addresses': None, 'allocatable': None, 'capacity': None, 'conditions': None, 'config': None, 'daemon_endpoints': None, 'images': None, 'node_info': None, 'phase': None, 'volumes_attached': None, 'volumes_in_use': None}}, 'raw_object': {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'too old resource version: 96328407 (96328413)', 'reason': 'Expired', 'code': 410}}) Reason: None
MON_DOWN: 1/3 mons down, quorum f,g 

References

Ceph Documentation/Ceph Dashboard
Rook Ceph Documentation/Ceph Dashboard