「Kubernetes」- Building a Highly Available Cluster (cluster, setup, high availability, 1.22, kube-vip)

Problem Description

We will create a highly available Kubernetes 1.22 cluster with the kubeadm command, using kube-vip as the load balancer.

This note records how to build the Kubernetes cluster (with stacked, internal etcd) and how to handle the problems encountered along the way.

Solution

We follow the official Kubernetes "Bootstrapping clusters with kubeadm" documentation to carry out this deployment.
The steps below are recorded in the order of the actual deployment rather than the order of the official documentation; that is, we follow the principle of "record first, implement afterwards".

Step 0: Configure the Node Environment

Operating system: Ubuntu 20.04 LTS
Software version: Kubernetes 1.22.13

Check the node environment

Before deploying the cluster, some preparation and environment checks are required:
1) Compatible Linux hosts; we use the Ubuntu 20.04 LTS distribution;
2) At least 2 GB of memory per machine;
3) At least 2 vCPUs;
4) Full network connectivity between all machines in the cluster (a public or private network is fine), with correct network configuration;
5) A unique hostname, MAC address, and product_uuid on every node;
6) Certain ports on the machines must not already be in use and must be reachable, e.g. 6443;
7) Swap disabled:

# Each node must have a unique hostname, MAC address, and product_uuid:

ip link
hostname
cat /sys/class/dmi/id/product_uuid

# Check the host NIC configuration:
# For hosts with multiple NICs, make sure traffic is forwarded through the intended interface;

# Kernel module and network parameter configuration
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
br_netfilter
EOF

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
EOF
sudo sysctl --system
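
To apply and verify these settings immediately (a minimal check; the modules-load.d entry above only takes effect from the next boot):

# Load the module now and confirm the bridge sysctl values:
sudo modprobe br_netfilter
lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables
# If these keys were skipped earlier because the module was missing, re-run: sudo sysctl --system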

# Check network ports and confirm that the required ports (e.g. 6443) are open
systemctl stop ufw
systemctl disable ufw
nc -v -z 127.0.0.1 6443

# Disable the swap partition
# swapoff -a && sysctl -w vm.swappiness=0
yes | cp /etc/fstab /etc/fstab.backup
sed -i -E 's/(.+swap.+swap.+)/# \1/g' /etc/fstab
swapoff -a
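
A minimal check to confirm that swap is now fully disabled:

swapon --show                                                                   # no output means no active swap devices
free -m | grep -i swap                                                          # all values on the Swap line should be 0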

// ---------------------------------------------------------------------------- // For CentOS environments

# Disable SELinux (CentOS only)
# setenforce 0
yes | cp /etc/selinux/config /etc/selinux/config.backup
sed -i 's%SELINUX=enforcing%SELINUX=disabled%g' /etc/selinux/config

Deploy the container runtime

First, deploy the container environment:
Docker 20.10.1 (our operating system template ships with the Docker service pre-installed, and we do not want to replace it with containerd)

Second, modify the Docker configuration:

cat > /etc/docker/daemon.json <<EOF
{
  "live-restore": true,
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2"
}
EOF

systemctl start docker
systemctl enable docker
systemctl reload docker
systemctl restart docker

Configuring a cgroup driver | Kubernetes
The Container runtimes page explains that the systemd driver is recommended for kubeadm based setups instead of the cgroupfs driver, because kubeadm manages the kubelet as a systemd service.
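
A quick way to confirm that Docker has picked up the systemd cgroup driver (the Go template below assumes a reasonably recent Docker CLI; docker info | grep -i cgroup works as well):

docker info --format '{{.CgroupDriver}}'                                        # expected output: systemd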

Install the base components

Install the kubeadm and kubelet commands on all nodes (the kubectl command is optional):

# Install dependencies
apt-get update \
    && apt-get install -y apt-transport-https ca-certificates curl

# Import the signing key
# Downloading apt-key.gpg directly can be difficult, but keys should still come from the official source (do not use arbitrary third-party keys);
# here we add the key via the Ubuntu key server;
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys FEEA9169307EA071 8B57C5C2836F4BEB
cat > /etc/apt/sources.list.d/kubernetes.list <<EOF
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt-get update

# Install the tools
# List available versions: apt-cache madison kubeadm | grep 1.22 | head -n 1
apt-get install -y kubelet=1.22.13-00 kubeadm=1.22.13-00 kubectl=1.22.13-00
apt-mark hold kubelet kubeadm kubectl                                           # hold these packages to prevent upgrades

# Other tools
apt-get install -y nfs-common
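
A quick check of the installed versions (optional):

kubeadm version -o short                                                        # expected: v1.22.13
kubelet --version                                                               # expected: Kubernetes v1.22.13
kubectl version --client --short                                                # expected: Client Version: v1.22.13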

For CentOS systems, configure the repositories and install the required packages (recorded here for reference only; untested):

# The official repository (usually not reachable without network acceleration)
cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=0
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg
        https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
EOF

# Use the Aliyun mirror instead
cat <<EOF > /etc/yum.repos.d/kubernetes-ali.repo
[kubernetes-ali]
name=Kubernetes ALi
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=0
EOF

# Refresh the YUM cache and install the kubeadm tools (the service does not need to be started yet)
yum makecache
yum install -y kubeadm-1.22.6 kubelet-1.22.6

# If the kubelet service is not enabled, kubeadm init prints a warning;
# but do not start the service yet: initialization has not run, so configuration files it needs (such as /var/lib/kubelet/config.yaml) do not exist yet;
# thanks to the feedback from friends in the group :-)
systemctl enable kubelet

Step 1: Create the Load Balancer

There are many ways to build the load balancer (kubeadm/ha-considerations.md); we use a software load balancer.

We choose kube-vip as the load-balancing component, so only that component is covered here. The reasons for this choice:
1) kube-vip offers more features than haproxy/keepalived, and its configuration is simpler to use;
2) more importantly, we do not want to spend the time to learn haproxy/keepalived in depth;
3) one caveat from practice: with poor disk performance, etcd times out frequently, and since kube-vip relies on the cluster (and therefore on etcd), a struggling etcd also affects kube-vip; haproxy/keepalived run independently and check the service ports directly, which makes them more reliable and independent of etcd;

We run kube-vip in ARP mode as a static Pod:

# KVVERSION=$(curl -sL https://api.github.com/repos/kube-vip/kube-vip/releases | jq -r ".[0].name")

export VIP=192.168.7.120
export INTERFACE=eth0
export KVVERSION=v0.4.4

alias kube-vip="docker run --network host --rm ghcr.io/kube-vip/kube-vip:$KVVERSION"

kube-vip manifest pod               \
    --interface $INTERFACE          \
    --address $VIP                  \
    --controlplane                  \
    --services                      \
    --arp                           \
    --leaderElection | tee /etc/kubernetes/manifests/kube-vip.yaml
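
The command above only writes the static Pod manifest; the kubelet starts kube-vip once the control plane is initialized in the next step. After that, the VIP should be bound to the interface on the current kube-vip leader, which can be verified roughly as follows (using the VIP and INTERFACE values exported above):

ip addr show "$INTERFACE" | grep "$VIP"
ping -c 3 "$VIP"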

Step 2: Initialize the Control Plane Nodes

We choose the "stacked (internal) etcd" topology because it uses fewer hosts, is relatively simple to manage, and is still highly available.

#1 Create the first control plane node

// ---------------------------------------------------------------------------- // Create the configuration file

# cat > kubeadm-config.yaml <<EOF
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
clusterName: devops
imageRepository: registry.aliyuncs.com/google_containers
kubernetesVersion: 1.22.13
networking:
  dnsDomain: cluster.local
controlPlaneEndpoint: 192.168.7.120:6443
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
serverTLSBootstrap: true
EOF
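
Optionally, the control plane images can be pre-pulled before running kubeadm init; this shortens the init step and surfaces registry problems early (a sketch using the configuration file above):

# kubeadm config images pull --config=kubeadm-config.yaml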

// ---------------------------------------------------------------------------- // Initialize the control plane node

# kubeadm init --config=kubeadm-config.yaml --upload-certs --v=10
...
  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
...
  kubeadm join 192.168.7.120:6443 --token sdr8wc.ktb7g4tfdnu9i60y \
	--discovery-token-ca-cert-hash sha256:86ef4776b7749abb5f9b079f1c3eea631fe37aadb782aeb4cf8c8512dc39abd5 \
	--control-plane --certificate-key 8c821083b4ad5330f313215d253177d04dec6636b627e4b24a2bc1899e8a7da0
...
kubeadm join 192.168.7.120:6443 --token sdr8wc.ktb7g4tfdnu9i60y \
	--discovery-token-ca-cert-hash sha256:86ef4776b7749abb5f9b079f1c3eea631fe37aadb782aeb4cf8c8512dc39abd5

// ---------------------------------------------------------------------------- // Approve the kubelet serving certificates

# kubectl get csr
NAME        AGE    SIGNERNAME                                    REQUESTOR                     REQUESTEDDURATION   CONDITION
csr-qb5ld   114s   kubernetes.io/kubelet-serving                 system:node:k8s-infra-cp121   <none>              Pending
csr-vkh7t   117s   kubernetes.io/kube-apiserver-client-kubelet   system:node:k8s-infra-cp121   <none>              Approved,Issued
csr-zgf8x   111s   kubernetes.io/kubelet-serving                 system:node:k8s-infra-cp121   <none>              Pending

# kubectl certificate approve csr-qb5ld 
certificatesigningrequest.certificates.k8s.io/csr-qb5ld approved

# kubectl certificate approve csr-zgf8x
certificatesigningrequest.certificates.k8s.io/csr-zgf8x approved
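
At this point the control plane components are running, but the node will usually report NotReady until the network plugin is installed in the next step; a quick status check:

# kubectl get nodes -o wide
# kubectl -n kube-system get pods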

#2 Install the cluster network plugin

We use the Cilium network plugin because: 1) it is feature-rich; 2) it has good support for MetalLB.

Deploy it via Helm:

helm repo add cilium https://helm.cilium.io/
helm repo update

helm pull cilium/cilium --version 1.12.1                                        # cilium-1.12.1.tgz
helm show values ./cilium-1.12.1.tgz > cilium-1.12.1.helm-values.yaml

helm --namespace kube-system                                                   \
    install cilium                                                             \
    ./cilium-1.12.1.tgz -f ./cilium-1.12.1.helm-values.yaml                    \
    --create-namespace

helm --namespace kube-system                                                   \
    upgrade cilium                                                             \
    ./cilium-1.12.1.tgz -f ./cilium-1.12.1.helm-values.yaml
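
Before restarting the existing pods so that they are re-created under Cilium's management (the command below), it may be worth waiting for the Cilium DaemonSet and operator to finish rolling out (assuming the default resource names from the chart):

kubectl -n kube-system rollout status daemonset/cilium
kubectl -n kube-system rollout status deployment/cilium-operator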

# kubectl get pods --all-namespaces                                            \
    -o custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,HOSTNETWORK:.spec.hostNetwork --no-headers=true \
    | grep '<none>' | awk '{print "-n "$1" "$2}' | xargs -L 1 -r kubectl delete pod
...

#3 Add the remaining control plane nodes

Using the output of the kubeadm init command, the other control plane nodes can be joined directly:

# ----------------------------------------------------------------------------- # 1) Join the other control plane nodes with the generated command

kubeadm join 10.10.50.100:6443 --token 60k8ec.p9z48jzua0xek4sx \
    --discovery-token-ca-cert-hash sha256:d74dc9f59b2caa888afd312f4e216e33b89ba1d4bb1212a394561c963c4c6391 \
    --control-plane --certificate-key afa132db8efc9c2cebea7f53f3b52ec94c6adbbeb8add4fb8a1d25404e5c883f
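
Note that the bootstrap token and the certificate key in these join commands expire (by default the token after 24 hours and the uploaded certificates after 2 hours). If they have expired, fresh values can be generated on an existing control plane node, roughly as follows:

# Print a new worker join command (creates a new bootstrap token):
kubeadm token create --print-join-command
# Re-upload the control plane certificates and print a new --certificate-key:
kubeadm init phase upload-certs --upload-certs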

# ----------------------------------------------------------------------------- # 2) Create the kube-vip configuration

...                                                                             # run the kube-vip.yaml generation command again (see Step 1)

Step 3: Add the Worker Nodes

Nothing special here; just run the worker join command that was printed when the control plane initialization finished:

kubeadm join 192.168.7.120:6443 --token sdr8wc.ktb7g4tfdnu9i60y \
	--discovery-token-ca-cert-hash sha256:86ef4776b7749abb5f9b079f1c3eea631fe37aadb782aeb4cf8c8512dc39abd5

… CRI v1 runtime API is not implemented for endpoint …

# kubeadm join 172.31.253.120:6443 --token krhk5l.qsnyz0bstiyz42gf --discovery-token-ca-cert-hash sha256:dcd1baacff6241421463b74875c5b4dd3c2cb4f0b7a5056863924b9235a8e7aa 
[preflight] Running pre-flight checks

error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR CRI]: container runtime is not running: output: time="2023-04-03T09:47:45+08:00" level=fatal msg="validate service connection: CRI v1 runtime API is not implemented for endpoint \"unix:///run/containerd/containerd.sock\": rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService"
, error: exit status 1
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher
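
This error usually means that the installed containerd does not expose the CRI v1 API: either containerd is too old for the kubeadm release in use, or the CRI plugin is disabled (the containerd package installed alongside Docker disables it by default via disabled_plugins = ["cri"] in /etc/containerd/config.toml). A sketch of one common fix, assuming containerd itself is recent enough:

# Back up the current configuration, regenerate the defaults (which enable the CRI plugin), then restart containerd;
# with the systemd cgroup driver, SystemdCgroup = true should also be set under the runc options in the regenerated file:
cp /etc/containerd/config.toml /etc/containerd/config.toml.backup
containerd config default > /etc/containerd/config.toml
systemctl restart containerd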

Additional Notes

Modify the /etc/hosts file

If the hostname was not set during OS installation, make sure that /etc/hosts resolves the current hostname to 127.0.0.1; otherwise kubeadm init or kubeadm join will fail.
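
A minimal check-and-fix sketch (it appends an entry only if the current hostname is not already present; adjust if the node should resolve to a different address):

grep -qw "$(hostname)" /etc/hosts || echo "127.0.0.1 $(hostname)" | sudo tee -a /etc/hosts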

Test the Cilium plugin

If the cluster has no worker nodes yet, Cilium's test Pods will stay in the Pending state, so we run the connectivity test last rather than during cluster initialization:

# kubectl -n kube-system get pods --watch
cilium-operator-cb4578bc5-q52qk         1/1     Running   0          4m13s
cilium-s8w5m                            1/1     Running   0          4m12s
coredns-86c58d9df4-4g7dd                1/1     Running   0          13m
coredns-86c58d9df4-4l6b2                1/1     Running   0          13m

# kubectl create ns cilium-test
# wget https://raw.githubusercontent.com/cilium/cilium/1.12.1/examples/kubernetes/connectivity-check/connectivity-check.yaml
# kubectl apply -n cilium-test -f connectivity-check.yaml

# kubectl get pods -n cilium-test
NAME                                                     READY   STATUS    RESTARTS   AGE
echo-a-76c5d9bd76-q8d99                                  1/1     Running   0          66s
echo-b-795c4b4f76-9wrrx                                  1/1     Running   0          66s
echo-b-host-6b7fc94b7c-xtsff                             1/1     Running   0          66s
host-to-b-multi-node-clusterip-85476cd779-bpg4b          1/1     Running   0          66s
host-to-b-multi-node-headless-dc6c44cb5-8jdz8            1/1     Running   0          65s
pod-to-a-79546bc469-rl2qq                                1/1     Running   0          66s
pod-to-a-allowed-cnp-58b7f7fb8f-lkq7p                    1/1     Running   0          66s
pod-to-a-denied-cnp-6967cb6f7f-7h9fn                     1/1     Running   0          66s
pod-to-b-intra-node-nodeport-9b487cf89-6ptrt             1/1     Running   0          65s
pod-to-b-multi-node-clusterip-7db5dfdcf7-jkjpw           1/1     Running   0          66s
pod-to-b-multi-node-headless-7d44b85d69-mtscc            1/1     Running   0          66s
pod-to-b-multi-node-nodeport-7ffc76db7c-rrw82            1/1     Running   0          65s
pod-to-external-1111-d56f47579-d79dz                     1/1     Running   0          66s
pod-to-external-fqdn-allow-google-cnp-78986f4bcf-btjn7   1/1     Running   0          66s

# kubectl delete ns cilium-test

Troubleshooting Common Issues

Error from server: error dialing backend: remote error: tls: internal error

1736286 – error dialing backend: remote error: tls: internal error

This is a consequence of setting serverTLSBootstrap: true in the KubeletConfiguration above: the kubelet serving certificates are only issued after their CSRs are approved, and until then the API server cannot connect to the kubelet on port 10250, so commands such as kubectl logs and kubectl exec fail with this TLS error.

# kubectl logs -f metallb-speaker-2vqgx                                                                                                                                                                                                                                     
Error from server: Get "https://192.168.7.122:10250/containerLogs/metallb-system/metallb-speaker-2vqgx/speaker?follow=true": remote error: tls: internal error 

# kubectl exec -it metallb-controller-6dbd7b64f6-zl2pg -- bash                                                                                                                                                                                                              
Error from server: error dialing backend: remote error: tls: internal error

# kubectl get csr                                                                                                                                                                                                                                                           
NAME        AGE     SIGNERNAME                      REQUESTOR                     REQUESTEDDURATION   CONDITION                                                                                                                                                             
csr-255j8   9h      kubernetes.io/kubelet-serving   system:node:k8s-infra-cp122   <none>              Pending
...

# kubectl get csr --no-headers | grep Pending | awk '{print $1}' | xargs -r -i kubectl certificate approve '{}'

References

Creating Highly Available clusters with kubeadm
Certificate Management with kubeadm | Kubernetes
Make metrics-server work out of the box with kubeadm
Troubleshooting kubeadm | Kubernetes