「KUBEADM」- Deploying a Kubernetes Cluster | 1.30.0 | Test Environment | All in One

Background: the budget is limited, but we still want Kubernetes features, so we deploy a single-master Kubernetes cluster and remove the control-plane Taint so that workload Pods can be scheduled onto that node.

This note mainly references the following two documents (the English checklist items below are quoted from them):

  • Installing kubeadm: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
  • Creating a cluster with kubeadm: https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

Prerequisite Checks

  • A compatible Linux host. The Kubernetes project provides generic instructions for Linux distributions based on Debian and Red Hat, and those distributions without a package manager.
    • The operating system used in this deployment satisfies this requirement;
  • 2 GB or more of RAM per machine (any less will leave little room for your apps).
    • The memory of this deployment satisfies this requirement;
  • 2 CPUs or more.
    • The CPU count of this deployment satisfies this requirement;
  • Full network connectivity between all machines in the cluster (public or private network is fine).
    • If you have more than one network adapter, and your Kubernetes components are not reachable on the default route, we recommend you add IP route(s) so Kubernetes cluster addresses go via the appropriate adapter. ⇒ We use a single network adapter, so this is not a concern;
  • Unique hostname, MAC address, and product_uuid for every node. See here for more details.
    • You can get the MAC address of the network interfaces using the command ip link or ifconfig -a
      • This deployment uses a single node, so this is not a concern;
    • The product_uuid can be checked by using the command sudo cat /sys/class/dmi/id/product_uuid
      • This deployment uses a single node, so this is not a concern;
  • Certain ports are open on your machines. See here for more details.
    • For this environment, we can simply disable the firewall: systemctl stop ufw.service; systemctl disable ufw.service;

Enable IPv4 packet forwarding

cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
EOF

sudo sysctl --system
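
To verify that the setting is active, read the value back:

sysctl net.ipv4.ip_forward    # expected output: net.ipv4.ip_forward = 1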

Swap configuration.

  • The default behavior of a kubelet was to fail to start if swap memory was detected on a node. See Swap memory management for more details.
    • You MUST disable swap if the kubelet is not properly configured to use swap. For example, sudo swapoff -a will disable swapping temporarily. To make this change persistent across reboots, make sure swap is disabled in config files like /etc/fstab, systemd.swap, depending how it was configured on your system
      • On this system: disable swap with swapoff -a, then comment out the swap entry in /etc/fstab so it stays off after reboot (see the sketch below);
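
Instead of editing /etc/fstab by hand, the swap entry can be commented out non-interactively; a minimal sketch, assuming GNU sed and an uncommented swap line in /etc/fstab:

swapoff -a                                          # disable swap immediately
sed -ri 's/^([^#].*\sswap\s.*)$/#\1/' /etc/fstab    # comment out the swap entry so the change survives reboots
swapon --show                                       # empty output confirms swap is off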

Installing a container runtime

  • For this deployment, we use the containerd runtime;

  • Configuring a cgroup driver

# mkdir /etc/containerd/

# containerd config default > /etc/containerd/config.toml

Then edit /etc/containerd/config.toml: enable SystemdCgroup under [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options], and point sandbox_image at a reachable mirror under [plugins."io.containerd.grpc.v1.cri"]:

...
            SystemdCgroup = true
...
    sandbox_image = "registry.aliyuncs.com/google_containers/pause:3.9"
...

# systemctl restart containerd.service

Kubernetes v1.30.11 ⇒ registry.aliyuncs.com/google_containers/pause:3.9 (the sandbox image that matches this Kubernetes release)
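
A quick sanity check after the restart, grepping the two edited values back out of the configuration:

grep -E 'SystemdCgroup|sandbox_image' /etc/containerd/config.toml
systemctl is-active containerd.service    # expected output: active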

Installing kubeadm, kubelet and kubectl

# ----------------------------------------------------------------------------- # First, install dependencies

apt-get update \
    && apt-get install -y apt-transport-https ca-certificates curl

# ----------------------------------------------------------------------------- # Then, import the repository key

# Although downloading apt-key.gpg can be difficult, the key should still come from the official site whenever possible (do not casually use third-party keys)
# curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
apt-key adv --keyserver keyserver.ubuntu.com --recv-keys  FEEA9169307EA071 8B57C5C2836F4BEB B53DC80D13EDEF05 234654DA9A296436

cat > /etc/apt/sources.list.d/kubernetes.list <<EOF
deb https://mirrors.tuna.tsinghua.edu.cn/kubernetes/core:/stable:/v1.30/deb/ /
EOF

apt-get update

# ----------------------------------------------------------------------------- # Finally, install the tools

apt-cache madison kubeadm | grep 1.30.0
apt-get install -y kubelet=1.30.0-1.1 kubeadm=1.30.0-1.1 kubectl=1.30.0-1.1 --allow-change-held-packages
apt-mark hold kubelet kubeadm kubectl
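
After installation, confirm that the pinned versions are in place and held:

kubeadm version -o short     # expected: v1.30.0
kubectl version --client
kubelet --version
apt-mark showhold            # should list kubeadm, kubectl, kubelet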

Creating the Cluster

  • One or more machines running a deb/rpm-compatible Linux OS; for example: Ubuntu or CentOS. ⇒ satisfied
  • 2 GiB or more of RAM per machine–any less leaves little room for your apps. ⇒ satisfied
  • At least 2 CPUs on the machine that you use as a control-plane node. ⇒ satisfied
  • Full network connectivity among all machines in the cluster. You can use either a public or a private network. ⇒ satisfied
  • You also need to use a version of kubeadm that can deploy the version of Kubernetes that you want to use in your new cluster. ⇒ satisfied
  • Preparing the hosts
    • Component installation ⇒ satisfied
    • Network setup ⇒ satisfied
  • Preparing the required container images ⇒ can be skipped (kubeadm init pulls them on demand; an optional pre-pull sketch follows this list)
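
Although the image-preparation step can be skipped, the images can be pre-pulled on a slow link; a sketch using the same mirror repository as the init command below:

kubeadm config images pull \
    --image-repository registry.aliyuncs.com/google_containers \
    --kubernetes-version v1.30.0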

Initializing your control-plane node

kubeadm init --image-repository registry.aliyuncs.com/google_containers        \
    --service-cidr="10.130.0.0/16"                                             \
    --pod-network-cidr="10.244.0.0/16"

  • Considerations about apiserver-advertise-address and ControlPlaneEndpoint ⇒ can be skipped (single-node deployment)
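
Once kubeadm init completes, its output tells you how to point kubectl at the new cluster; the standard steps (copied from the kubeadm output) for a regular user are shown below, or simply export KUBECONFIG=/etc/kubernetes/admin.conf when working as root:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config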

Installing a Pod network add-on

  • The br_netfilter kernel module must be loaded, otherwise the kube-flannel-ds-xxxxx Pods fail with the error Failed to check br_netfilter: stat /proc/sys/net/bridge/bridge-nf-call-iptables: no such file or directory. See https://github.com/flannel-io/flannel/issues/2166 for details.
cat > /etc/modules-load.d/k8s.conf <<EOF
br_netfilter
EOF
systemctl restart systemd-modules-load.service
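
To confirm the module is loaded and the bridge sysctl is now present:

lsmod | grep br_netfilter
sysctl net.bridge.bridge-nf-call-iptables    # the key only exists once br_netfilter is loaded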

  • We use the Flannel (https://github.com/flannel-io/flannel) add-on and deploy it via manifest: wget https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml; kubectl apply -f kube-flannel.yml;
  • This add-on is also what dictates the --pod-network-cidr=10.244.0.0/16 flag above; for any other CIDR we would need to edit the Network field in kube-flannel.yml (see the fragment below);
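
For reference, the piece of kube-flannel.yml that would need to change for a different Pod CIDR is the net-conf.json key in the kube-flannel-cfg ConfigMap; it looks roughly like this (quoted from the upstream manifest, backend type may differ per release):

  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }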

Control plane node isolation

  • For this deployment: kubectl taint nodes --all node-role.kubernetes.io/control-plane-

Verifying the Cluster

# kubectl get pods --all-namespaces
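
A few more checks worth running on a fresh single-node cluster (standard kubectl invocations; the node name is assumed to match the hostname):

kubectl get nodes -o wide                            # the node should be Ready
kubectl -n kube-flannel get pods                     # kube-flannel-ds-xxxxx should be Running
kubectl describe node $(hostname) | grep -i taint    # confirms the control-plane taint was removed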

Common Issues

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "40c44e9f3924c681464d3a8d7ba021316a9c2fcffb33a9cdbe01ed0f977eeedf": plugin type="cilium-cni" failed (add): unable to connect to Cilium daemon: failed to create cilium agent client after 30.000000 seconds timeout: Get "http://localhost/v1/config": dial unix /var/run/cilium/cilium.sock: connect: no such file or directory

This is usually caused by a conflicting network add-on configuration or leftover Cilium CNI configuration from a previous installation.

Inspect the CNI configuration files under the /etc/cni/net.d/ directory, and remove any that belong to a CNI plugin that is no longer deployed (see the sketch below).
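
A minimal cleanup sketch; the Cilium file name below is illustrative (actual names vary by version), so list the directory first and delete only configurations of plugins that are no longer deployed:

ls -l /etc/cni/net.d/                                    # identify leftover configurations
rm /etc/cni/net.d/05-cilium.conflist                     # hypothetical leftover file; adjust to what ls shows
systemctl restart containerd.service kubelet.service     # have the runtime re-read the CNI configuration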