Kubernetes 切换到 Containerd
source link: https://mritd.com/2021/05/29/use-containerd-with-kubernetes/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
一、环境准备
- Ubuntu 20.04 x5
- Etcd 3.4.16
- Kubernetes 1.21.1
- Containerd 1.3.3
1.1、处理 IPVS
由于 Kubernetes 新版本 Service 实现切换到 IPVS,所以需要确保内核加载了 IPVS modules;以下命令将设置系统启动自动加载 IPVS 相关模块,执行完成后需要重启。
# Kernel modules
cat > /etc/modules-load.d/50-kubernetes.conf <<EOF
# Load some kernel modules needed by kubernetes at boot
nf_conntrack
br_netfilter
ip_vs
ip_vs_lc
ip_vs_wlc
ip_vs_rr
ip_vs_wrr
ip_vs_lblc
ip_vs_lblcr
ip_vs_dh
ip_vs_sh
ip_vs_fo
ip_vs_nq
ip_vs_sed
EOF
# sysctl
cat > /etc/sysctl.d/50-kubernetes.conf <<EOF
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-iptables=1
net.bridge.bridge-nf-call-ip6tables=1
fs.inotify.max_user_watches=525000
EOF
重启完成后务必检查相关 module 加载以及内核参数设置:
# check ipvs modules
➜ ~ lsmod | grep ip_vs
ip_vs_sed 16384 0
ip_vs_nq 16384 0
ip_vs_fo 16384 0
ip_vs_sh 16384 0
ip_vs_dh 16384 0
ip_vs_lblcr 16384 0
ip_vs_lblc 16384 0
ip_vs_wrr 16384 0
ip_vs_rr 16384 0
ip_vs_wlc 16384 0
ip_vs_lc 16384 0
ip_vs 155648 22 ip_vs_wlc,ip_vs_rr,ip_vs_dh,ip_vs_lblcr,ip_vs_sh,ip_vs_fo,ip_vs_nq,ip_vs_lblc,ip_vs_wrr,ip_vs_lc,ip_vs_sed
nf_conntrack 139264 1 ip_vs
nf_defrag_ipv6 24576 2 nf_conntrack,ip_vs
libcrc32c 16384 5 nf_conntrack,btrfs,xfs,raid456,ip_vs
# check sysctl
➜ ~ sysctl -a | grep ip_forward
net.ipv4.ip_forward = 1
net.ipv4.ip_forward_update_priority = 1
net.ipv4.ip_forward_use_pmtu = 0
➜ ~ sysctl -a | grep bridge-nf-call
net.bridge.bridge-nf-call-arptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
1.2、安装 Containerd
Containerd 在 Ubuntu 20 中已经在默认官方仓库中包含,所以只需要 apt 安装即可:
# 其他软件包后面可能会用到,所以顺手装了
apt install containerd bridge-utils nfs-common tree -y
安装成功后可以通过执行 ctr images ls
命令验证,本章节不会对 Containerd 配置做说明,Containerd 配置文件将在 Kubernetes 安装时进行配置。
二、安装 kubernetes
2.1、安装 Etcd 集群
Etcd 对于 Kubernetes 来说是核心中的核心,所以个人还是比较喜欢在宿主机安装;宿主机安装情况下为了方便我打包了一些 *-pack
的工具包,用于快速处理:
安装 CFSSL 和 ETCD
# 下载安装包
wget https://github.com/mritd/etcd-pack/releases/download/v3.4.16/etcd_v3.4.16.run
wget https://github.com/mritd/cfssl-pack/releases/download/v1.5.0/cfssl_v1.5.0.run
# 安装 cfssl 和 etcd
chmod +x *.run
./etcd_v3.4.16.run install
./cfssl_v1.5.0.run install
安装完成后,自行调整 /etc/cfssl/etcd/etcd-csr.json
相关 IP,然后执行同目录下 create.sh
生成证书。
➜ ~ cat /etc/cfssl/etcd/etcd-csr.json
{
"key": {
"algo": "rsa",
"size": 2048
},
"names": [
{
"O": "etcd",
"OU": "etcd Security",
"L": "Beijing",
"ST": "Beijing",
"C": "CN"
}
],
"CN": "etcd",
"hosts": [
"127.0.0.1",
"localhost",
"*.etcd.node",
"*.kubernetes.node",
"10.0.0.11",
"10.0.0.12",
"10.0.0.13"
]
}
# 复制到 3 台 master
➜ ~ for ip in `seq 1 3`; do scp /etc/cfssl/etcd/*.pem [email protected]$ip:/etc/etcd/ssl; done
证书生成完成后调整每台机器的 Etcd 配置文件,然后修复权限启动。
# 复制配置
for ip in `seq 1 3`; do scp /etc/etcd/etcd.cluster.yaml [email protected]$ip:/etc/etcd/etcd.yaml; done
# 修复权限
for ip in `seq 1 3`; do ssh [email protected]$ip chown -R etcd:etcd /etc/etcd; done
# 每台机器启动
systemctl start etcd
启动完成后通过 etcdctl
验证集群状态:
# 稳妥点应该执行 etcdctl endpoint health
➜ ~ etcdctl member list
55fcbe0adaa45350, started, etcd3, https://10.0.0.13:2380, https://10.0.0.13:2379, false
cebdf10928a06f3c, started, etcd1, https://10.0.0.11:2380, https://10.0.0.11:2379, false
f7a9c20602b8532e, started, etcd2, https://10.0.0.12:2380, https://10.0.0.12:2379, false
2.2、安装 kubeadm
kubeadm 国内用户建议使用 aliyun 的安装源:
# kubeadm
apt-get install -y apt-transport-https
curl https://mirrors.aliyun.com/kubernetes/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb https://mirrors.aliyun.com/kubernetes/apt/ kubernetes-xenial main
EOF
apt update
# ebtables、ethtool kubelet 可能会用,具体忘了,反正从官方文档上看到的
apt install kubelet kubeadm kubectl ebtables ethtool -y
2.3、安装 kube-apiserver-proxy
kube-apiserver-proxy 是我自己编译的一个仅开启四层代理的 Nginx,其主要负责监听 127.0.0.1:6443
并负载到所有的 Api Server 地址(0.0.0.0:5443
):
wget https://github.com/mritd/kube-apiserver-proxy-pack/releases/download/v1.20.0/kube-apiserver-proxy_v1.20.0.run
chmod +x *.run
./kube-apiserver-proxy_v1.20.0.run install
安装完成后根据 IP 地址不同自行调整 Nginx 配置文件,然后启动:
➜ ~ cat /etc/kubernetes/apiserver-proxy.conf
error_log syslog:server=unix:/dev/log notice;
worker_processes auto;
events {
multi_accept on;
use epoll;
worker_connections 1024;
}
stream {
upstream kube_apiserver {
least_conn;
server 10.0.0.11:5443;
server 10.0.0.12:5443;
server 10.0.0.13:5443;
}
server {
listen 0.0.0.0:6443;
proxy_pass kube_apiserver;
proxy_timeout 10m;
proxy_connect_timeout 1s;
}
}
systemctl start kube-apiserver-proxy
2.4、安装 kubeadm-config
kubeadm-config 是一系列配置文件的组合以及 kubeadm 安装所需的必要镜像文件的打包,安装完成后将会自动配置 Containerd、ctrictl 等:
wget https://github.com/mritd/kubeadm-config-pack/releases/download/v1.21.1/kubeadm-config_v1.21.1.run
chmod +x *.run
# --load 选项用于将 kubeadm 所需镜像 load 到 containerd 中
./kubeadm-config_v1.21.1.run install --load
2.4.1、containerd 配置
Containerd 配置位于 /etc/containerd/config.toml
,其配置如下:
version = 2
# 指定存储根目录
root = "/data/containerd"
state = "/run/containerd"
# OOM 评分
oom_score = -999
[grpc]
address = "/run/containerd/containerd.sock"
[metrics]
address = "127.0.0.1:1234"
[plugins]
[plugins."io.containerd.grpc.v1.cri"]
# sandbox 镜像
sandbox_image = "k8s.gcr.io/pause:3.4.1"
[plugins."io.containerd.grpc.v1.cri".containerd]
snapshotter = "overlayfs"
default_runtime_name = "runc"
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
runtime_type = "io.containerd.runc.v2"
# 开启 systemd cgroup
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true
2.4.2、crictl 配置
在切换到 Containerd 以后意味着以前的 docker
命令将不再可用,containerd 默认自带了一个 ctr
命令,同时 CRI 规范会自带一个 crictl
命令;crictl
命令配置文件存放在 /etc/crictl.yaml
中:
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
pull-image-on-create: true
2.4.3、kubeadm 配置
kubeadm 配置目前分为 2 个,一个是用于首次引导启动的 init 配置,另一个是用于其他节点 join 到 master 的配置;其中比较重要的 init 配置如下:
# /etc/kubernetes/kubeadm.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
# kubeadm token create
bootstrapTokens:
- token: "c2t0rj.cofbfnwwrb387890"
nodeRegistration:
# CRI 地址(Containerd)
criSocket: unix:///run/containerd/containerd.sock
kubeletExtraArgs:
runtime-cgroups: "/system.slice/containerd.service"
rotate-server-certificates: "true"
localAPIEndpoint:
advertiseAddress: "10.0.0.11"
bindPort: 5443
# kubeadm certs certificate-key
certificateKey: 31f1e534733a1607e5ba67b2834edd3a7debba41babb1fac1bee47072a98d88b
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
clusterName: "kuberentes"
kubernetesVersion: "v1.21.1"
certificatesDir: "/etc/kubernetes/pki"
# Other components of the current control plane only connect to the apiserver on the current host.
# This is the expected behavior, see: https://github.com/kubernetes/kubeadm/issues/2271
controlPlaneEndpoint: "127.0.0.1:6443"
etcd:
external:
endpoints:
- "https://10.0.0.11:2379"
- "https://10.0.0.12:2379"
- "https://10.0.0.13:2379"
caFile: "/etc/etcd/ssl/etcd-ca.pem"
certFile: "/etc/etcd/ssl/etcd.pem"
keyFile: "/etc/etcd/ssl/etcd-key.pem"
networking:
serviceSubnet: "10.66.0.0/16"
podSubnet: "10.88.0.1/16"
dnsDomain: "cluster.local"
apiServer:
extraArgs:
v: "4"
alsologtostderr: "true"
# audit-log-maxage: "21"
# audit-log-maxbackup: "10"
# audit-log-maxsize: "100"
# audit-log-path: "/var/log/kube-audit/audit.log"
# audit-policy-file: "/etc/kubernetes/audit-policy.yaml"
authorization-mode: "Node,RBAC"
event-ttl: "720h"
runtime-config: "api/all=true"
service-node-port-range: "30000-50000"
service-cluster-ip-range: "10.66.0.0/16"
# insecure-bind-address: "0.0.0.0"
# insecure-port: "8080"
# The fraction of requests that will be closed gracefully(GOAWAY) to prevent
# HTTP/2 clients from getting stuck on a single apiserver.
goaway-chance: "0.001"
# extraVolumes:
# - name: "audit-config"
# hostPath: "/etc/kubernetes/audit-policy.yaml"
# mountPath: "/etc/kubernetes/audit-policy.yaml"
# readOnly: true
# pathType: "File"
# - name: "audit-log"
# hostPath: "/var/log/kube-audit"
# mountPath: "/var/log/kube-audit"
# pathType: "DirectoryOrCreate"
certSANs:
- "*.kubernetes.node"
- "10.0.0.11"
- "10.0.0.12"
- "10.0.0.13"
timeoutForControlPlane: 1m
controllerManager:
extraArgs:
v: "4"
node-cidr-mask-size: "19"
deployment-controller-sync-period: "10s"
experimental-cluster-signing-duration: "8670h"
node-monitor-grace-period: "20s"
pod-eviction-timeout: "2m"
terminated-pod-gc-threshold: "30"
scheduler:
extraArgs:
v: "4"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false
oomScoreAdj: -900
cgroupDriver: "systemd"
kubeletCgroups: "/system.slice/kubelet.service"
nodeStatusUpdateFrequency: 5s
rotateCertificates: true
evictionSoft:
"imagefs.available": "15%"
"memory.available": "512Mi"
"nodefs.available": "15%"
"nodefs.inodesFree": "10%"
evictionSoftGracePeriod:
"imagefs.available": "3m"
"memory.available": "1m"
"nodefs.available": "3m"
"nodefs.inodesFree": "1m"
evictionHard:
"imagefs.available": "10%"
"memory.available": "256Mi"
"nodefs.available": "10%"
"nodefs.inodesFree": "5%"
evictionMaxPodGracePeriod: 30
imageGCLowThresholdPercent: 70
imageGCHighThresholdPercent: 80
kubeReserved:
"cpu": "500m"
"memory": "512Mi"
"ephemeral-storage": "1Gi"
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# kube-proxy specific options here
clusterCIDR: "10.88.0.1/16"
mode: "ipvs"
oomScoreAdj: -900
ipvs:
minSyncPeriod: 5s
syncPeriod: 5s
scheduler: "wrr"
init 配置具体含义请自行参考官方文档,相对于 init 配置,join 配置比较简单,不过需要注意的是如果需要 join 为 master 则需要 controlPlane
这部分,否则请注释掉 controlPlane
。
# /etc/kubernetes/kubeadm-join.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
controlPlane:
localAPIEndpoint:
advertiseAddress: "10.0.0.12"
bindPort: 5443
certificateKey: 31f1e534733a1607e5ba67b2834edd3a7debba41babb1fac1bee47072a98d88b
discovery:
bootstrapToken:
apiServerEndpoint: "127.0.0.1:6443"
token: "c2t0rj.cofbfnwwrb387890"
# Please replace with the "--discovery-token-ca-cert-hash" value printed
# after the kubeadm init command is executed successfully
caCertHashes:
- "sha256:97590810ae34a82501717e33acfca76f16044f1a365c5ad9a1c66433c386c75c"
nodeRegistration:
criSocket: unix:///run/containerd/containerd.sock
kubeletExtraArgs:
runtime-cgroups: "/system.slice/containerd.service"
rotate-server-certificates: "true"
2.5、拉起 master
在调整好配置后,拉起 master 节点只需要一条命令:
kubeadm init --config /etc/kubernetes/kubeadm.yaml --upload-certs --ignore-preflight-errors=Swap
拉起完成后记得保存相关 Token 以便于后续使用。
2.6、拉起其他 master
在第一个 master 启动完成后,使用 join
命令让其他 master 加入即可;需要注意的是 kubeadm-join.yaml
配置中需要替换 caCertHashes
为第一个 master 拉起后的 discovery-token-ca-cert-hash
的值。
kubeadm join 127.0.0.1:6443 --config /etc/kubernetes/kubeadm-join.yaml --ignore-preflight-errors=Swap
2.7、拉起其他 node
node 节点拉起与拉起其他 master 节点一样,唯一不同的是需要注释掉配置中的 controlPlane
部分。
# /etc/kubernetes/kubeadm-join.yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
#controlPlane:
# localAPIEndpoint:
# advertiseAddress: "10.0.0.12"
# bindPort: 5443
# certificateKey: 31f1e534733a1607e5ba67b2834edd3a7debba41babb1fac1bee47072a98d88b
discovery:
bootstrapToken:
apiServerEndpoint: "127.0.0.1:6443"
token: "c2t0rj.cofbfnwwrb387890"
# Please replace with the "--discovery-token-ca-cert-hash" value printed
# after the kubeadm init command is executed successfully
caCertHashes:
- "sha256:97590810ae34a82501717e33acfca76f16044f1a365c5ad9a1c66433c386c75c"
nodeRegistration:
criSocket: unix:///run/containerd/containerd.sock
kubeletExtraArgs:
runtime-cgroups: "/system.slice/containerd.service"
rotate-server-certificates: "true"
kubeadm join 127.0.0.1:6443 --config /etc/kubernetes/kubeadm-join.yaml --ignore-preflight-errors=Swap
2.8、其他处理
由于 kubelet 开启了证书轮转,所以新集群会有大量 csr 请求,批量允许即可:
kubectl get csr | grep Pending | awk '{print $1}' | xargs kubectl certificate approve
同时为了 master 节点也能负载 pod,需要调整污点:
kubectl taint nodes --all node-role.kubernetes.io/master-
后续 CNI 等不在本文内容范围内。
三、Containerd 常用操作
# 列出镜像
ctr images ls
# 列出 k8s 镜像
ctr -n k8s.io images ls
# 导入镜像
ctr -n k8s.io images import xxxx.tar
# 导出镜像
ctr -n k8s.io images export kube-scheduler.tar k8s.gcr.io/kube-scheduler:v1.21.1
四、资源仓库
本文中所有 *-pack
仓库地址如下:
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK