Notes on Batch Kernel Upgrades for CentOS Linux 7 Nodes in a K8s Cluster

 1 year ago
source link: https://liruilongs.github.io/2023/02/01/%E5%BE%85%E5%8F%91%E5%B8%83/%E4%BA%8C%E6%9C%9F/%E5%85%B3%E4%BA%8ELinux%E7%B3%BB%E7%BB%9F%E5%86%85%E6%A0%B8%E5%8D%87%E7%BA%A7%E7%9A%84%E7%AC%94%E8%AE%B0%E6%95%B4%E7%90%86/

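For every person there is only one true duty: to find oneself, and then to hold fast to it within for a lifetime, wholeheartedly and without ceasing. All other paths are incomplete; they are ways of escape, a cowardly return to the ideals of the masses, a drifting with the current, a fear of one's own inner self. (Hermann Hesse, Demian)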


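  • A pre-check run while installing an observability tool on the k8s cluster showed that the kernel version was too old to be supported, so I decided to upgrade it.
  • This is a lab environment, so there was not much to worry about.
  • For a production upgrade, work out an error budget first; ideally back up with Velero and be prepared to migrate the cluster.
  • Newer kernels support cgroup v2, which is worth considering when deploying a new cluster.
  • My understanding is limited; corrections are welcome.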
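
As a small illustration of the cgroup v2 point above: the filesystem type mounted at /sys/fs/cgroup tells you which cgroup version a node is using (cgroup2fs for v2, tmpfs for v1). The sketch below only maps that type string to a label; the function name cgroup_mode is made up for this example.

```shell
# cgroup_mode FSTYPE
# Maps the filesystem type of /sys/fs/cgroup to a cgroup version label.
# (cgroup_mode is a hypothetical helper name, not part of any tool.)
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo "v2" ;;       # unified hierarchy
        tmpfs)     echo "v1" ;;       # legacy hierarchy
        *)         echo "unknown" ;;  # anything else is not classified here
    esac
}
```

On a node it could be called as: cgroup_mode "$(stat -fc %T /sys/fs/cgroup/)"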


The local k8s cluster runs on CentOS Linux 7 (Core):

┌──[[email protected]]-[~]
└─$kubectl get nodes
NAME STATUS ROLES AGE VERSION
vms100.liruilongs.github.io Ready control-plane 6d4h v1.25.1
vms101.liruilongs.github.io Ready control-plane 6d4h v1.25.1
vms102.liruilongs.github.io Ready control-plane 6d4h v1.25.1
vms103.liruilongs.github.io Ready <none> 6d4h v1.25.1
vms105.liruilongs.github.io Ready <none> 6d4h v1.25.1
vms106.liruilongs.github.io Ready <none> 6d4h v1.25.1
vms107.liruilongs.github.io Ready <none> 6d4h v1.25.1
vms108.liruilongs.github.io Ready <none> 6d4h v1.25.1
┌──[[email protected]]-[~]
└─$

The kernel version is Linux 3.10.0-693.el7.x86_64:

┌──[[email protected]]-[~/ansible/pixie]
└─$hostnamectl
Static hostname: vms100.liruilongs.github.io
Icon name: computer-vm
Chassis: vm
Machine ID: e93ae3f6cb354f3ba509eeb73568087e
Boot ID: 5ed408a863df48ae80b51f1b6c4be85f
Virtualization: vmware
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-693.el7.x86_64
Architecture: x86-64
┌──[[email protected]]-[~/ansible/pixie]
└─$

While installing an observability tool, the pre-check reported that the kernel version was too old:

┌──[[email protected]]-[~/ansible/pixie]
└─$px deploy --check_only
Pixie CLI

Running Cluster Checks:
✕ Kernel version > 4.14.0 ERR: kernel version for node (vms100.liruilongs.github.io) not supported. Must have minimum kernel version of (4.14.0)
Check pre-check has failed. To bypass pass in --check=false. error=kernel version for node (vms100.liruilongs.github.io) not supported. Must have minimum kernel version of (4.14.0)
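
The same comparison can be done by hand before running the tool. The following is a minimal sketch, assuming GNU sort with -V (version sort) is available and that "at least 4.14.0" is the intended rule, as the error message suggests; the function name kernel_at_least is hypothetical.

```shell
# kernel_at_least VERSION MINIMUM
# Succeeds (exit status 0) when VERSION is greater than or equal to MINIMUM.
# The release suffix after the first "-" (e.g. "-693.el7.x86_64") is ignored.
kernel_at_least() {
    local current="${1%%-*}"
    local minimum="$2"
    local lowest
    # With version sort, the smaller of the two versions comes first.
    lowest="$(printf '%s\n%s\n' "$current" "$minimum" | sort -V | head -n 1)"
    [ "$lowest" = "$minimum" ]
}
```

On a node it could be used as: kernel_at_least "$(uname -r)" 4.14.0 && echo "kernel is new enough"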

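So I decided to upgrade the kernel.

The plan: upgrade one machine first, confirm that nothing is wrong, and run some simple tests against the cluster. If the cluster is still running normally after half an hour, upgrade the remaining nodes in a batch with Ansible.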

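The official Linux kernel has to be downloaded from https://www.kernel.org/ and then compiled and installed.

Most Linux distributions ship kernels that they maintain themselves, which can be upgraded through package managers such as yum, dnf, or rpm.

ELRepo is an RPM repository for Enterprise Linux packages that provides drivers and kernel images. It supports Red Hat Enterprise Linux (RHEL) and its rebuilds.

The ELRepo project focuses on hardware-related packages to enhance your experience with Enterprise Linux. This includes filesystem drivers, graphics drivers, network drivers, sound drivers, and webcam and video drivers.

ELRepo website: http://elrepo.org/tiki/tiki-index.php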

#List the kernel versions that yum can upgrade to
yum list kernel --showduplicates
#If the version you need is in the list, you can upgrade directly with update; usually it is not, so follow the steps below

#Import the public key of the ELRepo repository
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

#Install ELRepo on CentOS 7
yum install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
#Install ELRepo on CentOS 8
yum install https://www.elrepo.org/elrepo-release-8.el8.elrepo.noarch.rpm

#List the kernel versions provided by ELRepo
yum --disablerepo="*" --enablerepo="elrepo-kernel" list available

(The above is taken from CSDN: https://blog.csdn.net/m0_37642477/article/details/123970790)

Kernel upgrade

Upgrade a single machine on its own first.

Install ELRepo on CentOS 7 and import the public key of the ELRepo repository:

┌──[[email protected]]-[~/back]
└─$yum -y install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
┌──[[email protected]]-[~/back]
└─$rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

List the kernel versions provided by ELRepo:

┌──[[email protected]]-[~/back]
└─$yum --disablerepo="*" --enablerepo="elrepo-kernel" list available
Loaded plugins: fastestmirror
elrepo-kernel | 3.0 kB 00:00:00
elrepo-kernel/primary_db | 2.1 MB 00:01:40
Loading mirror speeds from cached hostfile
* elrepo-kernel: ftp.yz.yamagata-u.ac.jp
Available Packages
kernel-lt.x86_64 5.4.230-1.el7.elrepo elrepo-kernel
kernel-lt-devel.x86_64 5.4.230-1.el7.elrepo elrepo-kernel
kernel-lt-doc.noarch 5.4.230-1.el7.elrepo elrepo-kernel
kernel-lt-headers.x86_64 5.4.230-1.el7.elrepo elrepo-kernel
kernel-lt-tools.x86_64 5.4.230-1.el7.elrepo elrepo-kernel
kernel-lt-tools-libs.x86_64 5.4.230-1.el7.elrepo elrepo-kernel
kernel-lt-tools-libs-devel.x86_64 5.4.230-1.el7.elrepo elrepo-kernel
kernel-ml.x86_64 6.1.8-1.el7.elrepo elrepo-kernel
kernel-ml-devel.x86_64 6.1.8-1.el7.elrepo elrepo-kernel
kernel-ml-doc.noarch 6.1.8-1.el7.elrepo elrepo-kernel
kernel-ml-headers.x86_64 6.1.8-1.el7.elrepo elrepo-kernel
kernel-ml-tools.x86_64 6.1.8-1.el7.elrepo elrepo-kernel
kernel-ml-tools-libs.x86_64 6.1.8-1.el7.elrepo elrepo-kernel
kernel-ml-tools-libs-devel.x86_64 6.1.8-1.el7.elrepo elrepo-kernel
perf.x86_64 5.4.230-1.el7.elrepo elrepo-kernel
python-perf.x86_64 5.4.230-1.el7.elrepo elrepo-kernel
┌──[[email protected]]-[~/back]
└─$
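  • kernel-lt: "lt" stands for long-term, i.e. the long-term support kernel; 5.4.230 in the listing above.
  • kernel-ml: "ml" stands for mainline, i.e. the current mainline kernel; 6.1.8 in the listing above.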
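
To pick out the version of one package from that listing in a script, the output can be filtered with awk. This is a minimal sketch that assumes each package sits on a single line with the name in the first column and the version in the second, as in the listing above; the function name available_version is hypothetical.

```shell
# available_version PACKAGE
# Reads "yum list available" output on stdin and prints the version column
# of the first line whose package column equals PACKAGE.
available_version() {
    local package="$1"
    awk -v pkg="$package" '$1 == pkg { print $2; exit }'
}
```

For example: yum --disablerepo="*" --enablerepo="elrepo-kernel" list available | available_version kernel-lt.x86_64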

Here we go with the long-term support version and install it directly:

#Long-term support kernel
┌──[[email protected]]-[~/back]
└─$yum -y --enablerepo=elrepo-kernel install kernel-lt.x86_64

List the kernels available on the system and set the boot entry:

┌──[[email protected]]-[~/back]
└─$sudo awk -F\' '$1=="menuentry " {print i++ " : " $2}' /etc/grub2.cfg
0 : CentOS Linux (5.4.230-1.el7.elrepo.x86_64) 7 (Core)
1 : CentOS Linux 7 Rescue e93ae3f6cb354f3ba509eeb73568087e (3.10.0-1160.83.1.el7.x86_64)
2 : CentOS Linux (3.10.0-1160.83.1.el7.x86_64) 7 (Core)
3 : CentOS Linux (3.10.0-693.el7.x86_64) 7 (Core)
4 : CentOS Linux (0-rescue-80c608ceab5342779ba1adc2ac29c213) 7 (Core)

Set the kernel to boot by default:

┌──[[email protected]]-[~/back]
└─$grub2-set-default 0
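
Passing 0 works here because the new kernel happens to be the first menu entry. If you would rather not rely on that, the index can be looked up by version string. The sketch below reuses the awk approach from the listing step; the function name grub_entry_index is hypothetical, and it assumes menuentry lines start at the beginning of the line, as they do in the output above.

```shell
# grub_entry_index CFG VERSION
# Prints the zero-based index of the first menuentry in CFG whose title
# contains VERSION. Prints nothing when no entry matches.
grub_entry_index() {
    local cfg="$1" version="$2"
    awk -F\' -v ver="$version" '
        BEGIN { i = 0 }
        $1 == "menuentry " {
            # $2 is the quoted menu title; index() does a literal substring match
            if (index($2, ver) > 0) { print i; exit }
            i++
        }
    ' "$cfg"
}
```

For example: grub2-set-default "$(grub_entry_index /etc/grub2.cfg 5.4.230-1.el7.elrepo.x86_64)". Note that a rescue entry can also carry a kernel version in its title, so check the result when searching for an older version.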

Regenerate the grub configuration file:


┌──[[email protected]]-[~/back]
└─$grub2-mkconfig -o /boot/grub2/grub.cfg
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-5.4.230-1.el7.elrepo.x86_64
Found initrd image: /boot/initramfs-5.4.230-1.el7.elrepo.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-1160.83.1.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-1160.83.1.el7.x86_64.img
Found linux image: /boot/vmlinuz-3.10.0-693.el7.x86_64
Found initrd image: /boot/initramfs-3.10.0-693.el7.x86_64.img
Found linux image: /boot/vmlinuz-0-rescue-80c608ceab5342779ba1adc2ac29c213
Found initrd image: /boot/initramfs-0-rescue-80c608ceab5342779ba1adc2ac29c213.img
Found linux image: /boot/vmlinuz-0-rescue-e93ae3f6cb354f3ba509eeb73568087e
Found initrd image: /boot/initramfs-0-rescue-e93ae3f6cb354f3ba509eeb73568087e.img
done

Reboot the system and verify:

┌──[[email protected]]-[~/back]
└─$reboot
Connection to 192.168.26.100 closed by remote host.
Connection to 192.168.26.100 closed.
....
┌──[[email protected]]-[~]
└─$hostnamectl
Static hostname: vms100.liruilongs.github.io
Icon name: computer-vm
Chassis: vm
Machine ID: e93ae3f6cb354f3ba509eeb73568087e
Boot ID: a1150b6d97dc4afbb81dae58f131a487
Virtualization: vmware
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
Architecture: x86-64
┌──[[email protected]]-[~]
└─$

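Once that was confirmed to be fine, I ran some simple tests against the cluster, waited half an hour, and then upgraded the rest in a batch.

Write the upgrade script: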

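#!/bin/bash

#@File : update_kernel
#@Time : 2023/02/01 23:58:23
#@Author : Li Ruilong
#@Version : 1.0
#@Desc : Batch kernel upgrade script for CentOS 7
#@Contact : [email protected]

# Install the ELRepo release package, which adds the elrepo-kernel repository
yum -y install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm

# Import the public key of the ELRepo repository
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org

# Install the long-term support kernel; stop here if that fails,
# so the node is not rebooted without a new kernel
yum -y --enablerepo=elrepo-kernel install kernel-lt.x86_64 || exit 1

# Boot the first menu entry (the newly installed kernel) by default
grub2-set-default 0

# Regenerate the grub configuration file
grub2-mkconfig -o /boot/grub2/grub.cfg

reboot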

Copy the script to the nodes to be upgraded:

┌──[[email protected]]-[~/ansible]
└─$ansible ansible_node -m copy -a "src=./update_kernel/update_kernel.sh dest=/tmp/" -i host.yaml
┌──[[email protected]]-[~/ansible]
└─$ansible ansible_node -m shell -a "cat /tmp/update_kernel.sh" -i host.yaml

Run the upgrade script:

┌──[[email protected]]-[~/ansible]
└─$ansible ansible_node -m shell -a "/usr/bin/bash /tmp/update_kernel.sh" -i host.yaml -f 7 -vvv
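
With -f 7, Ansible runs the script on up to seven hosts in parallel, so the nodes reboot at roughly the same time. That is acceptable in a lab; for anything more sensitive, one option is to go node by node and drain each node before rebooting it. The sketch below only prints such a plan rather than executing it. The function name is hypothetical, the node names are passed in as arguments, and the upgrade step itself is left as a placeholder comment because the inventory host names may differ from the Kubernetes node names.

```shell
# drain_upgrade_uncordon_plan NODE...
# Prints, for each node, the commands for a one-node-at-a-time upgrade.
# Nothing is executed; review the plan before running any of it.
drain_upgrade_uncordon_plan() {
    local node
    for node in "$@"; do
        printf 'kubectl drain %s --ignore-daemonsets --delete-emptydir-data\n' "$node"
        printf '# upgrade and reboot %s here, then wait until it reports Ready\n' "$node"
        printf 'kubectl uncordon %s\n' "$node"
    done
}
```

For example: drain_upgrade_uncordon_plan vms103.liruilongs.github.io vms105.liruilongs.github.io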

After the upgrade completes, check the kernel version to confirm:

┌──[[email protected]]-[~/ansible]
└─$ansible ansible_node -m shell -a 'hostnamectl | grep Kernel' -i host.yaml
192.168.26.106 | CHANGED | rc=0 >>
Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
192.168.26.105 | CHANGED | rc=0 >>
Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
192.168.26.102 | CHANGED | rc=0 >>
Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
192.168.26.103 | CHANGED | rc=0 >>
Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
192.168.26.101 | CHANGED | rc=0 >>
Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
192.168.26.107 | CHANGED | rc=0 >>
Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
192.168.26.108 | CHANGED | rc=0 >>
Kernel: Linux 5.4.230-1.el7.elrepo.x86_64
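
With more hosts, reading this output by eye gets tedious. A minimal sketch that lists only the hosts still on a different kernel, assuming the two-line-per-host format shown above (a "host | STATUS | rc=0 >>" header followed by a "Kernel:" line); the function name hosts_not_on_kernel is hypothetical.

```shell
# hosts_not_on_kernel EXPECTED
# Reads the ad-hoc ansible output on stdin and prints every host whose
# "Kernel:" line does not end with EXPECTED.
hosts_not_on_kernel() {
    local expected="$1"
    awk -v want="$expected" '
        $2 == "|"       { host = $1; next }            # header line: remember the host
        $1 == "Kernel:" { if ($NF != want) print host } # last field is the kernel release
    '
}
```

For example: ansible ansible_node -m shell -a 'hostnamectl | grep Kernel' -i host.yaml | hosts_not_on_kernel 5.4.230-1.el7.elrepo.x86_64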

Check the cluster status to confirm:

┌──[[email protected]]-[~/ansible]
└─$kubectl get nodes
NAME STATUS ROLES AGE VERSION
vms100.liruilongs.github.io Ready control-plane 6d5h v1.25.1
vms101.liruilongs.github.io Ready control-plane 6d5h v1.25.1
vms102.liruilongs.github.io Ready control-plane 6d5h v1.25.1
vms103.liruilongs.github.io Ready <none> 6d4h v1.25.1
vms105.liruilongs.github.io Ready <none> 6d4h v1.25.1
vms106.liruilongs.github.io Ready <none> 6d4h v1.25.1
vms107.liruilongs.github.io Ready <none> 6d4h v1.25.1
vms108.liruilongs.github.io Ready <none> 6d4h v1.25.1
┌──[[email protected]]-[~/ansible]
└─$
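
The same check can be scripted. A minimal sketch that prints every node whose STATUS column is not exactly Ready, assuming the default kubectl get nodes column layout shown above; the function name not_ready_nodes is hypothetical. A cordoned node shows up as "Ready,SchedulingDisabled" and is therefore listed too.

```shell
# not_ready_nodes
# Reads "kubectl get nodes" output on stdin, skips the header line, and
# prints the name of every node whose STATUS is not exactly "Ready".
not_ready_nodes() {
    awk 'NR > 1 && $2 != "Ready" { print $1 }'
}
```

For example: kubectl get nodes | not_ready_nodes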

Run the original tool's check again:

┌──[[email protected]]-[~/ansible/pixie]
└─$px deploy --check_only
Pixie CLI

Running Cluster Checks:
✔ Kernel version > 4.14.0
✔ Cluster type is supported
✔ K8s version > 1.16.0
✔ Kubectl > 1.10.0 is present
✔ User can create namespace
INFO[0002] All Required Checks Passed!
┌──[[email protected]]-[~/ansible/pixie]
└─$

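References for parts of this post

Copyright of the referenced content belongs to the original authors; please let me know of any infringement.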


https://www.kernel.org/

http://elrepo.org/tiki/tiki-index.php

https://blog.csdn.net/m0_37642477/article/details/123970790


© 2018-2023 [email protected], All rights reserved. Attribution, non-commercial, free to repost, share alike (Creative Commons 3.0 license)

