6

Kubernetes(k8s)一次性任务:Job - 人生的哲理

 1 year ago
source link: https://www.cnblogs.com/renshengdezheli/p/17450685.html
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client

一.系统环境

本文主要基于Kubernetes1.21.9和Linux操作系统CentOS7.4。

服务器版本 docker软件版本 Kubernetes(k8s)集群版本 CPU架构
CentOS Linux release 7.4.1708 (Core) Docker version 20.10.12 v1.21.9 x86_64

Kubernetes集群架构:k8scloude1作为master节点,k8scloude2,k8scloude3作为worker节点。

服务器 操作系统版本 CPU架构 进程 功能描述
k8scloude1/192.168.110.130 CentOS Linux release 7.4.1708 (Core) x86_64 docker,kube-apiserver,etcd,kube-scheduler,kube-controller-manager,kubelet,kube-proxy,coredns,calico k8s master节点
k8scloude2/192.168.110.129 CentOS Linux release 7.4.1708 (Core) x86_64 docker,kubelet,kube-proxy,calico k8s worker节点
k8scloude3/192.168.110.128 CentOS Linux release 7.4.1708 (Core) x86_64 docker,kubelet,kube-proxy,calico k8s worker节点

Kubernetes是一个流行的容器编排平台,它为运行容器化应用程序提供了丰富的功能和工具。其中之一就是Kubernetes Job,它允许您在集群中运行一次性任务。

正常情况下,Kubernetes的工作负载控制器会尝试持续地将容器保持在运行状态,比如deploymentDaemonSetReplicationControllerReplicaSet 。但有时候,您可能需要运行短暂的任务或批处理作业,这些任务只需要运行一次并完成特定的操作。这时,就可以使用Kubernetes Job来管理这些一次性任务。

使用job任务的前提是已经有一套可以正常运行的Kubernetes集群,关于Kubernetes(k8s)集群的安装部署,可以查看博客《Centos7 安装部署Kubernetes(k8s)集群》https://www.cnblogs.com/renshengdezheli/p/16686769.html。

三.Kubernetes Job简介

Kubernetes Job是一个控制器,它可以创建和管理一次性任务。当Job被创建时,它将自动创建一个或多个Pod来运行任务,直到任务成功完成为止。

与其他控制器不同,Job控制器负责确保Pod成功地完成任务后删除它们。这意味着如果任务失败或调度失败(例如,由于资源不足),Kubernetes将自动重新启动失败的Pod,直到任务成功完成为止。

四.创建一次性任务job

4.1 创建一个简单任务的job

创建job的yaml文件存放目录

[root@k8scloude1 ~]# mkdir jobandcronjob
[root@k8scloude1 ~]# cd jobandcronjob/

创建namespace

[root@k8scloude1 jobandcronjob]# kubectl create ns job
namespace/job created

切换namespace为job

[root@k8scloude1 jobandcronjob]# kubens job
Context "kubernetes-admin@kubernetes" modified.
Active namespace is "job".

查看job任务

[root@k8scloude1 jobandcronjob]# kubectl get job
No resources found in job namespace.

[root@k8scloude1 jobandcronjob]# kubectl get pod
No resources found in job namespace.

查看创建job的帮助

[root@k8scloude1 jobandcronjob]# kubectl create job --help
Create a job with the specified name.
Examples:
  # Create a job
  kubectl create job my-job --image=busybox
  
  # Create a job with command
  kubectl create job my-job --image=busybox -- date
  
  # Create a job from a CronJob named "a-cronjob"
  kubectl create job test-job --from=cronjob/a-cronjob
......
Usage:
  kubectl create job NAME --image=image [--from=cronjob/name] -- [COMMAND] [args...] [options]

Use "kubectl options" for a list of global command-line options (applies to all commands).

生成创建job的yaml文件,注意:重启策略restartPolicy为Never,这表示如果容器退出,则Job将不会自动重启它。

yaml文件意思为:创建名为my-job的任务,它使用busybox镜像来创建一个容器,并将其用于执行任务。不过任务为空。

[root@k8scloude1 jobandcronjob]# kubectl create job my-job --image=busybox --dry-run=client -o yaml >myjob.yaml

[root@k8scloude1 jobandcronjob]# cat myjob.yaml 
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  name: my-job
spec:
  template:
    metadata:
      creationTimestamp: null
    spec:
      containers:
      - image: busybox
        name: my-job
        resources: {}
      restartPolicy: Never
status: {}

生成创建job的yaml文件,并执行命令"date -F ; sleep 10",一次性任务job就是date -F ; sleep 10输出当前时间并休眠10秒钟

[root@k8scloude1 jobandcronjob]# kubectl create job my-job --image=busybox --dry-run=client -o yaml -- sh -c "date ; sleep 10" >myjob.yaml

[root@k8scloude1 jobandcronjob]# cat myjob.yaml 
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  name: my-job
spec:
  template:
    metadata:
      creationTimestamp: null
    spec:
      containers:
      - command:
        - sh
        - -c
        - date ; sleep 10
        image: busybox
        name: my-job
        resources: {}
      #重启策略restartPolicy为Never,这表示如果容器退出,则Job将不会自动重启它。  
      restartPolicy: Never
status: {}

修改yaml文件并创建。

创建一个名为my-job的Job,并在其中定义一个容器,该容器将输出当前时间并休眠10秒钟。当my-job被创建时,Kubernetes将自动创建Pod来运行容器,并在容器成功完成任务后删除它们。

[root@k8scloude1 jobandcronjob]# vim myjob.yaml 

[root@k8scloude1 jobandcronjob]# cat myjob.yaml 
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  name: my-job
spec:
  template:
    metadata:
      creationTimestamp: null
    spec:
      #terminationGracePeriodSeconds属性为0,这意味着当Pod被终止时,Kubernetes将立即杀死容器而不等待容器完成处理。这可以防止容器在超时之前卡住或处于锁定状态。
      terminationGracePeriodSeconds: 0
      containers:
      - command:
        - sh
        - -c
        - date ; sleep 10
        image: busybox
        #imagePullPolicy属性设置为IfNotPresent,这意味着如果本地没有缓存的镜像,则从远程拉取镜像。如果本地已经有了镜像,则直接使用本地镜像而不去远程拉取。
        imagePullPolicy: IfNotPresent
        name: my-job
        resources: {}
      #重启策略restartPolicy为Never,这表示如果容器退出,则Job将不会自动重启它。    
      restartPolicy: Never
status: {}

#生成job任务
[root@k8scloude1 jobandcronjob]# kubectl apply -f myjob.yaml 
job.batch/my-job created

查看job任务,观察pod的状态。

可以发现my-job需要完成一次任务,任务完成之后,pod的状态变为Completed,job的COMPLETIONS变为1/1

[root@k8scloude1 jobandcronjob]# kubectl get job
NAME     COMPLETIONS   DURATION   AGE
my-job   0/1           6s         6s

[root@k8scloude1 jobandcronjob]# kubectl get pod
NAME           READY   STATUS    RESTARTS   AGE
my-job-pgnf2   1/1     Running   0          8s

[root@k8scloude1 jobandcronjob]# kubectl get pod
NAME           READY   STATUS      RESTARTS   AGE
my-job-pgnf2   0/1     Completed   0          11s

[root@k8scloude1 jobandcronjob]# kubectl get job
NAME     COMPLETIONS   DURATION   AGE
my-job   1/1           12s        2m17s

删除job任务

[root@k8scloude1 jobandcronjob]# kubectl delete job my-job 
job.batch "my-job" deleted

4.2 创建需要执行多次的job任务

某些时候,任务需要确保被完成多次,才能确认任务被正常完成,这时候可以使用completions设置job的完成次数。

修改yaml文件。

有几个重要的参数要解释下:

  • backoffLimit: 4 如果Job失败,则重试4次。
  • completions: 6 Job结束需要成功运行的Pod个数,即状态为Completed的Pod数。在本例中,这意味着当有6个Pod完成任务后,Job将终止。
  • parallelism: 2 一次性运行2个Pod,这意味着Kubernetes将同时启动2个Pod来并行执行任务。

如下yaml文件表示:创建一个名为my-job的Job对象,并在其中定义一个容器,该容器将输出当前时间并休眠10秒钟。当my-job被创建时,Kubernetes将自动创建2个Pod来运行容器,并在容器成功完成任务后删除它们。如果其中任何一个Pod失败,则Kubernetes将尝试重试该Pod,最多重试4次。在所有6个Pod成功运行并完成任务后,Job将终止。

[root@k8scloude1 jobandcronjob]# vim myjobparallelism.yaml 

[root@k8scloude1 jobandcronjob]# cat myjobparallelism.yaml 
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  name: my-job
spec:
  #如果job失败,则重试4次
  backoffLimit: 4
  #job结束需要成功运行的Pod个数,即状态为Completed的pod数
  completions: 6
  #一次性运行2个pod
  parallelism: 2
  template:
    metadata:
      creationTimestamp: null
    spec:
      terminationGracePeriodSeconds: 0
      containers:
      - command:
        - sh
        - -c
        - date ; sleep 10
        image: busybox
        imagePullPolicy: IfNotPresent
        name: my-job
        resources: {}
      #restartPolicy属性为Never,这表示如果容器退出,则Job将不会自动重启它。  
      restartPolicy: Never
status: {}

创建job

[root@k8scloude1 jobandcronjob]# kubectl apply -f myjobparallelism.yaml 
job.batch/my-job created

查看job及其pod状态,可以看到job完成了6次。

[root@k8scloude1 jobandcronjob]# kubectl get job
NAME     COMPLETIONS   DURATION   AGE
my-job   0/6           5s         5s

[root@k8scloude1 jobandcronjob]# kubectl get pod
NAME           READY   STATUS    RESTARTS   AGE
my-job-dlrls   1/1     Running   0          7s
my-job-dwf4v   1/1     Running   0          7s

[root@k8scloude1 jobandcronjob]# kubectl get pod
NAME           READY   STATUS              RESTARTS   AGE
my-job-6mtnm   1/1     Running             0          2s
my-job-dlrls   0/1     Completed           0          14s
my-job-dwf4v   0/1     Completed           0          14s
my-job-tvvz8   0/1     ContainerCreating   0          2s

[root@k8scloude1 jobandcronjob]# kubectl get pod
NAME           READY   STATUS      RESTARTS   AGE
my-job-6mtnm   1/1     Running     0          9s
my-job-dlrls   0/1     Completed   0          21s
my-job-dwf4v   0/1     Completed   0          21s
my-job-tvvz8   1/1     Running     0          9s

[root@k8scloude1 jobandcronjob]# kubectl get pod
NAME           READY   STATUS      RESTARTS   AGE
my-job-6mtnm   0/1     Completed   0          47s
my-job-dlrls   0/1     Completed   0          59s
my-job-dwf4v   0/1     Completed   0          59s
my-job-slkxf   0/1     Completed   0          35s
my-job-tvvz8   0/1     Completed   0          47s
my-job-zj5sx   0/1     Completed   0          36s

#此job完成了6次
[root@k8scloude1 jobandcronjob]# kubectl get job
NAME     COMPLETIONS   DURATION   AGE
my-job   6/6           35s        62s

删除job

[root@k8scloude1 jobandcronjob]# kubectl delete job my-job 
job.batch "my-job" deleted

五.测试job失败重试次数

backoffLimit: 4 表示如果Job失败,则重试4次。我们要测试下job失败之后,是不是真的会重试4次。

修改yaml文件,这次故意执行错误的指令,sleep改为sleepx,查看重试次数

job的重启策略:Nerver ,只要任务没有完成,就创建新pod运行,直到job完成 ,会产生多个pod; 只要job没有完成,就会重启pod,直到job完成。

[root@k8scloude1 jobandcronjob]# vim myjobparallelism.yaml 

[root@k8scloude1 jobandcronjob]# cat myjobparallelism.yaml 
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  name: my-job
spec:
  #如果job失败,则重试4次
  backoffLimit: 4
  #job结束需要成功运行的Pod个数,即状态为Completed的pod数
  completions: 6
  #一次性运行2个pod
  parallelism: 2
  template:
    metadata:
      creationTimestamp: null
    spec:
      terminationGracePeriodSeconds: 0
      containers:
      - command:
        - sh
        - -c
        - date ; sleepx 10
        image: busybox
        imagePullPolicy: IfNotPresent
        name: my-job
        resources: {}
      restartPolicy: Never
status: {}

创建job

[root@k8scloude1 jobandcronjob]# kubectl apply -f myjobparallelism.yaml 
job.batch/my-job created

查看job及其pod状态,可以看到生成了6个失败pod。

[root@k8scloude1 jobandcronjob]# kubectl get job
NAME     COMPLETIONS   DURATION   AGE
my-job   0/6           7s         7s

[root@k8scloude1 jobandcronjob]# kubectl get pod
NAME           READY   STATUS   RESTARTS   AGE
my-job-52xln   0/1     Error    0          11s
my-job-kn6d5   0/1     Error    0          10s
my-job-plh2s   0/1     Error    0          11s
my-job-rrvwz   0/1     Error    0          9s

[root@k8scloude1 jobandcronjob]# kubectl get pod
NAME           READY   STATUS   RESTARTS   AGE
my-job-52xln   0/1     Error    0          27s
my-job-9hqwn   0/1     Error    0          16s
my-job-kn6d5   0/1     Error    0          26s
my-job-n2rhz   0/1     Error    0          16s
my-job-plh2s   0/1     Error    0          27s
my-job-rrvwz   0/1     Error    0          25s

[root@k8scloude1 jobandcronjob]# kubectl get job
NAME     COMPLETIONS   DURATION   AGE
my-job   0/6           31s        31s

使用kubectl describe job查看job的描述信息,发现job已经达到重试次数“Warning BackoffLimitExceeded 84s job-controller Job has reached the specified backoff limit”。

注意backoffLimit重试的次数并不准,backoffLimit重试次数为4次,每次并行为2,所以应该是8次,但是只创建了6个pod就reached the specified backoff limit

[root@k8scloude1 jobandcronjob]# kubectl describe job my-job 
Name:           my-job
Namespace:      job
Selector:       controller-uid=cbd4c4b9-d31d-420a-a3d7-c3e1680de96c
Labels:         controller-uid=cbd4c4b9-d31d-420a-a3d7-c3e1680de96c
                job-name=my-job
Annotations:    <none>
Parallelism:    2
......
Events:
  Type     Reason                Age    From            Message
  ----     ------                ----   ----            -------
  Normal   SuccessfulCreate      2m15s  job-controller  Created pod: my-job-plh2s
  Normal   SuccessfulCreate      2m15s  job-controller  Created pod: my-job-52xln
  Normal   SuccessfulCreate      2m14s  job-controller  Created pod: my-job-kn6d5
  Normal   SuccessfulCreate      2m13s  job-controller  Created pod: my-job-rrvwz
  Normal   SuccessfulCreate      2m4s   job-controller  Created pod: my-job-9hqwn
  Normal   SuccessfulCreate      2m4s   job-controller  Created pod: my-job-n2rhz
  Warning  BackoffLimitExceeded  84s    job-controller  Job has reached the specified backoff limit

删除job

[root@k8scloude1 jobandcronjob]# kubectl delete job my-job 
job.batch "my-job" deleted

[root@k8scloude1 jobandcronjob]# kubectl get job
No resources found in job namespace.

六.job任务使用示例:计算圆周率

下面使用job任务计算圆周率。

本次用到perl语言,使用perl计算圆周率2000位。先提前拉取perl镜像。

先在worker节点下载perl镜像。

[root@k8scloude2 ~]# docker pull hub.c.163.com/library/perl:latest

[root@k8scloude3 ~]# docker pull hub.c.163.com/library/perl:latest

创建job任务,任务为perl -Mbignum=bpi -wle 'print bpi(2000)使用perl计算圆周率2000位

[root@k8scloude1 jobandcronjob]# cat myjobPI.yaml 
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  name: my-job
spec:
  template:
    metadata:
      creationTimestamp: null
    spec:
      terminationGracePeriodSeconds: 0
      #"perl -Mbignum=bpi -wle 'print bpi(2000)'",表示容器将输出2000位精度的圆周率(π)值。
      containers:
      - command:
        - sh
        - -c
        - perl -Mbignum=bpi -wle 'print bpi(2000)'
        image: hub.c.163.com/library/perl
        imagePullPolicy: IfNotPresent
        name: my-job
        resources: {}
      restartPolicy: Never
status: {}

创建job

[root@k8scloude1 jobandcronjob]# kubectl apply -f myjobPI.yaml 
job.batch/my-job created

查看job及其pod状态

[root@k8scloude1 jobandcronjob]# kubectl get job
NAME     COMPLETIONS   DURATION   AGE
my-job   0/1           4s         4s

[root@k8scloude1 jobandcronjob]# kubectl get pod
NAME           READY   STATUS      RESTARTS   AGE
my-job-m89bb   0/1     Completed   0          8s

[root@k8scloude1 jobandcronjob]# kubectl get job
NAME     COMPLETIONS   DURATION   AGE
my-job   1/1           7s         12s

查看pod日志,可以看到PI的值,显示了PI 2000位的数值。

[root@k8scloude1 jobandcronjob]# kubectl logs my-job-m89bb
3.1415926535897932384626433832795028841971693993751058209749445923078164062862089986280348253421170679821480865132823066470938446095505822317253594081284811174502841027019385211055596446229489549303819644288109756659334461284756482337867831652712019091456485669234603486104543266482133936072602491412737245870066063155881748815209209628292540917153643678925903600113305305488204665213841469519415116094330572703657595919530921861173819326117931051185480744623799627495673518857527248912279381830119491298336733624406566430860213949463952247371907021798609437027705392171762931767523846748184676694051320005681271452635608277857713427577896091736371787214684409012249534301465495853710507922796892589235420199561121290219608640344181598136297747713099605187072113499999983729780499510597317328160963185950244594553469083026425223082533446850352619311881710100031378387528865875332083814206171776691473035982534904287554687311595628638823537875937519577818577805321712268066130019278766111959092164201989380952572010654858632788659361533818279682303019520353018529689957736225994138912497217752834791315155748572424541506959508295331168617278558890750983817546374649393192550604009277016711390098488240128583616035637076601047101819429555961989467678374494482553797747268471040475346462080466842590694912933136770289891521047521620569660240580381501935112533824300355876402474964732639141992726042699227967823547816360093417216412199245863150302861829745557067498385054945885869269956909272107975093029553211653449872027559602364806654991198818347977535663698074265425278625518184175746728909777727938000816470600161452491921732172147723501414419735685481613611573525521334757418494684385233239073941433345477624168625189835694855620992192221842725502542568876717904946016534668049886272327917860857843838279679766814541009538837863609506800642251252051173929848960841284886269456042419652850222106611863067442786220391949450471237137869609563643719172874677646575739624138908658326459958133904780275898

删除job

[root@k8scloude1 jobandcronjob]# kubectl delete job my-job 
job.batch "my-job" deleted

[root@k8scloude1 jobandcronjob]# kubectl get job
No resources found in job namespace.

PS:在某些情况下,我们可能希望将超时限制放在任务上,以避免长时间运行或卡死的情况。为此,Job支持一个activeDeadlineSeconds属性,该属性指定了任务的超时时间。如果任务超时,则Kubernetes将自动终止任务

在本文中,我们介绍了Kubernetes Job的基本概念,使用方法,应用实例,包括如何创建和管理一次性任务,需要多次执行的job任务,使用job计算圆周率等。

通过使用Kubernetes Job,您可以轻松地管理一次性任务,从而使集群更加灵活、可靠和高效。


About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK