High-version client-go reporting "invalid bearer token" with serviceAccountToken?



2024/02/06

On Kubernetes 1.27.4, both CoreDNS and flannel report 401 errors, and kube-apiserver keeps logging "Unable to authenticate the request" err="[invalid bearer token, service account token is not valid yet]"

Today a colleague said his pods could no longer operate on cluster resources. On inspection, both flannel and CoreDNS were reporting 401:

[ERROR] plugin/kubernetes: pkg/mod/k8s.io/client-go@<version>/tools/cache/reflector.go:169: Failed to watch *v1.Service: failed to list *v1.Service: Unauthorized

kube-apiserver keeps logging:

authentication.go:70] "Unable to authenticate the request" err="[invalid bearer token, service account token is not valid yet]"

A quick search turned this up: to improve security, the Kubernetes community enabled the BoundServiceAccountTokenVolume feature by default in 1.21. It gives the ServiceAccount token an expiration time, an automatic refresh mechanism, and invalidates the token once its Pod is deleted. With client-go >= v11.0.0 or >= v0.15.0, kubelet refreshes the token periodically and the client reloads it from the mounted volume, so the token keeps rotating. For compatibility, an expired token can still be used for one year; beyond that, if the client-go version is lower than v11.0.0 or v0.15.0, the token is never reloaded and it eventually becomes invalid.
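For context, pods like CoreDNS and flannel talk to the apiserver through client-go's in-cluster config, which loads the token from the projected volume and (on recent versions) re-reads the file so rotated tokens are picked up. A minimal sketch of such a client (illustrative, not taken from the article):

package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// InClusterConfig loads the CA and token from
	// /var/run/secrets/kubernetes.io/serviceaccount/ mounted by kubelet.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatalf("Failed to load in-cluster config: %v", err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatalf("Failed to create clientset: %v", err)
	}
	// A list call like the one CoreDNS makes; with a rejected token this
	// returns the same Unauthorized (401) error seen above.
	svcs, err := cs.CoreV1().Services("kube-system").List(context.Background(), metav1.ListOptions{})
	if err != nil {
		log.Fatalf("Failed to list services: %v", err)
	}
	fmt.Printf("got %d services\n", len(svcs.Items))
}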

But our cluster is on 1.27, the client-go versions built into CoreDNS and flannel are both recent, and none of our other cluster environments showed this.

The colleague said that after he restarted one CoreDNS pod, that pod stopped getting 401. Skimming the ServiceAccountToken docs: the token is simply a projected volume that kubelet mounts into the pod:

...
volumes:
- name: token-vol
  projected:
    sources:
    - serviceAccountToken:
        audience: api
        expirationSeconds: 3600
        path: token

Checking the kubelet logs on that node, it keeps printing:

cri_stats_provider.go:757] "Failed updating cpu usage nano core" err="zero or negative interval (1707196606940341356 - 1711900953406532664)"
cri_stats_provider.go:757] "Failed updating cpu usage nano core" err="zero or negative interval (1707196606941261574 - 1711900953467321378)"
cri_stats_provider.go:757] "Failed updating cpu usage nano core" err="zero or negative interval (1707196606942318471 - 1711900953423406042)"
cri_stats_provider.go:757] "Failed updating cpu usage nano core" err="zero or negative interval (1707196606943293769 - 1711900953450007891)"
cri_stats_provider.go:757] "Failed updating cpu usage nano core" err="zero or negative interval (1707196606944280584 - 1711900953461851576)"
cri_stats_provider.go:757] "Failed updating cpu usage nano core" err="zero or negative interval (1707196606945180585 - 1711900953427074341)"
cri_stats_provider.go:757] "Failed updating cpu usage nano core" err="zero or negative interval (1707196606946762456 - 1711900953414233495)"
cri_stats_provider.go:757] "Failed updating cpu usage nano core" err="zero or negative interval (1707196606947733070 - 1711900953456650293)"
cri_stats_provider.go:757] "Failed updating cpu usage nano core" err="zero or negative interval (1707196606948747910 - 1711900953430313306)"
cri_stats_provider.go:757] "Failed updating cpu usage nano core" err="zero or negative interval (1707196606949808485 - 1711900953436742140)"
cri_stats_provider.go:757] "Failed updating cpu usage nano core" err="zero or negative interval (1707196606951948415 - 1711900953479775885)"
cri_stats_provider.go:757] "Failed updating cpu usage nano core" err="zero or negative interval (1707196606952993367 - 1711900953458452657)"

Searching the Kubernetes source for zero or negative interval leads to this function:

// https://github.com/kubernetes/kubernetes/blob/v1.27.4/pkg/kubelet/stats/cri_stats_provider.go#L715-L760
func (p *criStatsProvider) getAndUpdateContainerUsageNanoCores(stats *runtimeapi.ContainerStats) *uint64 {
	if stats == nil || stats.Attributes == nil || stats.Cpu == nil {
		return nil
	}
	// Bypass the cache if the CRI implementation specified the UsageNanoCores.
	if stats.Cpu.UsageNanoCores != nil {
		return &stats.Cpu.UsageNanoCores.Value
	}
	// If there is no UsageNanoCores, nor UsageCoreNanoSeconds, there is no information to use
	if stats.Cpu.UsageCoreNanoSeconds == nil {
		return nil
	}
	id := stats.Attributes.Id
	usage, err := func() (*uint64, error) {
		p.mutex.Lock()
		defer p.mutex.Unlock()

		cached, ok := p.cpuUsageCache[id]
		if !ok || cached.stats.UsageCoreNanoSeconds == nil || stats.Cpu.UsageCoreNanoSeconds.Value < cached.stats.UsageCoreNanoSeconds.Value {
			// Cannot compute the usage now, but update the cached stats anyway
			p.cpuUsageCache[id] = &cpuUsageRecord{stats: stats.Cpu, usageNanoCores: nil}
			return nil, nil
		}

		newStats := stats.Cpu
		cachedStats := cached.stats
		nanoSeconds := newStats.Timestamp - cachedStats.Timestamp
		if nanoSeconds <= 0 {
			return nil, fmt.Errorf("zero or negative interval (%v - %v)", newStats.Timestamp, cachedStats.Timestamp)
		}
		usageNanoCores := uint64(float64(newStats.UsageCoreNanoSeconds.Value-cachedStats.UsageCoreNanoSeconds.Value) /
			float64(nanoSeconds) * float64(time.Second/time.Nanosecond))

		// Update cache with new value.
		usageToUpdate := usageNanoCores
		p.cpuUsageCache[id] = &cpuUsageRecord{stats: newStats, usageNanoCores: &usageToUpdate}

		return &usageNanoCores, nil
	}()

	if err != nil {
		// This should not happen. Log now to raise visibility
		klog.ErrorS(err, "Failed updating cpu usage nano core")
	}
	return usage
}

The logic, roughly: kubelet calls the CRI gRPC method ContainerStats and gets the container's cpu.Timestamp and UsageCoreNanoSeconds back. On the first call, cpuUsageCache[id] is empty, so the sample is only cached; on the next call the new sample is compared with the cached one and nanoSeconds = new timestamp - cached timestamp is computed. The error in the log means execution reached the nanoSeconds <= 0 branch.
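To make the rate computation concrete, here is a minimal standalone sketch with made-up sample values (not taken from the node above): two UsageCoreNanoSeconds readings ten seconds apart, fed through the same formula. If the wall clock jumps backwards between the two samples, nanoSeconds goes non-positive and you get exactly this error.

package main

import (
	"fmt"
	"time"
)

func main() {
	// Hypothetical cached sample and new sample (both timestamps in nanoseconds).
	cachedTS := int64(1_707_196_606_000_000_000)
	cachedUsage := uint64(2_117_105_450_225)
	newTS := cachedTS + int64(10*time.Second) // sampled 10s later
	newUsage := cachedUsage + 5_000_000_000   // 5s of CPU time consumed in between

	nanoSeconds := newTS - cachedTS // goes <= 0 if the clock jumped backwards
	if nanoSeconds <= 0 {
		fmt.Printf("zero or negative interval (%v - %v)\n", newTS, cachedTS)
		return
	}
	usageNanoCores := uint64(float64(newUsage-cachedUsage) /
		float64(nanoSeconds) * float64(time.Second/time.Nanosecond))
	fmt.Println(usageNanoCores) // 500000000 nano-cores, i.e. ~0.5 core on average
}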

We run cri-dockerd + docker, so let's write a small gRPC client that calls ContainerStats directly:

package main

import (
	"context"
	"encoding/json"
	"fmt"
	"log"
	"os"

	"google.golang.org/grpc"
	runtimeapi "k8s.io/cri-api/pkg/apis/runtime/v1"
)

func main() {
	// Create the gRPC connection to cri-dockerd
	conn, err := grpc.Dial("unix:///var/run/cri-dockerd.sock", grpc.WithInsecure())
	if err != nil {
		log.Fatalf("Failed to connect: %v", err)
	}
	defer conn.Close()

	// Create the CRI runtime client
	criClient := runtimeapi.NewRuntimeServiceClient(conn)

	// Call ContainerStats for the container ID passed on the command line
	req := &runtimeapi.ContainerStatsRequest{
		ContainerId: os.Args[1],
		// Verbose: true,
	}

	resp, err := criClient.ContainerStats(context.Background(), req)
	if err != nil {
		log.Fatalf("Failed to call ContainerStats: %v", err)
	}

	jsonData, err := json.MarshalIndent(resp, "", " ")
	if err != nil {
		log.Fatalf("Failed to marshal container stats: %v", err)
	}

	fmt.Println(string(jsonData))
}

After compiling, copy the binary to the affected node, pick a CoreDNS container ID, and run it:

$ ./stats-test f01c
{
  "stats": {
    ...
    "cpu": {
      "timestamp": 1707204235253509570,
      "usage_core_nano_seconds": {
        "value": 2117105450225
      }
    },
    ...

So the cpu.Timestamp returned by the CRI looks normal; that means the previously cached value must be larger than every fresh cpu.Timestamp. Converting the second number in the log, 1711900953458452657, to a date gives:

Mon Apr 01 2024 00:02:33 GMT+0800
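(A quick way to reproduce the conversion: the kubelet timestamps are in nanoseconds, and Go's time.Unix accepts seconds and nanoseconds separately, so the whole value can go in the second argument; the printed timezone follows the local clock.)

package main

import (
	"fmt"
	"time"
)

func main() {
	// Convert the cached kubelet timestamp (nanoseconds) to a wall-clock time.
	fmt.Println(time.Unix(0, 1711900953458452657))
	// On a node in the +0800 timezone this prints 2024-04-01 00:02:33.458452657 +0800 CST
}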

That's April, well in the future. The system log confirms the clock had indeed been wrong at some point:

Apr  1 00:02:21 redhat79 etcd: {"level":"warn","ts":"2024-04-01T00:02:21.54608+0800","caller":"rafthttp/probing_status.go:82","msg":"prober found high clock drift","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"1354c3808ef4eeb9","clock-drift":"38.480218393s","rtt":"6.133426ms"}
Apr 1 00:02:25 redhat79 kubelet: I0401 00:02:25.365462 3553 kubelet_pods.go:894] "Unable to retrieve pull secret, the image pull may not succeed." pod="default123/xxxx-browsersvc-100-55dc46dd54-2vlbf" secret="" err="secret \"regcred\" not found"
Apr 1 00:02:26 redhat79 kube-apiserver: E0401 00:02:26.604914 1390 controller.go:193] "Failed to update lease" err="Operation cannot be fulfilled on leases.coordination.k8s.io \"apiserver-s7gqnvvc7gfrpthq3uyibizgba\": the object has been modified; please apply your changes to the latest version and try again"
Feb 4 23:44:00 redhat79 systemd: Time has been changed
Feb 4 23:44:00 redhat79 chronyd[917]: Backward time jump detected!

Then looked at the CoreDNS pod's token on disk:

$ docker inspect f01 | grep :/var/run
"/data/kube/kubelet/pods/d8abc4c1-48f9-4a2f-9ce7-a649126bb43c/volumes/kubernetes.io~projected/kube-api-access-mzqjx:/var/run/secrets/kubernetes.io/serviceaccount:ro",
$ ls -l /data/kube/kubelet/pods/d8abc4c1-48f9-4a2f-9ce7-a649126bb43c/volumes/kubernetes.io~projected/kube-api-access-mzqjx
total 0
lrwxrwxrwx 1 root root 13 Feb 1 18:16 ca.crt -> ..data/ca.crt
lrwxrwxrwx 1 root root 16 Feb 1 18:16 namespace -> ..data/namespace
lrwxrwxrwx 1 root root 12 Feb 1 18:16 token -> ..data/token
$ readlink -f /data/kube/kubelet/pods/d8abc4c1-48f9-4a2f-9ce7-a649126bb43c/volumes/kubernetes.io~projected/kube-api-access-mzqjx/token
/data/kube/kubelet/pods/d8abc4c1-48f9-4a2f-9ce7-a649126bb43c/volumes/kubernetes.io~projected/kube-api-access-mzqjx/..2024_03_31_15_57_54.3867697203/token
$ awk -F "." '{print $2}' /data/kube/kubelet/pods/d8abc4c1-48f9-4a2f-9ce7-a649126bb43c/volumes/kubernetes.io~projected/kube-api-access-mzqjx/token | base64 -d
{"aud":["https://kubernetes.default.svc"],"exp":1743436695,"iat":1711900695,"iss":"https://kubernetes.default.svc","kubernetes.io":{"namespace":"kube-system","pod":{"name":"coredns-f7bf7f79b-2brfr","uid":"d8abc4c1-48f9-4a2f-9ce7-a649126bb43c"},"serviceaccount":{"name":"coredns","uid":"b6255167-f016-48be-a118-9e6a9dcf3d7a"},"warnafter":1711904302},"nbf":1711900695,"sub":"system:serviceaccount:kube-system:coredns"}base64: invalid input

The second dot-separated segment of the token file is the JWT payload; its claims are:

  • iss (issuer): who issued the token
  • sub (subject): the subject the token represents (usually the ServiceAccount)
  • aud (audience): the intended recipients of the token
  • exp (expiration time): when the token expires
  • nbf (not before): the time before which the token must not be accepted
  • iat (issued at): when the token was issued
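As a cross-check for the awk/base64 command above, a small Go sketch (illustrative, not from the article) that decodes the payload with the unpadded base64url alphabet JWTs use and compares nbf against the local clock, which is essentially the check behind "service account token is not valid yet":

package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"log"
	"os"
	"strings"
	"time"
)

func main() {
	raw, err := os.ReadFile(os.Args[1]) // path to the mounted token file
	if err != nil {
		log.Fatal(err)
	}
	parts := strings.Split(strings.TrimSpace(string(raw)), ".")
	if len(parts) != 3 {
		log.Fatal("not a JWT")
	}
	// JWT segments are base64url without padding, hence the "base64: invalid input" above.
	payload, err := base64.RawURLEncoding.DecodeString(parts[1])
	if err != nil {
		log.Fatal(err)
	}
	var claims struct {
		Exp int64 `json:"exp"`
		Nbf int64 `json:"nbf"`
		Iat int64 `json:"iat"`
	}
	if err := json.Unmarshal(payload, &claims); err != nil {
		log.Fatal(err)
	}
	fmt.Println("nbf:", time.Unix(claims.Nbf, 0), "now:", time.Now())
	if time.Now().Unix() < claims.Nbf {
		fmt.Println("token not valid yet") // matches the apiserver error
	}
}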

nbf 1711900695 converts to Sun Mar 31 2024 23:58:15 GMT+0800, a point in the future: kubelet requested the token while the node's clock was running ahead, and after the clock jumped back its nbf has not been reached yet, so the apiserver rejects CoreDNS with 401. There are two fixes:

  • Restart the affected pods/containers
  • Restart kubelet

If CSI is not in use, i.e. kubelet does not have to call mount again, restarting kubelet works; in some cases restarting the containers is still the safer choice.

