
Building a Cloud-Native Distributed Logging System with Loki

source link: https://zhuanlan.zhihu.com/p/264443818

Introduction

Grafana Loki is a horizontally scalable, highly available, multi-tenant log aggregation system covering log collection, storage, visualization, and alerting.

Unlike other logging systems, Loki is built around the idea of indexing only labels for the logs, leaving the original log messages unindexed. This makes Loki cheaper to operate and more efficient.

Loki is particularly well suited to storing Kubernetes Pod logs: metadata such as Pod labels is automatically scraped and indexed.
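For instance, a log line can be pushed by hand through Loki's HTTP push API (`/loki/api/v1/push`): only the label pairs under `stream` are indexed, and the line itself is stored verbatim. A sketch; the commented-out distributor hostname is a placeholder for this deployment's Service:

```shell
# Build a push payload: one stream identified by its label set, carrying
# one log line with a nanosecond-precision timestamp.
TS="$(date +%s)000000000"
PAYLOAD='{"streams":[{"stream":{"namespace":"loki","app":"demo"},"values":[["'"$TS"'","hello from curl"]]}]}'
echo "$PAYLOAD"
# Send it to the distributor (placeholder in-cluster host):
# curl -s -H "Content-Type: application/json" -X POST \
#   "http://distributor.loki.svc.cluster.local:3100/loki/api/v1/push" \
#   --data "$PAYLOAD"
```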


Comparison with EFK

The EFK stack (Elasticsearch, Fluentd, Kibana) is used to collect, visualize, and query logs from a variety of sources. In Elasticsearch, data is stored on disk as unstructured JSON objects. Both the keys of each object and the contents of each key are indexed. Data can then be queried using JSON objects or the Lucene query language.

By contrast, Loki can store data on local disk in single-binary mode, while in the horizontally scalable mode data lives in a cloud storage system such as S3, GCS, or Cassandra. Logs are stored as plain text together with a set of label names and values, and only the label pairs are indexed. This trade-off makes Loki cheaper to operate than a full index and lets developers log aggressively from their applications. Logs in Loki are queried with LogQL. The price of this design, however, is that a LogQL query that filters on content (the text of the log lines) must load every chunk in the search window that matches the labels defined in the query.
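To make that trade-off concrete, here is the shape of such a query, sketched as a shell snippet (the query-frontend hostname is a placeholder): the label matchers are resolved through the index, while the `|=` content filter has to scan every matching chunk in the time window:

```shell
# Label matchers (cheap, indexed) plus a content filter (chunk scan):
QUERY='{namespace="loki",app="distributor"} |= "level=error"'
echo "LogQL: $QUERY"
# Run it against Loki's range-query endpoint (placeholder host):
# curl -s -G "http://query-frontend.loki.svc.cluster.local:3100/loki/api/v1/query_range" \
#   --data-urlencode "query=$QUERY" --data-urlencode "limit=100"
```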

Moreover, metrics and alerts can only surface problems we have anticipated; unknown problems still have to be dug out of the logs. Keeping logs and metrics in two separate systems makes troubleshooting harder. We needed our logging and metrics systems to be connected, and Loki, which takes its inspiration from Prometheus, solves exactly this problem.

Loki Architecture

At a high level, Loki is composed of several cooperating services.


Let's walk through the core components:

  • Distributor -- The distributor service handles incoming streams written by clients. It is the first stop on the write path for log data. Once the distributor receives a set of streams, each stream is validated for correctness and checked against the configured per-tenant (or global) limits. Valid streams are then split into batches and sent to multiple ingesters in parallel.

The distributor uses consistent hashing together with a configurable replication factor to determine which instances of the ingester service should receive a given stream. A stream is a set of logs associated with a tenant and a unique label set. The stream is hashed using both the tenant ID and the label set, and the hash is used to look up which instances to send it to.
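A toy sketch of that idea, hashing a tenant-plus-labels key onto one of three ingesters. This is illustration only: the real distributor hashes onto ring tokens rather than taking a modulo, and replicates each stream to `replication_factor` successive ingesters:

```shell
# Hypothetical stream key: tenant ID plus the stream's label set.
KEY='fake-tenant/{app="demo",namespace="loki"}'
# cksum gives a stable CRC we can use as a stand-in hash function.
HASH=$(printf '%s' "$KEY" | cksum | cut -d' ' -f1)
INGESTER=$((HASH % 3))
echo "stream $KEY -> ingester-$INGESTER"
```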

  • Ingester -- The ingester service is responsible for writing log data to the long-term storage backends (DynamoDB, S3, Cassandra, and so on) on the write path, and for returning log data for in-memory queries on the read path.

The ingester contains a lifecycler, which manages the ingester's lifecycle in the hash ring. Each ingester is in one of the following states: PENDING, JOINING, ACTIVE, LEAVING, or UNHEALTHY.

  • Query frontend -- The query frontend is an optional service that exposes the querier's API endpoints and can be used to accelerate the read path. When a query frontend is in place, incoming query requests should be directed to it instead of to the queriers. The cluster still needs querier services to execute the actual queries.
    Internally, the query frontend performs some query adjustments and holds queries in an internal queue. In this setup, queriers act as workers that pull jobs from the queue, execute them, and return them to the query frontend for aggregation. Queriers need to be configured with the query frontend's address (via the -querier.frontend-address CLI flag) to be able to connect to it.
    The query frontend is stateless. However, because of how the internal queue works, it is recommended to run a few query frontend replicas to reap the full benefit of fair scheduling. Two replicas suffice in most cases.
  • Querier -- The querier handles queries written in the LogQL query language, fetching logs from both the ingesters and long-term storage.
    The querier queries all in-memory data on the ingesters first and then falls back to running the same query against the backend store. Because of the replication factor, the querier may receive duplicate data. To resolve this, it internally deduplicates entries that share the same nanosecond timestamp, label set, and log message.
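Conceptually, the dedup step just merges the entries returned by every replica and keeps one copy of each (timestamp, label set, line) triple, which `sort -u` mimics here:

```shell
# Two replicas returned the same entry for timestamp ...001; after
# dedup only one copy of it survives, plus the distinct ...002 entry.
DEDUPED=$(printf '%s\n' \
  '1609459200000000001 {app="demo"} hello' \
  '1609459200000000001 {app="demo"} hello' \
  '1609459200000000002 {app="demo"} world' | sort -u)
echo "$DEDUPED"
```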
  • Chunk Store -- The chunk store is Loki's long-term data store, designed to support interactive queries and sustained writes without background maintenance tasks. It consists of:

    An index for the chunks, which can be backed by stores such as Cassandra, Bigtable, or DynamoDB.

    A key-value (KV) store for the chunk data itself, which can be DynamoDB, Bigtable, Cassandra, S3, or a local filesystem.

On top of these there is also the ruler component for log-based alerting, and the table-manager, which creates periodic tables ahead of their time range and deletes them once their data falls outside the retention period.

Deployment

Loki's microservices deployment mode involves quite a few components. We deploy it with Kubernetes in production; sensitive values have of course been stripped from the manifests below.

Chunk storage uses S3, and the index is stored in Cassandra.

1: Create an S3 bucket, then add its access key and secret key to the configuration file below. Setting up the Cassandra cluster is not covered here.

2: Deploy Loki's configuration file:

apiVersion: v1
data:
  config.yaml: |
    chunk_store_config:
        chunk_cache_config:
            memcached:
                batch_size: 100
                parallelism: 100
            memcached_client:
                consistent_hash: true
                host: memcached.loki.svc.cluster.local
                service: memcached-client
        max_look_back_period: 0
        write_dedupe_cache_config:
            memcached:
                batch_size: 100
                parallelism: 100
            memcached_client:
                consistent_hash: true
                host: memcached-index-writes.loki.svc.cluster.local
                service: memcached-client
    auth_enabled: false
    distributor:
        ring:
            kvstore:
                store: memberlist
    frontend:
        compress_responses: true
        log_queries_longer_than: 5s
        max_outstanding_per_tenant: 200
    frontend_worker:
        frontend_address: query-frontend.loki.svc.cluster.local:9095
        grpc_client_config:
            max_send_msg_size: 1.048576e+08
        parallelism: 2
    ingester:
        chunk_block_size: 262144
        chunk_idle_period: 15m
        lifecycler:
            heartbeat_period: 5s
            interface_names:
              - eth0
            join_after: 30s
            num_tokens: 512
            ring:
                kvstore:
                    store: memberlist
                replication_factor: 3
        max_transfer_retries: 0
    ingester_client:
        grpc_client_config:
            max_recv_msg_size: 6.7108864e+07
        remote_timeout: 1s
    limits_config:
        enforce_metric_name: false
        ingestion_burst_size_mb: 20
        ingestion_rate_mb: 10
        ingestion_rate_strategy: global
        max_cache_freshness_per_query: 10m
        max_global_streams_per_user: 10000
        max_query_length: 12000h
        max_query_parallelism: 16
        max_streams_per_user: 0
        reject_old_samples: true
        reject_old_samples_max_age: 168h
    querier:
        query_ingesters_within: 2h
    query_range:
        align_queries_with_step: true
        cache_results: true
        max_retries: 5
        results_cache:
            cache:
                memcached_client:
                    consistent_hash: true
                    host: memcached-frontend.loki.svc.cluster.local
                    max_idle_conns: 16
                    service: memcached-client
                    timeout: 500ms
                    update_interval: 1m
        split_queries_by_interval: 30m
    ruler: {}
    schema_config:
        configs:
          - from: "2020-05-15"
            index:
                period: 168h
                prefix: cassandra_table
            object_store: s3
            schema: v11
            store: cassandra
    server:
        graceful_shutdown_timeout: 5s
        grpc_server_max_concurrent_streams: 1000
        grpc_server_max_recv_msg_size: 1.048576e+08
        grpc_server_max_send_msg_size: 1.048576e+08
        http_listen_port: 3100
        http_server_idle_timeout: 120s
        http_server_write_timeout: 1m
    storage_config:
        cassandra:
            username: loki-superuser
            password: xxx
            addresses: loki-dc1-all-pods-service.cass-operator.svc.cluster.local
            auth: true
            keyspace: lokiindex
        aws:
            bucketnames: xx
            endpoint: s3.amazonaws.com
            region: ap-southeast-1
            access_key_id: xx
            secret_access_key: xx
            s3forcepathstyle: false
        index_queries_cache_config:
            memcached:
                batch_size: 100
                parallelism: 100
            memcached_client:
                consistent_hash: true
                host: memcached-index-queries.loki.svc.cluster.local
                service: memcached-client
    memberlist:
        abort_if_cluster_join_fails: false
        bind_port: 7946
        join_members:
        - loki-gossip-ring.loki.svc.cluster.local:7946
        max_join_backoff: 1m
        max_join_retries: 10
        min_join_backoff: 1s
    table_manager:
        creation_grace_period: 3h
        poll_interval: 10m
        retention_deletes_enabled: true
        retention_period: 168h
kind: ConfigMap
metadata:
  name: loki
  namespace: loki
---
apiVersion: v1
data:
  overrides.yaml: |
    overrides: {}
kind: ConfigMap
metadata:
  name: overrides
  namespace: loki

3: Deploy the four memcached clusters Loki depends on.

memcached-frontend.yaml is as follows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: memcached-frontend
  name: memcached-frontend
  namespace: loki
spec:
  replicas: 3
  selector:
    matchLabels:
      app: memcached-frontend
  serviceName: memcached-frontend
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9150"   
      labels:
        app: memcached-frontend
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: memcached-frontend
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -m 1024
        - -I 5m
        - -c 1024
        - -v
        image: memcached:1.5.17-alpine
        imagePullPolicy: IfNotPresent
        name: memcached
        ports:
        - containerPort: 11211
          name: client
        resources:
          limits:
            cpu: "3"
            memory: 1536Mi
          requests:
            cpu: 500m
            memory: 1329Mi
      - args:
        - --memcached.address=localhost:11211
        - --web.listen-address=0.0.0.0:9150
        image: prom/memcached-exporter:v0.6.0
        imagePullPolicy: IfNotPresent
        name: exporter
        ports:
        - containerPort: 9150
          name: http-metrics
  updateStrategy:
    type: RollingUpdate

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: memcached-frontend
  name: memcached-frontend
  namespace: loki
spec:
  ports:
  - name: memcached-client
    port: 11211
    targetPort: 11211
  selector:
    app: memcached-frontend

memcached-index-queries.yaml is as follows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: memcached-index-queries
  name: memcached-index-queries
  namespace: loki
spec:
  replicas: 3
  selector:
    matchLabels:
      app: memcached-index-queries
  serviceName: memcached-index-queries
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9150"   
      labels:
        app: memcached-index-queries
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: memcached-index-queries
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -m 1024
        - -I 5m
        - -c 1024
        - -v
        image: memcached:1.5.17-alpine
        imagePullPolicy: IfNotPresent
        name: memcached
        ports:
        - containerPort: 11211
          name: client
        resources:
          limits:
            cpu: "3"
            memory: 1536Mi
          requests:
            cpu: 500m
            memory: 1329Mi
      - args:
        - --memcached.address=localhost:11211
        - --web.listen-address=0.0.0.0:9150
        image: prom/memcached-exporter:v0.6.0
        imagePullPolicy: IfNotPresent
        name: exporter
        ports:
        - containerPort: 9150
          name: http-metrics
  updateStrategy:
    type: RollingUpdate

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: memcached-index-queries
  name: memcached-index-queries
  namespace: loki
spec:
  clusterIP: None
  ports:
  - name: memcached-client
    port: 11211
    targetPort: 11211
  selector:
    app: memcached-index-queries

memcached-index-writes.yaml is as follows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: memcached-index-writes
  name: memcached-index-writes
  namespace: loki
spec:
  replicas: 3
  selector:
    matchLabels:
      app: memcached-index-writes
  serviceName: memcached-index-writes
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9150"   
      labels:
        app: memcached-index-writes
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: memcached-index-writes
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -m 1024
        - -I 1m
        - -c 1024
        - -v
        image: memcached:1.5.17-alpine
        imagePullPolicy: IfNotPresent
        name: memcached
        ports:
        - containerPort: 11211
          name: client
        resources:
          limits:
            cpu: "3"
            memory: 1536Mi
          requests:
            cpu: 500m
            memory: 1329Mi
      - args:
        - --memcached.address=localhost:11211
        - --web.listen-address=0.0.0.0:9150
        image: prom/memcached-exporter:v0.6.0
        imagePullPolicy: IfNotPresent
        name: exporter
        ports:
        - containerPort: 9150
          name: http-metrics
  updateStrategy:
    type: RollingUpdate

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: memcached-index-writes
  name: memcached-index-writes
  namespace: loki
spec:
  clusterIP: None
  ports:
  - name: memcached-client
    port: 11211
    targetPort: 11211
  selector:
    app: memcached-index-writes

memcached.yaml is as follows:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: memcached
  name: memcached
  namespace: loki
spec:
  replicas: 3
  selector:
    matchLabels:
      app: memcached
  serviceName: memcached
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9150"   
      labels:
        app: memcached
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: memcached
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -m 4096
        - -I 2m
        - -c 1024
        - -v
        image: memcached:1.5.17-alpine
        imagePullPolicy: IfNotPresent
        name: memcached
        ports:
        - containerPort: 11211
          name: client
        resources:
          limits:
            cpu: "3"
            memory: 6Gi
          requests:
            cpu: 500m
            memory: 5016Mi
      - args:
        - --memcached.address=localhost:11211
        - --web.listen-address=0.0.0.0:9150
        image: prom/memcached-exporter:v0.6.0
        imagePullPolicy: IfNotPresent
        name: exporter
        ports:
        - containerPort: 9150
          name: http-metrics
  updateStrategy:
    type: RollingUpdate

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: memcached
  name: memcached
  namespace: loki
spec:
  clusterIP: None
  ports:
  - name: memcached-client
    port: 11211
    targetPort: 11211
  selector:
    app: memcached

The role of each of these four memcached clusters can be worked out by cross-referencing the Loki configuration above with the architecture: memcached caches chunks, memcached-frontend caches query results, memcached-index-queries caches index reads, and memcached-index-writes deduplicates index writes.

4: In Loki, the ring is a space divided into smaller segments by tokens. Each segment belongs to a single ingester and is used to shard series/logs across multiple ingesters. Besides its tokens, each instance also carries its own ID, address, and a periodically updated last-heartbeat timestamp. This lets the other components (distributors and queriers) discover which ingesters are available and healthy.

Consul, etcd, and memberlist (gossip) are all supported as ring implementations. To cut down on unnecessary dependencies, we chose memberlist.

This requires deploying a Service for the gossip ring.

gossip_ring.yaml is as follows:

apiVersion: v1
kind: Service
metadata:
  labels:
    name: loki-gossip-ring
  name: loki-gossip-ring
  namespace: loki
spec:
  ports:
  - name: gossip-ring
    port: 7946
    targetPort: 7946
    protocol: TCP
  selector:
    gossip_ring_member: 'true'

5: Deploy the distributor:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: distributor
  namespace: loki
  labels:
    app: distributor
spec:
  minReadySeconds: 10
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: distributor
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "3100"
        prometheus.io/scrape: "true"
      labels:
        app: distributor
        gossip_ring_member: 'true'
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: distributor
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -config.file=/etc/loki/config/config.yaml
        - -limits.per-user-override-config=/etc/loki/overrides/overrides.yaml
        - -target=distributor
        image: grafana/loki:1.6.1
        imagePullPolicy: IfNotPresent
        name: distributor
        ports:
        - containerPort: 3100
          name: http-metrics
        - containerPort: 9095
          name: grpc
        - containerPort: 7946
          name: gossip-ring       
        readinessProbe:
          httpGet:
            path: /ready
            port: 3100
          initialDelaySeconds: 15
          timeoutSeconds: 1
        resources:
          limits:
            cpu: "1"
            memory: 1Gi
          requests:
            cpu: 500m
            memory: 500Mi
        volumeMounts:
        - mountPath: /etc/loki/config
          name: loki
        - mountPath: /etc/loki/overrides
          name: overrides
      volumes:
      - configMap:
          name: loki
        name: loki
      - configMap:
          name: overrides
        name: overrides

---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: distributor
  name: distributor
  namespace: loki
spec:
  ports:
  - name: distributor-http-metrics
    port: 3100
    targetPort: 3100
  - name: distributor-grpc
    port: 9095
    targetPort: 9095
  selector:
    app: distributor

6: Deploy the ingester:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ingester
  namespace: loki
  labels:
    app: ingester
spec:
  updateStrategy:
    type: RollingUpdate
  replicas: 3
  serviceName: ingester
  selector:
    matchLabels:
      app: ingester
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "3100"
        prometheus.io/scrape: "true"
      labels:
        app: ingester
        gossip_ring_member: 'true'
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      securityContext:
        fsGroup: 10001
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: ingester
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -config.file=/etc/loki/config/config.yaml
        - -limits.per-user-override-config=/etc/loki/overrides/overrides.yaml
        - -target=ingester
        image: grafana/loki:1.6.1
        imagePullPolicy: IfNotPresent
        name: ingester
        ports:
        - containerPort: 3100
          name: http-metrics
        - containerPort: 9095
          name: grpc
        - containerPort: 7946
          name: gossip-ring 
        readinessProbe:
          httpGet:
            path: /ready
            port: 3100
          initialDelaySeconds: 15
          timeoutSeconds: 1
        resources:
          limits:
            cpu: "2"
            memory: 10Gi
          requests:
            cpu: "1"
            memory: 5Gi
        volumeMounts:
        - mountPath: /etc/loki/config
          name: loki
        - mountPath: /etc/loki/overrides
          name: overrides
        - mountPath: /data
          name: ingester-data
      terminationGracePeriodSeconds: 4800
      volumes:
      - configMap:
          name: loki
        name: loki
      - configMap:
          name: overrides
        name: overrides
  volumeClaimTemplates:
  - metadata:
      labels:
        app: ingester
      name: ingester-data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
      storageClassName: gp2
---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: ingester
  name: ingester
  namespace: loki
spec:
  ports:
  - name: ingester-http-metrics
    port: 3100
    targetPort: 3100
  - name: ingester-grpc
    port: 9095
    targetPort: 9095
  selector:
    app: ingester

7: Deploy the query frontend:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: query-frontend
  namespace: loki
  labels:
    app: query-frontend
spec:
  minReadySeconds: 10
  replicas: 2
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: query-frontend
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "3100"
        prometheus.io/scrape: "true"
      labels:
        app: query-frontend
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: query-frontend
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -config.file=/etc/loki/config/config.yaml
        - -limits.per-user-override-config=/etc/loki/overrides/overrides.yaml
        - -log.level=debug
        - -target=query-frontend
        image: grafana/loki:master-92ace83
        imagePullPolicy: IfNotPresent
        name: query-frontend
        ports:
        - containerPort: 3100
          name: http-metrics
        - containerPort: 9095
          name: grpc
        readinessProbe:
          httpGet:
            path: /ready
            port: 3100
          initialDelaySeconds: 15
          timeoutSeconds: 1
        resources:
          limits:
            memory: 1200Mi
          requests:
            cpu: "2"
            memory: 600Mi
        volumeMounts:
        - mountPath: /etc/loki/config
          name: loki
        - mountPath: /etc/loki/overrides
          name: overrides
      volumes:
      - configMap:
          name: loki
        name: loki
      - configMap:
          name: overrides
        name: overrides

---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: query-frontend
  name: query-frontend
  namespace: loki
spec:
  clusterIP: None
  publishNotReadyAddresses: true
  ports:
  - name: query-frontend-http-metrics
    port: 3100
    targetPort: 3100
  - name: query-frontend-grpc
    port: 9095
    targetPort: 9095
  selector:
    app: query-frontend

8: Deploy the querier:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    name: querier
  name: querier
  namespace: loki
spec:
  updateStrategy:
    type: RollingUpdate
  replicas: 3
  serviceName: querier
  selector:
    matchLabels:
      app: querier
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "3100"
        prometheus.io/scrape: "true"
      labels:
        app: querier
        gossip_ring_member: 'true'
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      securityContext:
        fsGroup: 10001
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: querier
            topologyKey: kubernetes.io/hostname
      containers:
      - args:
        - -config.file=/etc/loki/config/config.yaml
        - -limits.per-user-override-config=/etc/loki/overrides/overrides.yaml
        - -target=querier
        image: grafana/loki:1.6.1
        imagePullPolicy: IfNotPresent
        name: querier
        ports:
        - containerPort: 3100
          name: http-metrics
        - containerPort: 9095
          name: grpc
        - containerPort: 7946
          name: gossip-ring 
        readinessProbe:
          httpGet:
            path: /ready
            port: 3100
          initialDelaySeconds: 15
          timeoutSeconds: 1
        resources:
          requests:
            cpu: "4"
            memory: 2Gi
        volumeMounts:
        - mountPath: /etc/loki/config
          name: loki
        - mountPath: /etc/loki/overrides
          name: overrides
        - mountPath: /data
          name: querier-data
      volumes:
      - configMap:
          name: loki
        name: loki
      - configMap:
          name: overrides
        name: overrides
  volumeClaimTemplates:
  - metadata:
      labels:
        app: querier
      name: querier-data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 10Gi
      storageClassName: gp2

---
apiVersion: v1
kind: Service
metadata:
  labels:
    name: querier
  name: querier
  namespace: loki
spec:
  ports:
  - name: querier-http-metrics
    port: 3100
    targetPort: 3100
  - name: querier-grpc
    port: 9095
    targetPort: 9095
  selector:
    app: querier

9: Deploy the table manager:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: table-manager
  namespace: loki
  labels:
    app: table-manager
spec:
  minReadySeconds: 10
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: table-manager
  template:
    metadata:
      annotations:
        prometheus.io/path: /metrics
        prometheus.io/port: "3100"
        prometheus.io/scrape: "true"
      labels:
        app: table-manager
    spec:
      nodeSelector:
        group: loki
      tolerations:
      - effect: NoExecute
        key: app
        operator: Exists
      containers:
      - args:
        - -bigtable.backoff-on-ratelimits=true
        - -bigtable.grpc-client-rate-limit=5
        - -bigtable.grpc-client-rate-limit-burst=5
        - -bigtable.table-cache.enabled=true
        - -config.file=/etc/loki/config/config.yaml
        - -limits.per-user-override-config=/etc/loki/overrides/overrides.yaml
        - -target=table-manager
        image: grafana/loki:1.6.1
        imagePullPolicy: IfNotPresent
        name: table-manager
        ports:
        - containerPort: 3100
          name: http-metrics
        - containerPort: 9095
          name: grpc
        readinessProbe:
          httpGet:
            path: /ready
            port: 3100
          initialDelaySeconds: 15
          timeoutSeconds: 1
        resources:
          limits:
            cpu: 200m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - mountPath: /etc/loki/config
          name: loki
      volumes:
      - configMap:
          name: loki
        name: loki

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: table-manager
  name: table-manager
  namespace: loki
spec:
  ports:
  - name: table-manager-grpc
    port: 9095
    targetPort: 9095
  selector:
    app: table-manager

Once the deployment is complete, check the status of all Pods:

kubectl get pods -n loki
NAME                              READY   STATUS    RESTARTS   AGE
distributor-84747955fb-hhtzl      1/1     Running   0          8d
distributor-84747955fb-pq9wn      1/1     Running   0          8d
distributor-84747955fb-w66hp      1/1     Running   0          8d
ingester-0                        1/1     Running   0          8d
ingester-1                        1/1     Running   0          8d
ingester-2                        1/1     Running   0          8d
memcached-0                       2/2     Running   0          3d2h
memcached-1                       2/2     Running   0          3d2h
memcached-2                       2/2     Running   0          3d2h
memcached-frontend-0              2/2     Running   0          3d2h
memcached-frontend-1              2/2     Running   0          3d2h
memcached-frontend-2              2/2     Running   0          3d2h
memcached-index-queries-0         2/2     Running   0          3d2h
memcached-index-queries-1         2/2     Running   0          3d2h
memcached-index-queries-2         2/2     Running   0          3d2h
memcached-index-writes-0          2/2     Running   0          3d3h
memcached-index-writes-1          2/2     Running   0          3d3h
memcached-index-writes-2          2/2     Running   0          3d3h
querier-0                         1/1     Running   0          3d3h
querier-1                         1/1     Running   0          3d3h
querier-2                         1/1     Running   0          3d3h
query-frontend-6c8ffc8667-qj5zq   1/1     Running   0          8d
table-manager-c4fdf6475-zjzqg     1/1     Running   0          8d

With that, the deployment is done. The promtail agent is not covered in this article; quite a few shippers now support Loki as a destination, such as Fluent Bit.

Visualization

Grafana supports querying Loki directly through its Explore view.


Queries are built by combining labels.
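Besides adding the data source by hand in the Grafana UI, it can also be provisioned declaratively. A minimal sketch, assuming the query-frontend Service from this deployment as the endpoint:

```yaml
# e.g. /etc/grafana/provisioning/datasources/loki.yaml
apiVersion: 1
datasources:
  - name: Loki
    type: loki
    access: proxy
    url: http://query-frontend.loki.svc.cluster.local:3100
```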

Summary

If full-text indexing is not a hard requirement, Loki is a strong choice for a Kubernetes logging system.

