
Prometheus and Alertmanager: false-positive alerts

 2 years ago
source link: https://www.v2ex.com/t/807081


  Geekerstar · 5 hours 40 minutes ago · 144 views

The rules are configured like this:

  - alert: Server down
    expr: up == 0
    for: 1m
    labels:
      severity: emergency
    annotations:
      summary: "{{$labels.instance}}: server down"
      description: "{{$labels.instance}}: server latency has exceeded 5 minutes"

  - alert: MySQL down
    expr: up == 0
    for: 5s
    labels:
      severity: emergency
    annotations:
      summary: "{{$labels.instance}}: MySQL down!!!"
      description: "Please check the running state of the MySQL database"

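I ran into a false alert: the server and MySQL were both healthy, yet both of the rules above fired.

Has anyone else run into false alerts like this? Is my configuration wrong, or is something else going on? Also, is there a way to query the alert history? At the moment I only receive alerts by email.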
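
One possible explanation, offered as a sketch rather than a diagnosis: both rules use the same expression, `up == 0`, with no label selector. `up` is set to 0 for any target whose scrape fails, so a single failed scrape of any exporter (a timeout, for example) can satisfy both rules at once even when the server and MySQL are healthy. The very short `for: 5s` also leaves little room to ride out a brief scrape failure. The version below scopes each rule to its own scrape job. The group name and the job names `node` and `mysql` are assumptions and need to match the `job_name` entries in your own `scrape_configs`.

```yaml
groups:
  - name: availability            # hypothetical group name
    rules:
      - alert: ServerDown
        # Assumption: the host exporter is scraped under job "node".
        expr: up{job="node"} == 0
        for: 1m
        labels:
          severity: emergency
        annotations:
          summary: "{{ $labels.instance }}: server down"
          description: "{{ $labels.instance }}: scrapes have failed for more than 1 minute"

      - alert: MySQLDown
        # Assumption: mysqld_exporter is scraped under job "mysql".
        # up == 0 only means the exporter could not be scraped; if the
        # exporter is reachable, its mysql_up metric reports whether it
        # could connect to the database itself.
        expr: up{job="mysql"} == 0 or mysql_up == 0
        for: 1m
        labels:
          severity: emergency
        annotations:
          summary: "{{ $labels.instance }}: MySQL down"
          description: "Please check the running state of the MySQL database"

# Alert history: Prometheus records pending and firing alerts as the
# synthetic ALERTS time series, so past alerts can be graphed or queried
# in the Prometheus UI for as long as the data is retained, for example:
#   ALERTS{alertname="MySQLDown", alertstate="firing"}
```

Graphing `up` for the affected instance around the time of the email should show whether a scrape actually failed then, which would help separate a real scrape problem from a rule problem.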

