Monitoring a Go service with Prometheus and Grafana
source link: https://studygolang.com/articles/25599
Environment
CentOS 7.0
Prometheus 2.14.0
Grafana 6.5.2
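Download and install Prometheus
wget https://github.com/prometheus/prometheus/releases/download/v2.14.0/prometheus-2.14.0.linux-386.tar.gz
tar -xavf prometheus-2.14.0.linux-386.tar.gz
Start
The extracted directory already contains a default configuration file, prometheus.yml. It can be used as-is to start Prometheus.
./prometheus --config.file=prometheus.yml
Enter the host IP followed by :9090 in a browser to see the Prometheus UI.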
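Metric types
<1> Counter: a counter whose value only ever increases (it starts again from zero when the process restarts). It represents a cumulative total and is typically used for the number of requests served or the number of errors.
<2> Gauge: a gauge (like a dial on a dashboard). It represents an instantaneous value that can go up or down arbitrarily. It is typically used for memory usage, disk usage, the number of open files, and so on.
<3> Histogram: samples observations and counts them in configurable buckets, showing how the values are distributed across ranges. The collected data is usually displayed as a histogram. It is typically used for request or response durations.
<4> Summary: also samples observations over a period of time, but reports quantiles computed directly in the client rather than derived from bucket ranges, together with the total sum and count of the observations.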
Grafana
Download
wget https://dl.grafana.com/oss/release/grafana-6.5.2-1.x86_64.rpm
Install
sudo yum localinstall grafana-6.5.2-1.x86_64.rpm
Start
systemctl daemon-reload
systemctl start grafana-server
systemctl status grafana-server
Configuration file
The service environment file is /etc/sysconfig/grafana-server. It points to the main configuration file through CONF_FILE:
GRAFANA_USER=grafana
GRAFANA_GROUP=grafana
GRAFANA_HOME=/usr/share/grafana
LOG_DIR=/var/log/grafana
DATA_DIR=/var/lib/grafana
MAX_OPEN_FILES=10000
CONF_DIR=/etc/grafana
CONF_FILE=/etc/grafana/grafana.ini
RESTART_ON_UPGRADE=true
PLUGINS_DIR=/var/lib/grafana/plugins
PROVISIONING_CFG_DIR=/etc/grafana/provisioning
# Only used on systemd systems
PID_FILE_DIR=/var/run/grafana
Access
Enter the host IP followed by :3000 in a browser. For the first login, the username and password are both admin.
After logging in, you are asked to create your first data source.
Then create a new dashboard.
Examples
The following examples show each metric type in practice.
For the test code, see the example code.
Counter
This example monitors the number of RPC calls. A counter's value accumulates continuously.
Key parts of the Go code:
// Create a new CounterVec, partitioned by the "api" label.
rpcCounter = prometheus.NewCounterVec(
    prometheus.CounterOpts{
        Name: "rpc_counter",
        Help: "RPC counts",
    },
    []string{"api"},
)

// Register the collector with the default registry.
prometheus.MustRegister(rpcCounter)

// Add the given value to the counter for each label value.
rpcCounter.WithLabelValues("api_bookcontent").Add(float64(rand.Int31n(50)))
rpcCounter.WithLabelValues("api_chapterlist").Add(float64(rand.Int31n(10)))
Add the following to the Prometheus configuration file, under scrape_configs:
  - job_name: 'req-monitor'
    static_configs:
      - targets: ['localhost:8082']
        labels:
          group: 'newgroup1'
Restart Prometheus (replace xxxx with the PID of the running process)
ps -aux | grep prometheus
kill -9 xxxx
./prometheus --config.file=prometheus.yml
Build the program (to run on Linux)
GOOS=linux go build
Run
./prometheus_rpc_http -listen-address=:8082 &
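Once the program is running, Prometheus scrapes the target at its default metrics path, /metrics, and reads the metrics in the text exposition format. The sketch below uses only the standard library to show roughly what that output looks like for the labelled counter. The helper renderCounter and the sample values are hypothetical, and the label escaping via %q is only an approximation of the real format; in the actual program the Prometheus client library produces this output.

```go
package main

import (
    "fmt"
    "sort"
    "strings"
)

// renderCounter formats a counter with a single label in the Prometheus
// text exposition format. values maps each label value to its counter value.
// This is an illustrative helper, not part of the client library.
func renderCounter(name, help, label string, values map[string]float64) string {
    var b strings.Builder
    fmt.Fprintf(&b, "# HELP %s %s\n", name, help)
    fmt.Fprintf(&b, "# TYPE %s counter\n", name)

    // Sort the label values so the output is deterministic.
    keys := make([]string, 0, len(values))
    for k := range values {
        keys = append(keys, k)
    }
    sort.Strings(keys)

    for _, k := range keys {
        fmt.Fprintf(&b, "%s{%s=%q} %g\n", name, label, k, values[k])
    }
    return b.String()
}

func main() {
    // Made-up values standing in for the random increments in the article.
    fmt.Print(renderCounter("rpc_counter", "RPC counts", "api", map[string]float64{
        "api_bookcontent": 37,
        "api_chapterlist": 4,
    }))
}
```

Each line such as rpc_counter{api="api_bookcontent"} 37 becomes one time series in Prometheus, which is why the label values chosen in the code show up as separate lines in the graphs.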
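View in Prometheus
Create a new dashboard in Grafana
The query is rate(rpc_counter[1m]), which gives the per-second average rate of increase of rpc_counter, calculated over the last 1 minute.
The graph shows two lines, api="api_bookcontent" and api="api_chapterlist", which are exactly the labels set in the code through rpcCounter.WithLabelValues().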
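To make the meaning of rate() more concrete, here is a minimal sketch of the underlying idea: the increase of the counter over the window divided by the elapsed seconds. It is deliberately simplified. The real PromQL rate() also handles counter resets and extrapolates to the window boundaries, which this sketch does not. The sample type and the numbers are assumptions made up for illustration.

```go
package main

import "fmt"

// sample is one scraped counter value at a given time (in seconds).
type sample struct {
    ts    float64
    value float64
}

// simpleRate returns the per-second increase between the first and last
// sample in the window. Unlike PromQL's rate(), it ignores counter resets
// and does not extrapolate to the edges of the window.
func simpleRate(window []sample) float64 {
    if len(window) < 2 {
        return 0
    }
    first, last := window[0], window[len(window)-1]
    elapsed := last.ts - first.ts
    if elapsed <= 0 {
        return 0
    }
    return (last.value - first.value) / elapsed
}

func main() {
    // Five scrapes, 15 seconds apart, covering one minute.
    window := []sample{{0, 100}, {15, 130}, {30, 190}, {45, 220}, {60, 280}}
    fmt.Println(simpleRate(window)) // (280 - 100) / 60 = 3 per second
}
```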
Gauge
Key parts of the Go code:
// Create a new GaugeVec, partitioned by the "api" label.
rpcReqSize = prometheus.NewGaugeVec(
    prometheus.GaugeOpts{
        Name: "rpc_req_size",
        Help: "RPC request size",
    },
    []string{"api"},
)

prometheus.MustRegister(rpcReqSize)

// Set overwrites the current value; a gauge can go up or down.
rpcReqSize.WithLabelValues("api_bookcontent").Set(float64(rand.Int31n(8000)))
rpcReqSize.WithLabelValues("api_chapterlist").Set(float64(rand.Int31n(5000)))
View in Prometheus
Create a new dashboard in Grafana
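The difference between the two metric types seen so far can be summed up in a toy model: a counter only accepts non-negative increments, while a gauge can be set or moved in either direction. The types below are hypothetical stand-ins written with the standard library only; they are not the Prometheus client types. In the real client library, Counter.Add panics on a negative value instead of returning an error.

```go
package main

import (
    "errors"
    "fmt"
)

// counter is a toy model of a Prometheus counter: it can only go up.
type counter struct{ value float64 }

// Add increases the counter and rejects negative deltas.
func (c *counter) Add(delta float64) error {
    if delta < 0 {
        return errors.New("counter cannot decrease")
    }
    c.value += delta
    return nil
}

// gauge is a toy model of a Prometheus gauge: an arbitrary current value.
type gauge struct{ value float64 }

// Set overwrites the current value.
func (g *gauge) Set(v float64) { g.value = v }

// Add moves the value up or down by delta.
func (g *gauge) Add(delta float64) { g.value += delta }

func main() {
    var requests counter
    _ = requests.Add(3)
    _ = requests.Add(2)
    err := requests.Add(-1) // rejected, value stays at 5
    fmt.Println(requests.value, err)

    var reqSize gauge
    reqSize.Set(8000)
    reqSize.Add(-3000) // a gauge may decrease
    fmt.Println(reqSize.value)
}
```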
Histogram
Key parts of the Go code:
httpReqDurationsHistogram = prometheus.NewHistogramVec(
    prometheus.HistogramOpts{
        Name: "http_req_durations_histogram",
        Help: "http req latency distributions.",
        // 4 buckets, starting from 0.1 and adding 0.5 between each bucket
        Buckets: prometheus.LinearBuckets(0.1, 0.5, 4),
    },
    []string{"http_req_histogram"},
)

prometheus.MustRegister(httpReqDurationsHistogram)

// Observe a random duration in the range [0, 1.5).
v := rand.Float64()
httpReqDurationsHistogram.WithLabelValues("booksvc_req").Observe(1.5 * v)
View in Prometheus
The code defines 4 buckets, and the graph shows the corresponding four bucket series (le="0.1", le="0.6", le="1.1", le="1.6").
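The four le values follow directly from the arguments to LinearBuckets. The sketch below reimplements that calculation with the standard library only, so the helper linearBuckets is a stand-in written for illustration rather than the library function itself. Note that Prometheus also exposes an additional le="+Inf" bucket that holds every observation.

```go
package main

import "fmt"

// linearBuckets returns count upper bounds, starting at start and growing
// by width each step. It mirrors the documented behaviour of
// prometheus.LinearBuckets but is a local illustrative helper.
func linearBuckets(start, width float64, count int) []float64 {
    bounds := make([]float64, count)
    for i := range bounds {
        bounds[i] = start
        start += width
    }
    return bounds
}

func main() {
    // Same arguments as in the article: start 0.1, width 0.5, 4 buckets.
    for _, b := range linearBuckets(0.1, 0.5, 4) {
        fmt.Printf("le=\"%g\"\n", b)
    }
}
```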
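Create a new dashboard in Grafana
The query is rate(http_req_durations_histogram_bucket[30s]).
It computes the per-second average rate of increase of each http_req_durations_histogram_bucket series over the last 30 seconds.
From the values, responses within 0.1 s account for 1.3%, within 0.6 s for 17.3%, within 1.1 s for 34.7%, and within 1.6 s for 60%.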
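Histogram buckets are cumulative: each le bucket counts every observation less than or equal to its upper bound, so the share for a bucket is its count divided by the total number of observations. The following standard-library sketch shows that bookkeeping on a handful of made-up observations. The function names and the data are assumptions for illustration and are not taken from the article's program.

```go
package main

import "fmt"

// cumulativeCounts returns, for each upper bound, how many observations
// are less than or equal to it (the meaning of the "le" label).
func cumulativeCounts(bounds, observations []float64) []int {
    counts := make([]int, len(bounds))
    for _, v := range observations {
        for i, le := range bounds {
            if v <= le {
                counts[i]++
            }
        }
    }
    return counts
}

// shares converts cumulative counts into fractions of the total.
func shares(counts []int, total int) []float64 {
    out := make([]float64, len(counts))
    if total == 0 {
        return out
    }
    for i, c := range counts {
        out[i] = float64(c) / float64(total)
    }
    return out
}

func main() {
    bounds := []float64{0.1, 0.6, 1.1, 1.6}
    // Hypothetical request durations in seconds.
    obs := []float64{0.05, 0.3, 0.5, 0.9, 1.2, 1.4, 0.7, 1.0}

    counts := cumulativeCounts(bounds, obs)
    fmt.Println(counts)                   // [1 3 6 8]
    fmt.Println(shares(counts, len(obs))) // [0.125 0.375 0.75 1]
}
```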
Summary
Key parts of the Go code:
rpcDurations = prometheus.NewSummaryVec(
    prometheus.SummaryOpts{
        Name: "rpc_durations_seconds",
        Help: "RPC latency distributions.",
        // Each entry maps a quantile to its allowed absolute error.
        Objectives: map[float64]float64{0.5: 0.05, 0.9: 0.01, 0.99: 0.001},
    },
    []string{"service"},
)

prometheus.MustRegister(rpcDurations)

// Observe a random duration for each service, with a different offset.
v = rand.Float64()
rpcDurations.WithLabelValues("user_rpc").Observe(v)
v = 0.5 + rand.Float64()
rpcDurations.WithLabelValues("book_rpc").Observe(v)
v = 1.0 + rand.Float64()
rpcDurations.WithLabelValues("bookshelf_rpc").Observe(v)
View in Prometheus
Create a new dashboard in Grafana
The query is rate(rpc_durations_seconds_sum[1m]).
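On its own, the rate of the _sum series is the number of seconds of latency accumulated per second. A common follow-up is to divide it by the rate of the matching rpc_durations_seconds_count series, which yields the mean latency over the window. The sketch below shows that arithmetic on two hypothetical scrapes; the snapshot type and the numbers are assumptions for illustration.

```go
package main

import "fmt"

// snapshot holds the _sum and _count series of a summary at one scrape.
type snapshot struct {
    sum   float64 // total observed seconds so far
    count float64 // number of observations so far
}

// meanLatency returns the average observed value between two snapshots,
// the same ratio that dividing rate(..._sum) by rate(..._count) produces.
func meanLatency(prev, cur snapshot) float64 {
    n := cur.count - prev.count
    if n <= 0 {
        return 0
    }
    return (cur.sum - prev.sum) / n
}

func main() {
    prev := snapshot{sum: 120, count: 100}
    cur := snapshot{sum: 150, count: 140}
    fmt.Println(meanLatency(prev, cur)) // 30 seconds over 40 calls = 0.75
}
```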