Monitoring Kafka Topic Backlog (Lag) with Zabbix: Bug Fixes and Optimization
Date: 2023-04-06 15:09:24
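Contents
- Overview
- Partition auto-discovery
- Getting the lag for the item "test-group/test/partition X"
- Final optimized script
- Integrating with Zabbix
- 1. Zabbix configuration file
- 2. Zabbix auto-discovery
- 3. Item configuration
- 4. Alert message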
Overview
Partition auto-discovery
    # Before optimization:
    # Auto-discovery configuration file
    vim consumer-groups.conf
    # One entry per line, in "consumer group (Group)|Topic" format
    test-group|test

    # Run the script to discover the partitions of the given consumer group and topic
    bash consumer-groups.sh discovery
    {
        "data": [
            { "{#GROUP}":"test-group", "{#TOPICP}":"test", "{#PARTITION}":"0" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test", "{#PARTITION}":"1" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test", "{#PARTITION}":"3" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test", "{#PARTITION}":"2" }
        ]
    }
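The Group|Topic line format of the configuration file can be read with the shell's own field splitting. The following is only a minimal sketch: `parse_conf` is a hypothetical helper name, and skipping blank lines and `#` comment lines is an assumption that the original script does not make.

```shell
#!/bin/bash
# parse_conf is a hypothetical helper (not part of the original script).
# It prints "group topic" for every valid "Group|Topic" line of the file in $1.
parse_conf() {
    local group topic
    # "|| [ -n "$group" ]" also handles a last line without a trailing newline.
    while IFS='|' read -r group topic || [ -n "$group" ]; do
        # Assumption: blank lines and lines starting with '#' are ignored.
        case "$group" in ''|'#'*) continue ;; esac
        # Skip lines that have no Topic part.
        [ -n "$topic" ] || continue
        printf '%s %s\n' "$group" "$topic"
    done < "$1"
}

# Example with a temporary file holding the two entries used in this article.
conf=$(mktemp)
printf '%s\n' '# Group|Topic' 'test-group|test' 'test-group|test1' > "$conf"
parse_conf "$conf"
rm -f "$conf"
```

Splitting on `|` with `IFS` avoids starting one `awk` process per field, and reading line by line keeps an entry intact even if it contains spaces.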
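After going live, we confirmed that everything works as long as the auto-discovery configuration file contains only the single entry test-group|test. However, once we added test-group|test1 as required (that is, a second Topic for the test-group consumer group), auto-discovery returned the following: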
    # Before optimization:
    vim consumer-groups.conf
    # One entry per line, in "consumer group (Group)|Topic" format
    test-group|test
    test-group|test1

    # Run the script to discover the partitions of the given consumer groups and topics
    bash consumer-groups.sh discovery
    {
        "data": [
            { "{#GROUP}":"test-group", "{#TOPICP}":"test", "{#PARTITION}":"0" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test", "{#PARTITION}":"1" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test", "{#PARTITION}":"3" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test", "{#PARTITION}":"2" }
            { "{#GROUP}":"test-group", "{#TOPICP}":"test1", "{#PARTITION}":"0" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test1", "{#PARTITION}":"1" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test1", "{#PARTITION}":"2" }
        ]
    }
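Readers familiar with the Zabbix auto-discovery format will spot the problem: the last Partition entry of each Topic is printed without a trailing ',', so the last entry of test and the first entry of test1 are not separated. The output is therefore not valid JSON, which is what breaks our monitoring items, so the script needs a further change.
After the fix, the output should look like this: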
    # After optimization:
    vim consumer-groups.conf
    # One entry per line, in "consumer group (Group)|Topic" format
    test-group|test
    test-group|test1

    # Run the script to discover the partitions of the given consumer groups and topics
    bash consumer-groups.sh discovery
    {
        "data": [
            { "{#GROUP}":"test-group", "{#TOPICP}":"test", "{#PARTITION}":"0" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test", "{#PARTITION}":"1" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test", "{#PARTITION}":"3" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test", "{#PARTITION}":"2" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test1", "{#PARTITION}":"0" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test1", "{#PARTITION}":"1" },
            { "{#GROUP}":"test-group", "{#TOPICP}":"test1", "{#PARTITION}":"2" }
        ]
    }
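One way to get the commas right regardless of how many Topics are configured is to print the separator before every entry except the first, instead of detecting the last line. This is a sketch of that alternative technique, not the approach the article's script takes; `emit_discovery` is a hypothetical helper and the input rows are made up.

```shell
#!/bin/bash
# emit_discovery is a hypothetical helper (not part of the original script).
# It reads "group topic partition" rows from stdin and prints discovery JSON.
emit_discovery() {
    local sep="" group topic partition
    printf '{\n\t"data": [\n'
    while read -r group topic partition; do
        # Skip blank or incomplete rows.
        [ -n "$partition" ] || continue
        # The separator is empty for the first entry and ",\n" afterwards,
        # so no entry ever needs to know whether it is the last one.
        printf '%s\t\t{ "{#GROUP}":"%s", "{#TOPICP}":"%s", "{#PARTITION}":"%s" }' \
            "$sep" "$group" "$topic" "$partition"
        sep=$',\n'
    done
    printf '\n\t]\n}\n'
}

# Made-up rows covering two Topics of the same consumer group.
printf '%s\n' "test-group test 0" "test-group test 1" "test-group test1 0" | emit_discovery
```

Because the separator logic does not depend on line counts, the output stays valid JSON when a Topic returns no partitions at all.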
Getting the lag for the item "test-group/test/partition X"
With the auto-discovery data in hand, we can go on to fetch the lag of each partition:
    # After optimization:
    # Lag of test-group, Topic test, partition 0
    bash consumer-groups.sh lag test-group test 0
    # Lag of test-group, Topic test, partition 1
    bash consumer-groups.sh lag test-group test 1
    # Lag of test-group, Topic test1, partition 0
    bash consumer-groups.sh lag test-group test1 0
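As the commands show, the lag value is now looked up by consumer group, Topic, and Partition. Without the consumer group and Topic arguments, the lag results of different consumer groups and different Topics cannot be told apart: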
    # Before optimization:
    # Lag of partition 0
    bash consumer-groups.sh lag 0
    # Lag of partition 1
    bash consumer-groups.sh lag 1
    # Lag of partition 2
    bash consumer-groups.sh lag 2
    # Lag of partition 3
    bash consumer-groups.sh lag 3
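The effect of filtering on both Topic and Partition can be tried without a Kafka cluster. The sketch below uses a hypothetical helper `extract_lag` and made-up rows; it assumes each row starts with the columns TOPIC, PARTITION, CURRENT-OFFSET, LOG-END-OFFSET, LAG, which is the layout the article's script relies on.

```shell
#!/bin/bash
# extract_lag is a hypothetical helper (not part of the original script).
# It reads describe-style rows from stdin and prints the LAG column of the
# row whose TOPIC equals $1 and whose PARTITION equals $2.
# Assumed column layout: TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG
extract_lag() {
    awk -v t="$1" -v p="$2" '$1 == t && $2 == p { print $5 }'
}

# Made-up sample rows, not real cluster output.
sample='test 0 100 162 62
test 1 200 200 0
test1 0 50 55 5'

printf '%s\n' "$sample" | extract_lag test 0     # prints 62
printf '%s\n' "$sample" | extract_lag test1 0    # prints 5
```

Both Topics have a partition 0, so a lookup by partition number alone would match two rows here; comparing the Topic field exactly is what keeps test and test1 apart.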
Final optimized script
    # Auto-discovery configuration file
    vim consumer-groups.conf
    # One entry per line, in "consumer group (Group)|Topic" format
    test-group|test
    test-group|test1

    # Auto-discovery and lag calculation script
    vim consumer-groups.sh
    #!/bin/bash
    #
    # comment: monitor topic lag per consumer group for alerting
    # Configuration file format:
    #   consumer group|Topic
    #   test-group|test

    # Fetch the topic information of one consumer group
    cal_topic() {
        if [ $# -ne 2 ]; then
            echo "parameter num error, failed to read topic information"
            exit 1
        else
            /usr/local/kafka/bin/kafka-consumer-groups.sh --bootstrap-server 192.168.3.55:9092 --describe --group "$1" | grep -w "$2" | grep -v none
        fi
    }

    # Topic + partition auto-discovery
    topic_discovery() {
        printf "{\n"
        printf "\t\"data\": [\n"
        m=0
        num=`cat /etc/zabbix/monitor_scripts/consumer-groups.conf | wc -l`
        for line in `cat /etc/zabbix/monitor_scripts/consumer-groups.conf`
        do
            m=`expr $m + 1`
            group=`echo ${line} | awk -F"|" '{print $1}'`
            topic=`echo ${line} | awk -F"|" '{print $2}'`
            cal_topic $group $topic > /tmp/consumer-group-tmp
            count=`cat /tmp/consumer-group-tmp | wc -l`
            n=0
            while read line
            do
                n=`expr $n + 1`
                topicp=`echo $line | awk '{print $1}'`
                partition=`echo $line | awk '{print $2}'`
                # Only the last line of the last Topic is printed without a trailing comma
                if [ $n -eq $count ] && [ $m -eq $num ]; then
                    printf "\t\t{ \"{#GROUP}\":\"${group}\", \"{#TOPICP}\":\"${topicp}\", \"{#PARTITION}\":\"${partition}\" }\n"
                else
                    printf "\t\t{ \"{#GROUP}\":\"${group}\", \"{#TOPICP}\":\"${topicp}\", \"{#PARTITION}\":\"${partition}\" },\n"
                fi
            done < /tmp/consumer-group-tmp
        done
        printf "\t]\n"
        printf "}\n"
    }

    if [ "$1" == "discovery" ]; then
        topic_discovery
    elif [ "$1" == "lag" ]; then
        cal_topic $2 $3 > /tmp/consumer-group
        # The field numbers assume rows that start with TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG;
        # if your Kafka version prints a leading GROUP column, shift them by one.
        cat /tmp/consumer-group | awk -v t=$3 -v p=$4 '{if($1==t && $2==p){print $5}}'
    else
        echo "Usage: /data/scripts/consumer-groups.sh discovery | lag"
    fi

    # Manual run
    ## Auto-discovery
    bash consumer-groups.sh discovery
    ## Lag of test-group, Topic test, partition 0
    bash consumer-groups.sh lag test-group test 0
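The lag branch writes to the fixed path /tmp/consumer-group. If several lag items happen to be polled at the same moment (an assumption about the deployment, not something the article states), two runs could overwrite each other's file. A minimal sketch of one way around this is to pipe the output straight into awk; `lag_no_tmpfile` is a hypothetical name, and `cal_topic` is replaced here by a stand-in that prints made-up rows so the sketch runs without Kafka.

```shell
#!/bin/bash
# Stand-in for the script's cal_topic so this sketch runs without a cluster.
# The rows are made up and follow the assumed layout
# TOPIC PARTITION CURRENT-OFFSET LOG-END-OFFSET LAG.
cal_topic() {
    printf '%s\n' "test 0 100 162 62" "test 1 200 200 0"
}

# lag_no_tmpfile is a hypothetical helper: group, topic, partition in $1..$3.
# Nothing is written to disk, so concurrent runs cannot interfere.
lag_no_tmpfile() {
    cal_topic "$1" "$2" | awk -v t="$2" -v p="$3" '$1 == t && $2 == p { print $5 }'
}

lag_no_tmpfile test-group test 0    # prints 62
```

The same idea applies to the discovery loop, where `mktemp` could provide a private file if a temporary file is still preferred.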
Integrating with Zabbix
1. Zabbix configuration file
    vim userparameter_kafka.conf
    UserParameter=topic_discovery,bash /data/scripts/consumer-groups.sh discovery
    UserParameter=topic_lag[*],bash /data/scripts/consumer-groups.sh lag "$1" "$2" "$3"
2. Zabbix auto-discovery
3. Item configuration
4. Alert message
    Alert host: Kafka_192.168.3.55
    Host IP: 192.168.3.55
    Host group: Kafka
    Alert time: 2022.03.21 00:23:10
    Severity: Average
    Alert message: test-group/test/partition 1: data backlog 100
    Alert item: topic_lag[test-group,test,1]
    Problem details: test-group/test/1: 62