Deploying Alertmanager + Dingtalk + Prometheus + Grafana with Docker-compose for DingTalk alerting

Deploying the monitoring stack

version: '3.7'
services:
  # dingtalk
  dingtalk:
    image: timonwong/prometheus-webhook-dingtalk:latest
    container_name: dingtalk
    restart: always
    command:
      - '--config.file=/etc/prometheus-webhook-dingtalk/config.yml'
    volumes:
      - /data/monitor/dingtalk/config.yml:/etc/prometheus-webhook-dingtalk/config.yml
      - /etc/localtime:/etc/localtime:ro
    ports:
      - "8060:8060"
  # alertmanager
  alertmanager:
    image: prom/alertmanager:latest
    container_name: alertmanager
    restart: always
    volumes:
      - /data/monitor/alertmanager/config/alertmanager.yml:/etc/alertmanager/alertmanager.yml
    ports:
      - "9093:9093"
  # prometheus
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    restart: always
    ports:
      - "9090:9090"
    volumes:
      - /data/monitor/promethues/prometheus.yml:/etc/prometheus/prometheus.yml
      - /data/monitor/promethues/alert.yml:/etc/prometheus/rule.yml
      - /etc/localtime:/etc/localtime:ro
  # grafana
  grafana:
    image: grafana/grafana
    container_name: grafana
    restart: always
    ports:
      - "3000:3000"
    volumes:
      - /data/monitor/grafana:/var/lib/grafana
  # node-exporter
  node-exporter:
    image: prom/node-exporter
    container_name: node-exporter
    restart: always
    ports:
      - "9100:9100"
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro

Dingtalk configuration file

/data/monitor/dingtalk/config.yml

templates:
  - /etc/prometheus-webhook-dingtalk/templates/templates.tmpl
targets: # multiple receivers can be configured here
  webhook2:
    url: https://oapi.dingtalk.com/robot/send?access_token=<DingTalk token>
    secret: <DingTalk signing secret>
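The `secret` field above corresponds to the "sign" security option of a DingTalk custom robot. The webhook service takes care of signing when `secret` is set, so nothing below is required for the deployment. As an illustration only, here is a minimal Python sketch of the signing scheme as DingTalk documents it (HMAC-SHA256 over `timestamp\nsecret`, Base64-encoded, then URL-encoded). The function names and the example secret are hypothetical.

```python
import base64
import hashlib
import hmac
import urllib.parse


def dingtalk_sign(secret: str, timestamp_ms: int) -> str:
    """Return the URL-encoded signature for a signed DingTalk robot request."""
    # The string to sign is the millisecond timestamp, a newline, then the secret.
    string_to_sign = f"{timestamp_ms}\n{secret}"
    digest = hmac.new(
        secret.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    # Base64-encode the raw digest, then URL-encode it for use in a query string.
    encoded = base64.b64encode(digest).decode("ascii")
    return urllib.parse.quote_plus(encoded)


def signed_url(base_url: str, secret: str, timestamp_ms: int) -> str:
    """Append timestamp and sign parameters to a robot URL that already has access_token."""
    sign = dingtalk_sign(secret, timestamp_ms)
    return f"{base_url}&timestamp={timestamp_ms}&sign={sign}"
```

In practice the timestamp would be the current time in milliseconds, for example `int(time.time() * 1000)`.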

Alertmanager configuration file

/data/monitor/alertmanager/config/alertmanager.yml

global:
  resolve_timeout: 5m
  smtp_smarthost: 'smtp.qiye.163.com:465'   # SMTP relay server; with SSL enabled the port is usually 465
  smtp_from: 'user@163.com'                 # sender address
  smtp_auth_username: 'user@163.com'        # mailbox account
  smtp_auth_password: 'password'            # mailbox password or authorization code
  smtp_require_tls: false
route:
  receiver: 'default'
  group_wait: 10s
  group_interval: 1m
  repeat_interval: 1h
  group_by: ['alertname']
inhibit_rules:
  - source_match:
      severity: 'critical'
    target_match:
      severity: 'warning'
    equal: ['alertname', 'instance']
receivers:
  - name: 'default'
    webhook_configs:
      - url: 'http://dingtalk-IP:8060/dingtalk/webhook2/send'   # webhook2 must match the target name under targets in the Dingtalk config
        send_resolved: true
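To check the Alertmanager-to-DingTalk path without waiting for a real alert, one option is to push a hand-made alert to Alertmanager's v2 API (`POST /api/v2/alerts`). The following is a minimal sketch using only the Python standard library. The alert name, labels, and helper names are made up for illustration, and `http://IP:9093` stands for wherever Alertmanager is reachable.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone


def build_test_alert(alertname="TestAlert", instance="test-host", minutes=5, now=None):
    """Build a one-element alert list in the shape the v2 alerts endpoint accepts."""
    now = now or datetime.now(timezone.utc)
    return [{
        "labels": {"alertname": alertname, "instance": instance, "severity": "warning"},
        "annotations": {"summary": "Test alert sent manually"},
        "startsAt": now.isoformat(),
        # After endsAt passes, Alertmanager treats the alert as resolved.
        "endsAt": (now + timedelta(minutes=minutes)).isoformat(),
    }]


def post_alerts(alertmanager_url, alerts):
    """POST the alerts to Alertmanager and return the HTTP status code."""
    req = urllib.request.Request(
        f"{alertmanager_url}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status


# Example call (requires a reachable Alertmanager):
# post_alerts("http://IP:9093", build_test_alert())
```

With `group_wait: 10s` in the route, the notification should reach the webhook roughly ten seconds after the alert is accepted.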

Prometheus configuration: prometheus.yml

/data/monitor/promethues/prometheus.yml

global:
  scrape_interval: 60s
  evaluation_interval: 60s
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['IP:9093']
rule_files:
  - "/etc/prometheus/rule.yml"
  - "rules/*.yml"
scrape_configs:
  - job_name: prometheus
    static_configs:
      - targets: ['localhost:9090']
        labels:
          instance: prometheus
  - job_name: lite
    static_configs:
      - targets: ['IP:9100']
        labels:
          env: dev
  - job_name: redis_exporter
    static_configs:
      - targets: ['IP:9121']
        labels:
          env: dev
          ident: redis
  - job_name: mysql_exporter
    static_configs:
      - targets: ['IP:9104']
        labels:
          env: dev
          ident: mysql
  - job_name: emqx_exporter
    metrics_path: /api/v5/prometheus/stats
    scrape_interval: 5s
    honor_labels: true
    static_configs:
      - targets: ['IP:18083']
  - job_name: 'alertmanager'
    scrape_interval: 15s
    static_configs:
      - targets: ['IP:9093']   # Alertmanager listens on 9093; 9100 is node-exporter
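Once Prometheus is running with this file, the scrape jobs can be checked through the HTTP API endpoint `/api/v1/targets`, which reports each active target with a `health` field. The sketch below lists targets that are not `up`. The helper names are hypothetical and `prometheus_url` is whatever address Prometheus is exposed on, for example `http://IP:9090`.

```python
import json
import urllib.request


def unhealthy_targets(targets_response: dict) -> list:
    """Return (job, scrapeUrl, health) for every active target that is not 'up'."""
    active = targets_response.get("data", {}).get("activeTargets", [])
    return [
        (t.get("labels", {}).get("job"), t.get("scrapeUrl"), t.get("health"))
        for t in active
        if t.get("health") != "up"
    ]


def fetch_targets(prometheus_url: str) -> dict:
    """Fetch the raw /api/v1/targets response from a running Prometheus."""
    with urllib.request.urlopen(f"{prometheus_url}/api/v1/targets", timeout=10) as resp:
        return json.load(resp)


# Example call (requires a reachable Prometheus):
# print(unhealthy_targets(fetch_targets("http://IP:9090")))
```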

Prometheus configuration: alert.yml

/data/monitor/promethues/alert.yml

groups:
  - name: Server host monitoring alerts
    rules:
      - alert: Internal server monitoring
        expr: up{job="internal-servers"} == 0   # the job value must match a job_name in prometheus.yml
        for: 0m
        labels:
          severity: critical
        annotations:
          description: "The monitored target is gone; check the server itself or the node_exporter service"
      - alert: "Memory alert"
        expr: 100 - ((node_memory_MemAvailable_bytes * 100) / node_memory_MemTotal_bytes) > 95
        for: 1m  # how long the condition must hold before the alert is sent to Alertmanager
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.instance }} memory usage is too high, please handle it as soon as possible!"
          description: "{{ $labels.instance }} memory usage is above 95%, current usage {{ $value }}%."
      - alert: "Disk space alert"
        expr: (1 - node_filesystem_avail_bytes{fstype=~"ext4|xfs"} / node_filesystem_size_bytes{fstype=~"ext4|xfs"}) * 100 > 95
        for: 60s
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.instance }} disk space usage is above 95%"
          description: "{{ $labels.instance }} disk usage is above 95%, current usage {{ $value }}%."
      - alert: "CPU alert"
        expr: 100 - (avg(irate(node_cpu_seconds_total{mode="idle"}[5m])) by (instance) * 100) > 95
        for: 120s
        labels:
          severity: warning
          instance: "{{ $labels.instance }}"
        annotations:
          summary: "{{ $labels.instance }} CPU usage is above 95%"
          description: "{{ $labels.instance }} CPU usage is above 95%, current usage {{ $value }}%."
      - alert: "Disk IO alert"
        expr: ((irate(node_disk_io_time_seconds_total[30m])) * 100) > 95
        for: 3m
        labels:
          severity: warning
        annotations:
          summary: "{{ $labels.instance }} disk IO utilization is too high, please handle it as soon as possible!"
          description: "{{ $labels.instance }} disk IO utilization is above 95%, current value {{ $value }}%."
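The memory and disk expressions are plain percentage arithmetic over node-exporter gauges, and `{{ $value }}` in the annotations is the result of that arithmetic. As a rough sanity check, the same calculations can be reproduced in Python. The function names and sample numbers below are illustrative only.

```python
def memory_usage_percent(mem_available_bytes: float, mem_total_bytes: float) -> float:
    """Mirror of: 100 - ((node_memory_MemAvailable_bytes * 100) / node_memory_MemTotal_bytes)."""
    return 100 - (mem_available_bytes * 100) / mem_total_bytes


def disk_usage_percent(avail_bytes: float, size_bytes: float) -> float:
    """Mirror of: (1 - node_filesystem_avail_bytes / node_filesystem_size_bytes) * 100."""
    return (1 - avail_bytes / size_bytes) * 100


def would_fire(value: float, threshold: float) -> bool:
    """A rule's comparison is a strict greater-than against its threshold."""
    return value > threshold


# A host with 2 GiB available out of 8 GiB total is using 75% of its memory.
print(memory_usage_percent(2 * 1024**3, 8 * 1024**3))
```

Note that the condition must also stay true for the whole `for:` duration before Prometheus sends the alert on to Alertmanager.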