「Alertmanager」- The alert notification tool for Prometheus

Overview

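The content of this section comes from the official documentation. Only the parts we need to pay attention to are extracted, so that we can form an overall understanding of Alertmanager.
The subsections record how to use Alertmanager for specific tasks, such as configuring Slack alerts and customizing the alert message format.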

ALERTING
— Alerting overview: Concepts and Fundamentals
— Alertmanager: Concepts and Fundamentals
— Configuration: 3.1 Functions and Features
— Clients: Alert Testing
— Notification template reference: Notification Template
— Notification template examples: Notification Template
— Management API: 3.1 Functions and Features
— HTTPS and authentication: 3.1 Functions and Features

Components

WIP

Properties

Alert notification channels

webhook_config
email_config

sns_config AWS SNS https://sns.us-east-2.amazonaws.com
telegram_config Telegram
webex_config Webex
wechat_config https://developers.weixin.qq.com/doc/offiaccount/en/Message_Management/Service_Center_messages.html

msteams_config https://learn.microsoft.com/en-us/microsoftteams/platform/webhooks-and-connectors/what-are-webhooks-and-connectors
msteamsv2_config https://support.microsoft.com/en-gb/office/8ae491c7-0394-4861-ba59-055e33f75498
jira_config https://developer.atlassian.com/cloud/jira/platform/rest/v2/intro/
discord_config https://discord.com/developers/docs/resources/webhook
opsgenie_config https://docs.opsgenie.com/docs/alert-api
pagerduty_config https://developer.pagerduty.com/documentation/integration/events
pushover_config https://pushover.net/api
rocketchat_config https://developer.rocket.chat/reference/api/rest-api/endpoints/messaging/chat-endpoints/postmessage
victorops_config https://help.victorops.com/knowledge-base/rest-endpoint-integration-guide/
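
To make the webhook_config channel more concrete, here is a minimal sketch of what a receiving service might do with the JSON body that Alertmanager posts. The field names used (alerts, status, labels, annotations) follow the webhook payload format as described in the official configuration documentation; the function name summarize_webhook and all sample values are hypothetical and only serve as illustration.

```python
import json


def summarize_webhook(body: str) -> list[str]:
    """Turn an Alertmanager webhook payload (JSON text) into one line per alert."""
    payload = json.loads(body)
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        status = alert.get("status", "unknown")      # "firing" or "resolved"
        name = labels.get("alertname", "unknown")
        severity = labels.get("severity", "none")    # assumes a "severity" label is used
        summary = annotations.get("summary", "")     # assumes a "summary" annotation is used
        lines.append(f"[{status}] {name} severity={severity} {summary}".rstrip())
    return lines


# Hypothetical sample payload, shaped like the documented webhook format.
sample = json.dumps({
    "version": "4",
    "status": "firing",
    "receiver": "example-webhook",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "InstanceDown", "severity": "critical"},
            "annotations": {"summary": "Instance db-1 is down"},
        }
    ],
})

for line in summarize_webhook(sample):
    print(line)
```

With the sample above, the loop prints "[firing] InstanceDown severity=critical Instance db-1 is down". A real receiver would wrap this function in an HTTP handler listening on the URL configured for the webhook.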

Build

Installation and Deployment | Upgrade

About installation methods

See the alertmanager/README.md at master page. To run it as a container, see the prom/alertmanager page.

Usage

These notes record how to use Alertmanager. They mainly study, record, and organize the official documentation, supplemented by other online material, and also include some commonly used configuration examples.

Improvements

Awesome Prometheus alerts | Collection of alerting rules

References

How to check your prometheus.yml is valid – Robust Perception | Prometheus Monitoring Experts
How do I configure the log level of Prometheus’ Alertmanager? – Server Fault
alertmanager/simple.yml at master · prometheus/alertmanager
What’s the difference between group_interval, group_wait, and repeat_interval? – Robust Perception | Prometheus Monitoring Experts
Prometheus: understanding the delays on alerting