Configure Alerting
AlertmanagerConfig
Alertmanager can be configured in a variety of ways. We recommend creating AlertmanagerConfig resources, which are automatically incorporated into Alertmanager's configuration. The AlertmanagerConfig specification differs somewhat from the one used in Alertmanager's internal configuration file (alertmanager.yml), and the examples below use the AlertmanagerConfig specification.
By default, these are namespaced resources, meaning they only apply to alerts that have a namespace label matching the namespace of the AlertmanagerConfig resource. However, one AlertmanagerConfig can be treated as a global configuration, in which case the namespace is ignored. Instructions for setting an AlertmanagerConfig as global are below.
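As a minimal sketch of the namespaced behavior, the resource below assumes a hypothetical namespace called my-app and a hypothetical Receiver called team-slack; neither name comes from this guide. Because the resource is not global, its Route only considers alerts whose namespace label is my-app. Whether Alertmanager picks the resource up at all also depends on how your installation selects AlertmanagerConfig resources.

```yaml
# Hypothetical namespaced AlertmanagerConfig; all names are placeholders.
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: my-app-alerts        # hypothetical resource name
  namespace: my-app          # only alerts labeled namespace="my-app" are considered
spec:
  route:
    receiver: team-slack
    groupBy:
      - alertname
  receivers:
    - name: team-slack       # notification settings (for example slackConfigs) omitted in this sketch
```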
Understand Routes and Receivers
Alertmanager uses Routes to identify alerts that need to be sent to a notification Receiver. Routes use matchers to find applicable alerts, then send each matching alert to a Receiver. Alertmanager has a default Route that matches all alerts, followed by a list of child Routes that can have their own matching criteria. Typically, the default Route is paired with a "null" Receiver so that you are not bombarded with every possible alert. The example below uses a child Route to match all alerts that have the label severity=critical.
route:
  groupBy:
    - namespace
  groupInterval: 5m
  groupWait: 30s
  receiver: "null"
  repeatInterval: 12h
  routes:
    - groupBy:
        - alertname
      matchers:
        - name: severity
          value: critical
      receiver: slack
In the example above, alerts are routed to a Receiver called "slack." This Receiver is linked to a Slack App through a webhook so that a Slack channel can receive alert notifications. See the Slack documentation for details on creating a webhook.
Once you have a Slack webhook URL, the Slack Receiver can be defined as in the example below. Note the "null" Receiver in the list; it is needed because the default Route refers to it.
receivers:
  - name: "null"
  - name: slack
    slackConfigs:
      - apiURL:
          key: api-url
          name: slack-notification-api-url
        sendResolved: true
        text: |-
          {{ range .Alerts }}{{ .Annotations.description }}
          {{ end }}
        title: |-
          {{ range .Alerts }}[{{ .Status | toUpper }}] {{ .Labels.alertname }} - {{ .Annotations.summary }}
          {{ end }}
The AlertmanagerConfig specification requires that the Slack API URL be stored in a Secret. See receivers[1].slackConfigs[0].apiURL in the example above, where name is the name of the Kubernetes Secret and key is the name of the key inside the Secret that contains the URL.
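A Secret along the following lines would satisfy the name and key values used above. This is only a sketch: the webhook URL is a placeholder, and the prometheus namespace is an assumption based on the full example later on this page. The Secret needs to live in the same namespace as the AlertmanagerConfig that references it.

```yaml
# Sketch of the Secret that holds the Slack webhook URL.
apiVersion: v1
kind: Secret
metadata:
  name: slack-notification-api-url   # must match the "name" value in the Slack Receiver
  namespace: prometheus              # assumption: same namespace as the AlertmanagerConfig
type: Opaque
stringData:
  # The key must match the "key" value in the Slack Receiver.
  # Placeholder value: replace it with your own Slack webhook URL.
  api-url: https://hooks.slack.com/services/REPLACE/WITH/YOUR-WEBHOOK
```

Using stringData lets you write the URL in plain text; Kubernetes stores it base64-encoded under data.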
The Slack Receiver also sets title and text templates to make the Slack message more readable.
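The templates read the summary and description annotations from each alert, so these are only populated if the alerting rules define them. As an illustration, a rule along these lines would produce an alert that matches the severity=critical Route and fills in both fields. The rule name, alert name, and expression are hypothetical, and your installation may require extra labels on the PrometheusRule before Prometheus selects it.

```yaml
# Hypothetical PrometheusRule; names and the expression are illustrative only.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-rules              # hypothetical resource name
  namespace: prometheus            # assumption: namespace where Prometheus is installed
spec:
  groups:
    - name: example.rules
      rules:
        - alert: InstanceDown      # hypothetical alert name
          expr: up == 0            # fires when a scrape target is unreachable
          for: 5m
          labels:
            severity: critical     # matched by the child Route shown earlier
          annotations:
            # Consumed by the Receiver's title and text templates.
            summary: "Instance {{ $labels.instance }} is down"
            description: "{{ $labels.instance }} of job {{ $labels.job }} has been unreachable for more than 5 minutes."
```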
For a more detailed understanding of Alertmanager, including other options for Routes and Receivers, see the Alertmanager documentation and the AlertmanagerConfig specification.
Example AlertmanagerConfig
Below is a full example AlertmanagerConfig that combines the examples above with typical default values. It is important to note that when an AlertmanagerConfig is set as global, it entirely replaces the default Alertmanager configuration. Thus, any default configurations must be included, or they will be lost.
apiVersion: monitoring.coreos.com/v1alpha1
kind: AlertmanagerConfig
metadata:
  name: global-alertmanagerconfig
  namespace: prometheus
spec:
  inhibitRules:
    - equal:
        - namespace
        - alertname
      sourceMatch:
        - name: severity
          value: critical
      targetMatch:
        - matchType: =~
          name: severity
          value: warning|info
    - equal:
        - namespace
        - alertname
      sourceMatch:
        - name: severity
          value: warning
      targetMatch:
        - name: severity
          value: info
    - equal:
        - namespace
      sourceMatch:
        - name: alertname
          value: InfoInhibitor
      targetMatch:
        - name: severity
          value: info
  receivers:
    - name: "null"
    - name: slack
      slackConfigs:
        - apiURL:
            key: api-url
            name: slack-notification-api-url
          sendResolved: true
          text: |-
            {{ range .Alerts }}{{ .Annotations.description }}
            Severity: {{ .Labels.severity }}
            {{ end }}
          title: |-
            {{ range .Alerts }}[{{ .Status | toUpper }}] {{ .Annotations.summary }}
            {{ end }}
  route:
    groupBy:
      - namespace
    groupInterval: 5m
    groupWait: 30s
    receiver: "null"
    repeatInterval: 12h
    routes:
      - matchers:
          - matchType: =~
            name: alertname
            value: InfoInhibitor|Watchdog
        receiver: "null"
      - matchers:
          - name: severity
            value: critical
        receiver: slack
Setting AlertmanagerConfig to be global
The AlertmanagerConfig resource that is to be marked as global must be in the same namespace as Prometheus. If Prometheus was installed using the Privacy Dynamics installer, ensure your resource is in the prometheus namespace, then follow these instructions in the Privacy Dynamics Installer.
- Click "Config" on the top menu bar to edit deployment values
- In the "Auxiliary Settings" section, check the box labeled "Specify custom Alertmanager alert settings"
- Give the name of the AlertmanagerConfig resource in the text box that appears
- Click "Save config"
- On the next page, click "Deploy" to deploy your changes
If Prometheus was installed manually with the Helm chart (see instructions), add the following values to the Helm chart.
alertmanager:
  alertmanagerSpec:
    alertmanagerConfiguration:
      name: global-alertmanagerconfig # update with resource name