Zabbix Alertmanager

Recently, we released the Zabbix Alertmanager integration as open source; it can be downloaded from the GitHub page. In this post, we dive deeper into how the integration works.

When we first began working on this integration, we looked at a similar project made by Gmauleon. It’s a really great project that we took a lot of inspiration from, but we quickly realized that we needed to make some major changes. Our project is written in Go and released as a standalone binary, which we called zal. The integration consists of two separate commands:

  1. zal prov command, which converts Prometheus Alerting rules into Zabbix Triggers.
  2. zal send command, which listens for Alert requests from Alertmanager and sends them to Zabbix.

Alert provisioning

The zal prov command creates Zabbix Triggers from Prometheus Alerting rule definitions. It’s a simple executable binary. Much like a shell script, it runs to completion and exits with status code 0 on success; otherwise, it fails and prints an error message. This opens up several deployment options: a Cron Job that periodically checks the configuration and creates triggers, a CI Job that creates triggers whenever the alert configuration changes, or a regular bash script.
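
Because the command signals success through its exit status, any scheduler or wrapper can decide what to do next by checking that status. Here is a minimal sketch of such a wrapper; the function name run_provisioning is our own illustration and not part of zal:

```python
import subprocess
import sys


def run_provisioning(command):
    """Run a provisioning command and return True if it exited with status 0."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        # On failure, surface whatever the command printed to stderr.
        message = result.stderr.strip()
        print(f"provisioning failed ({result.returncode}): {message}", file=sys.stderr)
        return False
    return True
```

A cron-driven script could, for example, call run_provisioning(["zal", "prov", "--config-path=zal-config.yaml"]) and raise an alarm when it returns False.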

For our customers, we recommend running Alert provisioning as part of a GitLab Continuous Integration job. We store all Prometheus alerting configuration in one Git repository. Developers propose a change to the alerting rules via a Pull Request, and a CI job runs the promtool check rules command to validate the configuration. Once the Pull Request is merged, the alerting rules are automatically provisioned into Zabbix. Here is an example of .gitlab-ci.yml:

stages:
  - check
  - provision

check-alerts:
  stage: check
  image:
    name: prom/prometheus:v2.8.0
    entrypoint: ["/bin/sh", "-c"]
  script:
    - promtool check rules /.yml

provision-rules:
  stage: provision
  image:
    name: devopyio/zabbix-alertmanager:v1.1.1
    entrypoint: ["/bin/sh", "-c"]
  script:
    - zal prov --log.level=info --config-path=zal-config.yaml
      --url=http://ZABBIX_URL/api_jsonrpc.php
      --prometheus-url=http://PROMETHEUS_URL
  only:
    - master

Getting started

In order to run zal prov, you will need to set up a Zabbix user. This user must have access to the Zabbix API, and it requires elevated permissions to update Hosts and to create Host Items, Triggers and Zabbix Applications. You can read more about the Zabbix API and user permissions in the Zabbix API manual.

When you first try it out, we suggest that you manually create a Host Group along with some empty Hosts. After that, create a Zabbix user, allow this user to access that Host Group, and enable Zabbix API access. Be sure to make note of your configuration, as you will need to provide these values to zal prov via the --user, --password and --url flags or the ZABBIX_USER, ZABBIX_PASSWORD and ZABBIX_URL environment variables. We recommend setting the user credentials via environment variables, in order to keep them secret.
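
If you go the environment variable route, a small pre-flight check can catch a missing value before zal prov runs. This is a sketch of our own, not something shipped with zal; only the three variable names come from the text above:

```python
import os

# Variable names as documented above.
REQUIRED_VARS = ("ZABBIX_USER", "ZABBIX_PASSWORD", "ZABBIX_URL")


def missing_zabbix_settings(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_zabbix_settings() with no arguments inspects the current process environment; an empty list means all three values are present.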

Configuring Hosts

After empty hosts are created and a user is set up, you need to specify Host configurations. It’s a simple YAML configuration file. Take a look at this example:

- name: infrahost
  hostGroups: INFRA
  tag: prometheus
  deploymentStatus: 0
  itemDefaultApplication: prometheus
  itemDefaultHistory: 5d
  itemDefaultTrends: 5d
  itemDefaultTrapperHosts: 0.0.0.0/0
  triggerTags:
    INFRA: ""
  alertsDir: ./infra

- name: webhost
  hostGroups: WEBTEAM
  tag: prometheus
  deploymentStatus: 0
  itemDefaultApplication: prometheus
  itemDefaultHistory: 5d
  itemDefaultTrends: 5d
  itemDefaultTrapperHosts: 0.0.0.0/0
  triggerTags:
    WEBT: ""
  alertsDir: ./web

In this example, we create two Zabbix hosts. One is named infrahost and placed in the host group INFRA. Additionally, we add a prometheus tag to this host and store only 5 days’ worth of history and trends. This configuration will be shown in the Zabbix web UI. For infrahost, we provision Zabbix triggers from the Prometheus alerts in the ./infra directory. Lastly, we add the INFRA tag to those triggers.

Similarly, we do the same for the host named webhost in the host group WEBTEAM, and provision alerts from the ./web directory. Multiple hosts and multiple alert directories allow us to separate teams as well as their alerts. In this case, we have an Infrastructure team, which will see its alerts in the infrahost host, and a Web developer team, which will get its alerts in the webhost Zabbix host.

Alerting configuration

In alertsDir we expect Prometheus Alerting rules to be saved in files ending with .yml or .yaml extensions. There are some special rules when creating Alerts in Zabbix, which provide a more native Zabbix experience for Prometheus Alerts.

Here are the rules:

  1. If the Prometheus URL is configured, we set up a Trigger URL that links to the Prometheus Query (only if the URL is shorter than 255 symbols, as Zabbix doesn’t support longer URLs).
  2. The Trigger’s Comment field is set from the Alert’s summary, message or description annotation.
  3. The Trigger’s Severity is configured via the severity label, which can take one of the values information, warning, average, high or critical.
  4. If the Alerting rule has the special zabbix_trigger_nodata annotation, we set up a special Zabbix nodata trigger expression. The annotation’s value must be a number, which is the evaluation period in seconds.
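
To make these rules concrete, here is a rough sketch of how one alerting rule could be translated into trigger fields. It is an illustration under several assumptions, not the code zal runs: the function and field names are hypothetical, the annotation precedence (summary, then message, then description) is a guess, the /graph?g0.expr= link is the usual Prometheus expression browser URL, and mapping critical to Zabbix priority 5 is our assumption.

```python
from urllib.parse import quote

# Zabbix trigger priorities run from 0 (not classified) to 5 (disaster).
# Mapping "critical" to 5 is an assumption made for this sketch.
SEVERITY_TO_PRIORITY = {
    "information": 1,
    "warning": 2,
    "average": 3,
    "high": 4,
    "critical": 5,
}
MAX_URL_LENGTH = 255


def trigger_fields(rule, prometheus_url=None):
    """Translate one alerting rule (a dict) into hypothetical trigger fields."""
    labels = rule.get("labels", {})
    annotations = rule.get("annotations", {})
    # Rule 2: take the comment from the first annotation that is present.
    comments = next(
        (annotations[key] for key in ("summary", "message", "description")
         if key in annotations),
        "",
    )
    fields = {
        "name": rule["alert"],
        # Rule 3: unknown or missing severities fall back to 0 in this sketch.
        "priority": SEVERITY_TO_PRIORITY.get(labels.get("severity"), 0),
        "comments": comments,
        "url": "",
        "nodata_seconds": None,
    }
    # Rule 1: link to the query only when the URL is shorter than 255 symbols.
    if prometheus_url:
        base = prometheus_url.rstrip("/")
        url = f"{base}/graph?g0.expr={quote(rule['expr'], safe='')}"
        if len(url) < MAX_URL_LENGTH:
            fields["url"] = url
    # Rule 4: the annotation value is the evaluation period in seconds.
    if "zabbix_trigger_nodata" in annotations:
        fields["nodata_seconds"] = int(annotations["zabbix_trigger_nodata"])
    return fields
```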

Pass the configuration file to zal prov via the --config-path flag. To make triggers link to Prometheus, add the --prometheus-url flag. You can get more information by executing the zal --help and zal prov --help commands.
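
As noted at the start of this section, rule files are recognised by their .yml or .yaml extension. If you want to check which files in an alertsDir would qualify before provisioning, a sketch like the following can help. It assumes the directory is scanned without descending into subdirectories, which may differ from what zal actually does:

```python
from pathlib import Path


def find_rule_files(alerts_dir):
    """Return files directly inside alerts_dir ending in .yml or .yaml, sorted."""
    directory = Path(alerts_dir)
    return sorted(
        path for path in directory.iterdir()
        if path.is_file() and path.suffix in (".yml", ".yaml")
    )
```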

Alert sending

Once Alert provisioning has completed successfully, you can start sending alerts to Zabbix. The zal send command listens for alerts from Alertmanager via a webhook receiver and sends them to Zabbix via the Zabbix Sender Protocol. You can read more about the protocol and how it works in the Zabbix Trapper items section.

In order to run zal send, you will need to set --zabbix-addr to point to the Zabbix server trapper port; by default, the Zabbix server listens on port 10051. You then need to configure --addr, which is the address on which zal send listens for Alertmanager’s Webhook requests (the default is 0.0.0.0:9095). You will also need to provide --hosts-path, which points to the zal send hosts configuration file.
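
For readers curious about what travels to the trapper port, here is a small sketch of a Zabbix Sender Protocol packet as we understand it: the ZBXD signature, a flags byte of 0x01, the payload length as an 8-byte little-endian integer, and a JSON body. The function name is ours, and the item key in the usage note is hypothetical; we make no claim that zal builds its packets exactly this way.

```python
import json
import struct


def build_sender_packet(host, key, value):
    """Build one Zabbix sender packet carrying a single item value."""
    payload = json.dumps({
        "request": "sender data",
        "data": [{"host": host, "key": key, "value": str(value)}],
    }).encode("utf-8")
    # Header: "ZBXD", flags byte 0x01, then the payload length (little-endian).
    return b"ZBXD\x01" + struct.pack("<Q", len(payload)) + payload
```

For example, build_sender_packet("infrahost", "prometheus.highload", 1) produces bytes that could be written to a TCP connection to the address given in --zabbix-addr.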

The hosts configuration file is used to route alerts to the correct Zabbix hosts. Let’s say you have two hosts: infrahost for infrastructure alerts and webhost for web developer alerts. This would give you two mappings:

# Receiver name to Zabbix host mapping
infra: infrahost
web: webhost

The first part of each mapping is the name of an Alertmanager receiver. In this example, alerts coming from Alertmanager’s infra receiver go to the infrahost Zabbix host, and the web receiver’s alerts go to the webhost host. This configuration needs to be in line with Alertmanager’s configuration. Let’s take a look at this example Alertmanager configuration:

global:

route:
  group_by: ['alertname', 'team']
  group_wait: 30s
  group_interval: 2m
  repeat_interval: 3m
  receiver: infra
  routes:
    - receiver: web
      match_re:
        team: web
    - receiver: infra
      match_re:
        team: infra

receivers:
  - name: 'infra'
    webhook_configs:
      - url: http://ZAL_SENDER_ADDR/alerts
        send_resolved: true
  - name: 'web'
    webhook_configs:
      - url: http://ZAL_SENDER_ADDR/alerts
        send_resolved: true

This configuration routes alerts with the label team: web to the web receiver, and alerts with the label team: infra to the infra receiver. Note that receivers must be configured via webhook_configs and that each team must have a separate receiver configuration. In this example, we have one receiver for the Web developer team and one receiver for the Infrastructure team. ZAL_SENDER_ADDR is the address of zal send, which we configured via the --addr flag.

If we fail to correctly route an alert (for example, because the alert doesn’t have a team label), it will end up in the infra receiver. If we forget to add a mapping to the hosts configuration file, the alert will go to the host specified by the --default-host flag.
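
The lookup described in this section can be sketched in a few lines. Alertmanager’s webhook body carries a receiver field and a list of alerts, each with a status and labels; the function names and the fallback host name below are hypothetical and only illustrate the receiver-to-host mapping with a default:

```python
import json


def resolve_host(receiver, hosts, default_host):
    """Map a receiver name to a Zabbix host, falling back to the default host."""
    return hosts.get(receiver, default_host)


def route_webhook(body, hosts, default_host):
    """Return (zabbix_host, alertname, status) for each alert in a webhook body."""
    message = json.loads(body)
    host = resolve_host(message["receiver"], hosts, default_host)
    return [
        (host, alert["labels"].get("alertname", ""), alert["status"])
        for alert in message["alerts"]
    ]
```

With the mapping {"infra": "infrahost", "web": "webhost"}, a request from the web receiver resolves to webhost, while a receiver that is missing from the mapping resolves to whatever default host was passed in.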

Conclusion

These are the main things you need to know in order to successfully run your Zabbix Alertmanager integration. In the next post, we will take a look at deployment considerations for zal send, and see how we can ensure the whole system runs reliably.

Need help integrating Prometheus with Zabbix? Contact us.