Alertmanager

Alertmanager is a tool that manages alerts from Prometheus, ensuring important notifications reach the right people. It helps organize and send alerts to channels like email or Slack. This guide will demonstrate how it can be set up with Matrix.

Prerequisites

  • A running instance of Ubuntu 22.04 LTS for setting up Alertmanager
  • A running instance of Prometheus. See here for more details.
  • Root or sudo privileges
  • Basic knowledge of the command line and Linux system administration
  • A (dummy) Matrix account for a bot user

Installation

Complete the following steps on a newly created instance of Ubuntu 22.04 LTS.

Step 1: Update the System

Before installing any new software, it's advisable to update your system with the latest packages. This ensures you have the most recent security patches and software updates.

sudo apt update && sudo apt upgrade -y

Step 2: Create an Alertmanager User

For security reasons, Alertmanager should not run as the root user. Create a dedicated user and group for Alertmanager so that its files and process are restricted to that account.

sudo groupadd -f alertmanager
sudo useradd -g alertmanager --no-create-home --shell /bin/false alertmanager

Step 3: Create Alertmanager Directories

Create the configuration and data directories for Alertmanager and set their ownership so that the service can run under the alertmanager user.

sudo mkdir -p /etc/alertmanager/templates
sudo mkdir /var/lib/alertmanager
sudo chown alertmanager:alertmanager /etc/alertmanager
sudo chown alertmanager:alertmanager /var/lib/alertmanager

Step 4: Download and Configure Alertmanager

Download the latest version of Alertmanager from the official website. Replace the URL with the latest version if necessary. The example below uses version 0.27.0.

pushd /tmp
wget https://github.com/prometheus/alertmanager/releases/download/v0.27.0/alertmanager-0.27.0.linux-amd64.tar.gz

Make sure to verify the integrity of the downloaded file using the sha256sum command. You can find the checksums on the download page.

sha256sum alertmanager-0.27.0.linux-amd64.tar.gz
tar xvf alertmanager-0.27.0.linux-amd64.tar.gz
cd alertmanager-0.27.0.linux-amd64
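Printing the hash still leaves the comparison to the reader. As a minimal sketch, the helper below (the name verify_download is illustrative) checks an archive against a checksum list in the usual sha256sum format. Prometheus projects normally attach such a list, sha256sums.txt, to each release; that it is available for this release is an assumption worth confirming on the download page.

```shell
# verify_download ARCHIVE SUMS_FILE
# Succeeds only if SUMS_FILE lists ARCHIVE's file name with a matching SHA-256 hash.
verify_download() {
    archive_dir=$(dirname "$1")
    archive_name=$(basename "$1")
    # Keep only the checksum line for this archive, then let sha256sum
    # recompute the hash and compare it (--status: report via exit code only).
    grep -F "  $archive_name" "$2" |
        (cd "$archive_dir" && sha256sum --check --status -)
}
```

With both files in /tmp, `verify_download /tmp/alertmanager-0.27.0.linux-amd64.tar.gz /tmp/sha256sums.txt && echo "checksum OK"` prints the message only when the hashes match.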

Copy the alertmanager and amtool binaries to the /usr/bin directory and copy the alertmanager.yml configuration file to the /etc/alertmanager directory, then change the owner and group of all three files to alertmanager.

sudo cp alertmanager /usr/bin/
sudo cp amtool /usr/bin/
sudo chown alertmanager:alertmanager /usr/bin/alertmanager
sudo chown alertmanager:alertmanager /usr/bin/amtool
sudo cp alertmanager.yml /etc/alertmanager/alertmanager.yml
sudo chown alertmanager:alertmanager /etc/alertmanager/alertmanager.yml
popd

Step 5: Create an Alertmanager Service

Create a systemd service file to manage the Alertmanager service. This file ensures Alertmanager starts automatically on boot and can be controlled using the systemctl command.

sudo nano /etc/systemd/system/alertmanager.service

Add the following content to the file:

[Unit]
Description=AlertManager
Wants=network-online.target
After=network-online.target

[Service]
User=alertmanager
Group=alertmanager
Type=simple
ExecStart=/usr/bin/alertmanager \
    --config.file /etc/alertmanager/alertmanager.yml \
    --storage.path /var/lib/alertmanager/

[Install]
WantedBy=multi-user.target

Reload the systemd daemon to apply the changes and start the Alertmanager service. Enable the service to start on boot.

sudo systemctl daemon-reload
sudo systemctl start alertmanager
sudo systemctl enable alertmanager

Step 6: Verify Alertmanager Installation

Ensure that Alertmanager is running correctly by checking the status of the service.

sudo systemctl status alertmanager

You should also be able to access the Alertmanager web interface at http://localhost:9093 locally or http://<your-ip>:9093 from a remote machine.
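For a check that does not need a browser, Alertmanager also exposes a /-/healthy endpoint. The sketch below wraps it in a small helper; the function name and the default base URL (taken from the address above) are illustrative.

```shell
# alertmanager_healthy [BASE_URL]
# Succeeds if the Alertmanager health endpoint answers with a 2xx status.
alertmanager_healthy() {
    base_url="${1:-http://localhost:9093}"
    # -f turns HTTP errors into a non-zero exit code; --max-time bounds the wait.
    curl -fsS --max-time 5 "$base_url/-/healthy" >/dev/null
}
```

For example, `alertmanager_healthy && echo "Alertmanager is up"` checks the local instance, and `alertmanager_healthy http://<your-ip>:9093` checks it from a remote machine.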

Prometheus

Prometheus rules are crucial for triggering alerts. These rules define the conditions under which Prometheus will send an alert to the Alertmanager. The Alertmanager then routes these alerts to the appropriate channels. More details will be provided later.

You can define multiple rules in YAML files according to your alerting needs. For demonstration purposes, we will create a rule that triggers an alert when an instance is not reachable.

Instance Down Alert

On your existing Prometheus instance, create a .yml file for this rule:

sudo nano /etc/prometheus/instance_down_rules.yml

Insert the following into the newly created file. Prometheus expects rule groups to sit under a top-level groups key:

groups:
  - name: alert.rules
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5s
        labels:
          severity: critical
          webhook_url: 'test-alertmanager'
        annotations:
          summary: "Instance down"
          description: "Instance with Node ID: {{ $labels.instance }} has been down for more than 5 seconds."

This is a basic example to demonstrate the creation of alert rules. You can customize the rules further to suit your monitoring needs. For more information, see the Prometheus alerting rules documentation.

Update Prometheus Configuration

Once the rule has been created, it needs to be added to the Prometheus configuration, along with the details of the Alertmanager.

Open the Prometheus configuration file

sudo nano /etc/prometheus/prometheus.yml

Add alerting and rule_files as top-level keys:

alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - <your-ip>:9093    # or localhost:9093 

rule_files:
  - "instance_down_rules.yml"

See the prometheus-alerts.example.yml file for how the prometheus.yml should look.
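Before restarting Prometheus, it can be worth validating the edited file. The sketch below assumes promtool, which ships in the Prometheus release archive, is on your PATH; the function name is illustrative. The check config subcommand also loads the files listed under rule_files, so a malformed rule file shows up here as well.

```shell
# validate_prometheus_config [CONFIG_FILE]
# Runs promtool's syntax check on the config and the rule files it references.
validate_prometheus_config() {
    promtool check config "${1:-/etc/prometheus/prometheus.yml}"
}
```

Calling `validate_prometheus_config` with no argument checks /etc/prometheus/prometheus.yml; a non-zero exit code means the file should be fixed before the restart.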

Restart Prometheus Service

After modifying the configurations, restart the Prometheus service using the following command.

sudo systemctl restart prometheus.service

Check the status of the service to ensure there are no errors in the above configurations using the following command.

sudo systemctl status prometheus.service
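A running service does not by itself show that Prometheus found the Alertmanager. As a rough sketch, the helper below queries the Prometheus HTTP API for the Alertmanager instances it currently knows about. The function name is illustrative, and the default address assumes Prometheus listens on its standard port 9090.

```shell
# prometheus_alertmanagers [BASE_URL]
# Prints the JSON list of Alertmanager instances Prometheus has discovered.
prometheus_alertmanagers() {
    base_url="${1:-http://localhost:9090}"
    curl -fsS --max-time 5 "$base_url/api/v1/alertmanagers"
}
```

The configured target should appear under activeAlertmanagers in the response. Querying /api/v1/rules in the same way lists the loaded rule groups, including InstanceDown.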

Matrix Bot Setup

Step 1: Retrieve Matrix (bot) account information

If you haven’t already, create a new user account using the Element client to be the bot on your Matrix home server. Note the bot user's access token, user ID, and home server URL using the instructions below.

  • User ID: Settings > General > Username (e.g. @botusername:matrix.org)

  • Homeserver URL: Settings > General > Help & About > Advanced > Homeserver

  • Access Token: A long-lived access token is needed so that your Matrix bot can send notifications in the background.

    Run the following command in your terminal to get the access token. Replace the homeserver URL (https://matrix-client.matrix.org in this example) with your own, and fill in <your-bot-user-id> and <your-bot-account-password>.

    curl -X POST "https://matrix-client.matrix.org/_matrix/client/v3/login" -H "Content-Type: application/json" -d '{"type":"m.login.password","identifier":{"type":"m.id.user","user":"<your-bot-user-id>"},"password":"<your-bot-account-password>"}'

    The access token appears in the output in this format: syt_abced...
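The login response is a JSON object, and the token is its access_token field. The following sketch pulls that field out of the response with standard text tools; the function name is illustrative, and the pattern assumes the token contains no escaped quotes. If jq is installed, `jq -r .access_token` is a more robust alternative.

```shell
# extract_access_token
# Reads a Matrix login response on stdin and prints the access_token value.
extract_access_token() {
    # Isolate the "access_token":"..." pair, then keep only the quoted value.
    grep -oE '"access_token"[[:space:]]*:[[:space:]]*"[^"]*"' |
        sed -E 's/.*"([^"]*)"$/\1/'
}
```

Piping the curl command above into `extract_access_token` prints just the token, which is convenient when pasting it into a configuration file.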

Step 2: Retrieve Matrix room information

From your own user account, create a room and invite the newly created bot to it using its user ID. Take note of the room ID: Room options > Settings > Advanced > Internal room ID

Matrix Receiver

Receivers are the destinations for alerts after they have been processed by Alertmanager. To send alerts via the Matrix protocol, a Matrix receiver is required. We will use the matrix-alertmanager-receiver by metio. Alternatively, you can develop your own receiver.

Step 1: Create a dedicated user and directory for the receiver

Create the matrix-alertmanager-receiver user that will run this service, as well as the /etc/matrix-alertmanager-receiver directory to store the necessary configuration files.

sudo useradd --no-create-home --shell /bin/false matrix-alertmanager-receiver
sudo mkdir /etc/matrix-alertmanager-receiver

Step 2: Clone the repository

On the instance where you previously downloaded and configured Alertmanager, clone the matrix-alertmanager-receiver repository and navigate to its directory.

pushd /tmp
git clone https://github.com/metio/matrix-alertmanager-receiver.git
cd matrix-alertmanager-receiver

Step 3: Configure the receiver

Copy the config.sample.yaml file to a config.yaml file

cp config.sample.yaml config.yaml
sudo nano config.yaml

Enter the details collected from Element into the configuration file. See matrix-rx-config.example.yml for a minimal example.

For full details on each key in the config.yaml file, please refer to the repository’s documentation.

Step 4: Build the Matrix Alertmanager receiver

In the matrix-alertmanager-receiver directory, execute the following command.

CGO_ENABLED=0 go build -o matrix-alertmanager-receiver

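The build needs Go 1.21 or later, so install Go first if it is missing. Check the result with go version, because the golang-go package in the default Ubuntu 22.04 repositories may be older than 1.21.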

sudo apt install golang-go
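Since the build depends on the Go version, a quick comparison can save a failed build. The helper below is a small sketch (its name is illustrative) that takes a minimum version and the text printed by go version, and relies on GNU sort -V for the version ordering.

```shell
# go_at_least REQUIRED "GO_VERSION_OUTPUT"
# Succeeds if the version in the `go version` output is REQUIRED or newer.
go_at_least() {
    required="$1"
    # Pull "1.21.5" out of a line such as "go version go1.21.5 linux/amd64".
    actual=$(printf '%s\n' "$2" | sed -n 's/^go version go\([0-9][0-9.]*\).*/\1/p')
    [ -n "$actual" ] || return 1
    # If the required version sorts first (or ties), the installed one is new enough.
    lowest=$(printf '%s\n%s\n' "$required" "$actual" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}
```

For example, `go_at_least 1.21 "$(go version)" || echo "Go is too old for this build"` prints a warning when the installed toolchain is older than 1.21.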

Step 5: Create and configure the Matrix Alertmanager receiver service

Move the matrix-alertmanager-receiver executable to the /usr/local/bin directory and the config.yaml configuration file to the /etc/matrix-alertmanager-receiver directory, then change the owner and group of both to matrix-alertmanager-receiver.

sudo mv /tmp/matrix-alertmanager-receiver/matrix-alertmanager-receiver /usr/local/bin/
sudo mv /tmp/matrix-alertmanager-receiver/config.yaml /etc/matrix-alertmanager-receiver/config.yaml
sudo chown matrix-alertmanager-receiver:matrix-alertmanager-receiver /usr/local/bin/matrix-alertmanager-receiver
sudo chown matrix-alertmanager-receiver:matrix-alertmanager-receiver /etc/matrix-alertmanager-receiver/config.yaml
popd

Create a systemd service file to manage the matrix-alertmanager-receiver service. This file ensures matrix-alertmanager-receiver starts automatically on boot and can be controlled using the systemctl command.

sudo nano /etc/systemd/system/matrix-alertmanager-receiver.service

Add the following content to the file:

[Unit]
Description=Matrix Alertmanager Receiver Service
After=network.target

[Service]
User=matrix-alertmanager-receiver
Group=matrix-alertmanager-receiver
Type=simple
ExecStart=/usr/local/bin/matrix-alertmanager-receiver --config-path /etc/matrix-alertmanager-receiver/config.yaml

[Install]
WantedBy=multi-user.target

Reload the systemd daemon to apply the changes and start the matrix-alertmanager-receiver service. Enable the service to start on boot.

sudo systemctl daemon-reload
sudo systemctl enable matrix-alertmanager-receiver
sudo systemctl start matrix-alertmanager-receiver

Step 6: Verify Matrix Alertmanager receiver setup

Ensure that matrix-alertmanager-receiver is running correctly by checking the status of the service.

sudo systemctl status matrix-alertmanager-receiver

Configure Alertmanager Routes and Receivers

Open the alertmanager.yml file

sudo nano /etc/alertmanager/alertmanager.yml

Add the routes and receivers to the file. See alertmanager.example.yml for an example.
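A syntax error in this file prevents Alertmanager from starting, so it can help to validate the file before restarting. The sketch below uses the check-config subcommand of amtool, which was installed to /usr/bin earlier; the function name is illustrative.

```shell
# validate_alertmanager_config [CONFIG_FILE]
# Runs amtool's check on the Alertmanager configuration file.
validate_alertmanager_config() {
    amtool check-config "${1:-/etc/alertmanager/alertmanager.yml}"
}
```

Running `validate_alertmanager_config` reports the receivers and templates it found, or exits with a non-zero code if the file cannot be parsed.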

Restart the alertmanager service

sudo systemctl restart alertmanager
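To exercise the whole path without taking an instance down, a test alert can be pushed straight into Alertmanager with amtool. This is only a sketch: the function name and the alert name MatrixTestAlert are hypothetical, and whether a message reaches the Matrix room depends on your routes matching the labels used here.

```shell
# send_test_alert [ALERTMANAGER_URL]
# Fires a synthetic alert so that routing and delivery can be observed.
send_test_alert() {
    # The first bare argument is the alert name; key=value pairs are labels.
    amtool alert add MatrixTestAlert severity=critical \
        --annotation=summary="Test alert from amtool" \
        --alertmanager.url="${1:-http://localhost:9093}"
}
```

After calling `send_test_alert`, the alert should show up in the Alertmanager web interface and, if a route matches, in the Matrix room shortly afterwards.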

Alertmanager should now be set up to work with Prometheus and able to send alerts via Matrix.

References