Prometheus Integration Guide

Prometheus is an open-source systems monitoring and alerting toolkit. It features a multi-dimensional data model and a flexible query language that leverages this dimensionality, and it has no reliance on distributed storage. Time series are collected via a pull model over HTTP, with pushing supported through an intermediary gateway. Targets are discovered via service discovery or static configuration, and multiple modes of graphing and dashboarding are supported.

Important note for Prometheus Alertmanager v0.11 and later: Alertmanager now supports Events API v2. However, if you set the routing_key property to use v2, the PagerDuty integration that the routing_key value belongs to must also have the Events API v2 integration type. If you select Prometheus as the integration type in PagerDuty, you must use Events API v1 and set the service_key property instead.
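
The two options in the note above can be sketched as follows. This is a minimal illustration rather than a complete configuration: the receiver names and key placeholders are hypothetical, and you would normally keep only the receiver that matches the integration type you chose in PagerDuty.

```yaml
receivers:
# Option A: the PagerDuty integration type is "Prometheus" (Events API v1),
# so the Integration Key goes in service_key.
- name: YOUR-V1-RECEIVER-NAME            # hypothetical name
  pagerduty_configs:
  - service_key: YOUR-INTEGRATION-KEY

# Option B: the PagerDuty integration type is "Events API v2",
# so the Integration Key goes in routing_key.
- name: YOUR-V2-RECEIVER-NAME            # hypothetical name
  pagerduty_configs:
  - routing_key: YOUR-EVENTS-V2-INTEGRATION-KEY
```

Set only one of service_key or routing_key in a given pagerduty_configs entry, since each key selects a different Events API version.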

In PagerDuty

  1. Go to the Configuration menu and select Services.

  2. On the Services page:

    • If you are creating a new service for your integration, click Add New Service.

    • If you are adding your integration to an existing service, click the name of the service you want to add the integration to. Then click the Integrations tab and click the New Integration button.

  3. Select Prometheus from the Integration Type menu (or Events API v2 if you plan to use routing_key, as noted above) and enter an Integration Name.

    If you are creating a new service for your integration, in General Settings, enter a Name for your new service. Then, in Incident Settings, specify the Escalation Policy, Notification Urgency, and Incident Behavior for your new service.

  4. Click the Add Service or Add Integration button to save your new integration. You will be redirected to the Integrations page for your service.

  5. Copy the Integration Key for your new integration.

On Your Prometheus Server

  1. Install the Prometheus Alertmanager if you don’t have it installed already. The Alertmanager is required for this integration, as it handles routing alerts from Prometheus to PagerDuty.

  2. Create an Alertmanager configuration file if you don’t have one already. You can find an example configuration file on GitHub.

  3. Create a receiver for PagerDuty in your configuration file. Give the receiver a name, such as the name of the team that will handle its incidents or the name of your integration in PagerDuty. Paste the Integration Key you copied earlier into the service_key field, then save your configuration file.

    receivers:
    - name: YOUR-RECEIVER-NAME
      pagerduty_configs:
      - service_key: YOUR-INTEGRATION-KEY
    
  4. You can configure the default route in Alertmanager to send all alerts that don’t match any custom routes to your new PagerDuty receiver. Here’s an example showing how you would configure the default route:

    route:
     group_by: [cluster]
     receiver: YOUR-RECEIVER-NAME
    
  5. You can also configure custom routes to send alerts to different receivers. For example, if you only want alerts with a severity of warning to be sent to PagerDuty, you would point the default route at a different receiver and add a child route for warnings like this:

     routes:
      - match:
          severity: 'warning'
        receiver: YOUR-RECEIVER-NAME
    
  6. Thanks to the Prometheus Alertmanager’s powerful routes and receiver configuration options, you can configure multiple receivers with different PagerDuty integration keys, and different routes to send specific types of alerts to different receivers.

    Here’s an example configuration that sets up a route to capture alerts for a database service and send them to a receiver linked to a PagerDuty service that directly notifies your DBAs, while all other alerts are directed to a default receiver with a different PagerDuty integration key:

    route:
     group_by: [cluster]
     receiver: DEFAULT-RECEIVER
     group_interval: 5m
     routes:
      - match:
          service: database
        receiver: DATABASE-RECEIVER
    
    receivers:
    - name: DEFAULT-RECEIVER
      pagerduty_configs:
      - service_key: PRIMARY-INTEGRATION-KEY
    
    - name: DATABASE-RECEIVER
      pagerduty_configs:
      - service_key: DATABASE-INTEGRATION-KEY
    
  7. Start the Alertmanager, or restart it if it was already running so that your configuration changes take effect.

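  8. Congratulations! Prometheus will now be able to trigger and resolve incidents in PagerDuty. You can verify this by triggering a test incident with the following curl command (on Alertmanager releases that have removed the v1 API, post the same payload to /api/v2/alerts with a Content-Type: application/json header instead):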

    curl -d '[{"labels": {"Alertname": "PagerDuty Test"}}]' http://localhost:9093/api/v1/alerts
    

FAQ

Will PagerDuty incidents be resolved when an alert is resolved in Prometheus?

Yes, as long as the send_resolved configuration option is not set to false. The default value is true, so there’s no need to specify send_resolved: true to have PagerDuty incidents be resolved automatically.

Also note that, according to the Prometheus team, resolve notifications may take up to the next group_interval to be sent, and only a “best effort” is made to deliver them to PagerDuty.
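
As a rough sketch of where these two settings live, assuming the receiver and route names used earlier in this guide: send_resolved is set per PagerDuty receiver entry, while group_interval is set on the route. The values shown are illustrative only.

```yaml
route:
  receiver: YOUR-RECEIVER-NAME
  group_by: [cluster]
  # Resolve notifications may wait up to this long before being sent.
  group_interval: 5m

receivers:
- name: YOUR-RECEIVER-NAME
  pagerduty_configs:
  - service_key: YOUR-INTEGRATION-KEY
    # Already the default; shown only to make the behavior explicit.
    # Set to false if incidents should stay open until resolved manually.
    send_resolved: true
```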

I only get one notification for multiple different Prometheus alerts; how do I fix this?

Try adjusting the match and group_by options for your PagerDuty route. The deduplication key (a.k.a. incident key), which is used to determine whether alerting events concern a unique issue, is generated based on these options. If a series of alerts has the same values for the labels listed in group_by, they will share the same deduplication key and will therefore be merged into the earliest existing open alert/incident rather than triggering new ones.
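
For example, one possible adjustment is to add alertname to group_by, so that alerts with different names in the same cluster fall into separate groups and therefore open separate incidents. This is a sketch that assumes your alerts carry cluster and alertname labels; pick whichever labels distinguish the issues you want paged separately.

```yaml
route:
  receiver: YOUR-RECEIVER-NAME
  # Before: group_by: [cluster]
  #   every alert in a cluster shared one group and one PagerDuty incident.
  # After: alerts are grouped per cluster AND per alert name.
  group_by: [cluster, alertname]
```

Adding more labels to group_by produces more, smaller groups. Removing labels merges more alerts into a single incident.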