How To Integrate Nagios with PagerDuty

Nagios is one of the leading providers of open source and enterprise-grade IT infrastructure monitoring tools. Used by hundreds of thousands of users worldwide, Nagios allows its users to monitor their entire IT infrastructure, spot problems before they occur, detect security breaches and plan/budget for IT upgrades.

PagerDuty allows Nagios users to add PagerDuty’s on-call scheduling, alerts and incident tracking to their existing monitoring solution through the PagerDuty API.

By integrating PagerDuty into your existing Nagios monitoring solution, you can have Nagios alerts go directly to the right person who can solve the issue.

The guide below describes how to integrate your Nagios 2, 3 or 4 installation with PagerDuty using our easy-to-install agent. Note that you must be logged in as root to complete the installation. You might need to alter these instructions slightly depending on your exact Linux distribution, Nagios configuration and Nagios version. If you are having trouble completing the installation, please contact us.

Note:  If you are using CentOS 5, follow this guide.

Getting Started

If you don’t already have a PagerDuty “Nagios” service, you should create one:

  1. In your account, under the Services tab, click Add New Service.
  2. Enter a Service Name and choose an Escalation Policy. Start typing “Nagios” under Integration Type to filter your choices.
  3. Click the Add Service button.
  4. Once the service is created, you’ll be taken to the service page. On this page, you’ll see the Service API key, which will be needed when you configure your Nagios server to send events to PagerDuty.

Setup for Debian, Ubuntu, and other Debian-derived systems:

  1. Install the PagerDuty agent as described here: http://www.pagerduty.com/docs/guides/agent-install-guide/. Note: If you are not on a supported OS or do not want to use the PagerDuty agent, you can use our Perl script integration: http://www.pagerduty.com/docs/guides/nagios-perl-integration-guide/
  2. Download pagerduty_nagios.cfg from GitHub:
     wget https://raw.githubusercontent.com/PagerDuty/pdagent-integrations/master/pagerduty_nagios.cfg
  3. Open the file in your favorite editor.
  4. Enter the service key corresponding to your Nagios service into the pager field. The service key is a 32-character string that can be found on the service’s detail page (step 4 of Getting Started above).
  5. Move the Nagios configuration file into place:
    mv pagerduty_nagios.cfg /etc/nagios3/conf.d
  6. Add the contact “pagerduty” to your Nagios configuration’s main contact group. If you’re using the default configuration, open /etc/nagios3/conf.d/contacts_nagios2.cfg and look for the “admins” contact group. Then, simply add the “pagerduty” contact.
    define contactgroup{
         contactgroup_name admins
         alias Nagios Administrators
         members root,pagerduty ; Add pagerduty here
    }
  7. Restart Nagios.
    /etc/init.d/nagios3 restart
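
Steps 3 and 4 above can also be scripted instead of done in an editor. The following is a minimal sketch, not part of the PagerDuty tooling: set_pager_key is a hypothetical helper name, and it assumes the service key consists of exactly 32 letters and digits and that the contact definition contains a single line starting with the pager directive.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the PagerDuty agent): write a service
# key into the "pager" field of a Nagios contact file.
# Assumption: the key is exactly 32 letters and digits.
set_pager_key() {
    cfg_path=$1
    service_key=$2

    # Reject keys that contain anything other than letters and digits.
    case $service_key in
        *[!A-Za-z0-9]*) valid=no ;;
        *) valid=yes ;;
    esac
    if [ "$valid" = no ] || [ "${#service_key}" -ne 32 ]; then
        echo "service key must be 32 letters and digits" >&2
        return 1
    fi

    # Rewrite the value of the pager directive, keeping its indentation.
    # Writing back with cat preserves the ownership and mode of the file.
    tmp_path=$(mktemp) || return 1
    sed -E "s/^([[:space:]]*pager[[:space:]]+).*/\\1${service_key}/" "$cfg_path" > "$tmp_path" \
        && cat "$tmp_path" > "$cfg_path"
    status=$?
    rm -f "$tmp_path"
    return $status
}
```

For example, running set_pager_key pagerduty_nagios.cfg followed by your own key, before moving the file into place in step 5, replaces whatever value currently follows pager. The length check catches a truncated copy and paste before Nagios is restarted.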

Setup for RHEL, Fedora, CentOS, and other Red Hat-derived systems:

  1. Install the PagerDuty agent as described here:  http://www.pagerduty.com/docs/guides/agent-install-guide/
  2. Download pagerduty_nagios.cfg from GitHub:
     wget https://raw.githubusercontent.com/PagerDuty/pdagent-integrations/master/pagerduty_nagios.cfg
  3. Open the file in your favorite editor.
  4. Enter the service key corresponding to your Nagios service into the pager field. The service key is a 32-character string that can be found on the service’s detail page.
  5. Move the Nagios configuration file into place:
    mv pagerduty_nagios.cfg /etc/nagios
  6. Edit the Nagios config to load the PagerDuty config. To do this, open /etc/nagios/nagios.cfg and add this line to the file:
    cfg_file=/etc/nagios/pagerduty_nagios.cfg
  7. Add the contact “pagerduty” to your Nagios configuration’s main contact group. If you’re using the default configuration, open /etc/nagios/objects/contacts.cfg and look for the “admins” contact group. Then, simply add the “pagerduty” contact.
    define contactgroup{ 
         contactgroup_name admins 
         alias Nagios Administrators 
         members nagiosadmin,pagerduty ; Add pagerduty here
    }
  8. Restart Nagios.
    service nagios restart
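
If you script step 6, it helps to make the edit idempotent so that re-running the setup does not add the cfg_file line twice. The following is a minimal sketch under that assumption; ensure_cfg_file is a hypothetical helper name, not part of Nagios or the PagerDuty agent.

```shell
#!/bin/sh
# Hypothetical helper: append a cfg_file directive to a Nagios main config
# only if that exact line is not already present.
ensure_cfg_file() {
    main_cfg=$1
    include_path=$2
    directive="cfg_file=${include_path}"

    # -x matches the whole line, -F treats the directive as a fixed string.
    if grep -qxF -e "$directive" "$main_cfg"; then
        return 0
    fi
    printf '%s\n' "$directive" >> "$main_cfg"
}
```

With the paths used in this guide, the call would be ensure_cfg_file /etc/nagios/nagios.cfg /etc/nagios/pagerduty_nagios.cfg. Before restarting, you can ask Nagios to check the resulting configuration with nagios -v /etc/nagios/nagios.cfg.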

FAQ

How do I set up Nagios to work with multiple PagerDuty services?

This is easy to do with the current integration, as a Nagios Service in PagerDuty is directly mapped to a “contact” in Nagios. By default, this contact is named “pagerduty” and defined in the pagerduty_nagios.cfg file.

To set up multiple services, duplicate the existing contact definition and rename it (e.g. pagerduty_database, pagerduty_network, etc.). Then copy and paste the corresponding API Key from PagerDuty into the “pager” field. Don’t forget to restart Nagios for the changes to take effect.
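
The duplication can be sketched as a small script. This is an illustration only: clone_pagerduty_contact is a hypothetical helper name, and it assumes the first "define contact" block in the file is the pagerduty contact and that its fields sit one per line.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of the first "define contact" block in a
# file, with a new contact_name and a new pager (service key) value.
clone_pagerduty_contact() {
    cfg_path=$1
    new_name=$2
    new_key=$3

    awk -v name="$new_name" -v key="$new_key" '
        # Start copying at "define contact {" (but not "define contactgroup").
        /^[ \t]*define[ \t]+contact[ \t]*[{]/ { in_contact = 1 }
        in_contact {
            if ($1 == "contact_name") {
                print "       contact_name    " name
            } else if ($1 == "pager") {
                print "       pager           " key
            } else {
                print
            }
        }
        # Stop after the closing brace of that first block.
        in_contact && /[}]/ { exit }
    ' "$cfg_path"
}
```

Redirect the output to a new file rather than appending to the file being read, then load that file the same way as pagerduty_nagios.cfg. The new contact only receives notifications once it is a member of a contact group assigned to the relevant hosts or services.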

What if a Nagios event happens while my network is down?

If a PagerDuty server can’t be reached for any reason, events will be stored in an on-disk queue. The PagerDuty agent will attempt to re-send the events at one-minute intervals.

Since Nagios needs my external Internet connection to send failure reports to PagerDuty, how will I receive notification if our site loses external connectivity?

You should configure an external ping check service such as StatusCake or NodePing to monitor your site’s external connectivity. Of course, you can use PagerDuty to forward alerts from these services.

It doesn’t seem to be working. What’s going on?

Check your syslog to confirm that the pagerduty contact is receiving HOST or SERVICE notifications. You can grep the syslog for them. Here’s an example on an Ubuntu system (on RHEL-based systems, the file is /var/log/messages):

grep NOTIFICATION /var/log/syslog
May 28 18:20:57 ip-10-11-139-249 nagios3: SERVICE NOTIFICATION: pagerduty;localhost;Current Users;CRITICAL;notify-service-by-pagerduty;USERS CRITICAL - 3 users currently logged in

As you can see, the pagerduty contact was notified for this SERVICE NOTIFICATION.  If the pagerduty contact never shows up, that means that the pagerduty contact is not associated with notifications for the host/service in question.  If you’re using the default configuration, make sure that the pagerduty contact is a member of the admins contact group.
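
On a busy server the plain grep returns notifications for every contact. A narrower filter, sketched below, keeps only the lines addressed to one contact. The function name notifications_for_contact is hypothetical, and the pattern assumes the log line layout shown in the example above.

```shell
#!/bin/sh
# Hypothetical helper: print only the notification lines addressed to one
# Nagios contact (default: pagerduty) from a syslog-style file.
notifications_for_contact() {
    log_path=$1
    contact=${2:-pagerduty}
    # Matches "HOST NOTIFICATION: <contact>;" and "SERVICE NOTIFICATION: <contact>;".
    grep -E "(HOST|SERVICE) NOTIFICATION: ${contact};" "$log_path"
}
```

For example, notifications_for_contact /var/log/syslog prints the matching lines and, like grep, returns a non-zero exit status when the pagerduty contact was never notified.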

If the pagerduty contact is getting notified, check the agent logs at /var/log/pdagent/pdagentd.log.

Please contact us if you’re unable to sort out the difficulty.

What sort of Nagios messages does PagerDuty understand?

PagerDuty can process PROBLEM, ACKNOWLEDGEMENT, and RECOVERY messages. All other messages, including FLAPPINGSTART and FLAPPINGSTOP, are ignored. If you’d like PagerDuty to process additional Nagios messages, please let us know!
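
If you wrap the notification command in your own script, the rule can be expressed as a check on the value of the Nagios $NOTIFICATIONTYPE$ macro. The sketch below is illustrative only and is not the agent’s own code; pagerduty_handles is a hypothetical function name, and the three accepted values are PROBLEM, ACKNOWLEDGEMENT and RECOVERY.

```shell
#!/bin/sh
# Illustrative only: returns success for the notification types that
# PagerDuty processes, and failure for every other type.
pagerduty_handles() {
    case "$1" in
        PROBLEM|ACKNOWLEDGEMENT|RECOVERY) return 0 ;;
        *) return 1 ;;
    esac
}
```

A wrapper could call pagerduty_handles with the notification type and skip the event early when the check fails.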

How can I customize my Nagios alerts?

If you would like to customize your Nagios alerts, follow our guide here.

What about a 2-way ack integration between Nagios and PagerDuty?

PagerDuty customers have built a bi-directional integration package for Nagios and PagerDuty making use of PagerDuty’s generic integration API and webhooks to send notifications from Nagios to PagerDuty and to keep alert acknowledgment status in sync. Learn more about it here.