Nagios Two-Way Integration Guide

Nagios is one of the leading providers of open source and enterprise-grade IT infrastructure monitoring tools. Used by hundreds of thousands of users worldwide, Nagios allows its users to monitor their entire IT infrastructure, spot problems before they occur, detect security breaches and plan/budget for IT upgrades.

PagerDuty allows Nagios users to add PagerDuty’s on-call scheduling, alerts and incident tracking to their existing monitoring solution through the PagerDuty API.

By integrating PagerDuty into your existing Nagios monitoring solution, you can have Nagios alerts go directly to the right person who can solve the issue.

The guide below describes how to integrate your Nagios 2, 3 or 4 installation with PagerDuty using our easy-to-install agent. Note that you must be logged in as root to complete the installation. You might need to slightly alter these instructions depending on your exact Linux distribution, Nagios configuration and Nagios version. If you are having trouble completing the installation, please contact us.

Note:  If you are using CentOS 5 or are using a proxy server, follow this guide.

Getting Started

If you don’t already have a PagerDuty “Nagios” service, you should create one:

  1. In your account, under the Configuration tab, select Services from the dropdown menu.
  2. Click Add New Service.
  3. Enter a Service Name and choose an Escalation Policy. Start typing “Nagios” under Integration Type to filter your choices.
  4. Click the Add Service button.
  5. Once the service is created, you’ll be taken to the service page. On this page, you’ll see the Service API key, which will be needed when you configure your Nagios server to send events to PagerDuty.

Setup for Debian, Ubuntu, and other Debian-derived systems:

  1. Install the PagerDuty agent as described here: http://www.pagerduty.com/docs/guides/agent-install-guide/.  Note:  If you are not on a supported OS or do not want to use the PagerDuty agent, you can use our perl script integration:  http://www.pagerduty.com/docs/guides/nagios-perl-integration-guide/
  2. Download pagerduty_nagios.cfg from GitHub:
     wget https://raw.githubusercontent.com/PagerDuty/pdagent-integrations/master/pagerduty_nagios.cfg
  3. Open the file in your favorite editor.
  4. Enter the service key corresponding to your Nagios service into the pager field. The service key is a 32-character string that can be found on the service’s detail page (step 5 of Getting Started above).
  5. Move the Nagios configuration file into place:
    mv pagerduty_nagios.cfg /etc/nagios3/conf.d
  6. Add the contact “pagerduty” to your Nagios configuration’s main contact group. If you’re using the default configuration, open /etc/nagios3/conf.d/contacts_nagios2.cfg and look for the “admins” contact group. Then, simply add the “pagerduty” contact.
    define contactgroup{
         contactgroup_name admins
         alias Nagios Administrators
         members root,pagerduty ; Add pagerduty here
    }
  7. Download the pagerduty.cgi file for the two-way integration:
    wget https://raw.githubusercontent.com/mdcollins05/pd-nag-connector/master/pagerduty.cgi
  8. Edit the pagerduty.cgi file so that the command_file variable points to your Nagios command file. The path can be found with:
    grep "^command_file" /etc/nagios3/nagios.cfg

    If you don’t see any information returned, make sure the command_file variable is uncommented (doesn’t start with a “#”).

  9. Move the pagerduty.cgi file to the Nagios cgi-bin:
    mv pagerduty.cgi /usr/lib/cgi-bin/nagios3/
  10. Install the required Perl libraries to allow the script to work:
    apt-get install libwww-perl libjson-perl
  11. Make sure Nagios has external commands enabled: In your /etc/nagios3/nagios.cfg, check that variable check_external_commands equals 1 and that the variable command_check_interval is set to a reasonable value for your environment. The command_check_interval variable determines how often Nagios checks for external commands to run.
  12. Your webserver user (usually www-data for Apache on Ubuntu/Debian) needs to be able to write to the Nagios command file. The following commands enable this for the default command file location:
    /etc/init.d/nagios3 stop ## This will stop your Nagios service!
    dpkg-statoverride --update --add nagios www-data 2710 /var/lib/nagios3/rw
    dpkg-statoverride --update --add nagios nagios 751 /var/lib/nagios3
    /etc/init.d/nagios3 start
  13. Head back to your PagerDuty site, click on the Services link and click on the service you’d like to add the two-way integration to.
  14. Click on the Add Webhook button below the service settings and above the incidents list.
  15. Enter the Name of the webhook and the URL, then click Save. The URL should typically look like http://user:password@<ip-or-domain>/nagios3/cgi-bin/pagerduty.cgi
    Please note: Unless you’ve disabled it, the Nagios web interface requires a username and password. We strongly suggest configuring a dedicated user, used only for the webhook, that is able to run Nagios commands.
  16. At this point, you should be all set. To test it out, you need an issue within Nagios that generates an incident. From there, acknowledging the incident in PagerDuty should add a comment in Nagios stating the incident has been “Acknowledged by PagerDuty”.
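
If the acknowledgement comment never shows up, the settings from steps 8 and 11 are a reasonable first thing to re-check. The following is a minimal sketch, not part of the PagerDuty tooling: the function names nagios_cfg_value and external_commands_ready are hypothetical helpers, and the sketch assumes plain name=value lines in nagios.cfg.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with Nagios or PagerDuty).

# Print the value of an uncommented directive from a Nagios main config file.
# If the directive appears more than once, only the last occurrence is printed.
# Usage: nagios_cfg_value <directive> <path-to-nagios.cfg>
nagios_cfg_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" |
        tail -n 1 |
        sed 's/[[:space:]]*$//'
}

# Return 0 when external commands are enabled and a command file is configured.
# Usage: external_commands_ready <path-to-nagios.cfg>
external_commands_ready() {
    cfg=$1
    [ "$(nagios_cfg_value check_external_commands "$cfg")" = "1" ] &&
        [ -n "$(nagios_cfg_value command_file "$cfg")" ]
}
```

For example, `nagios_cfg_value command_file /etc/nagios3/nagios.cfg` prints the path that the command_file variable in pagerduty.cgi should match, and `external_commands_ready /etc/nagios3/nagios.cfg || echo "check steps 8 and 11"` flags a configuration that still needs attention.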

Setup for RHEL, Fedora, CentOS, and other Red Hat-derived systems:

  1. Install the PagerDuty agent as described here:  http://www.pagerduty.com/docs/guides/agent-install-guide/
  2. Download pagerduty_nagios.cfg from GitHub:
     wget https://raw.githubusercontent.com/PagerDuty/pdagent-integrations/master/pagerduty_nagios.cfg
  3. Open the file in your favorite editor.
  4. Enter the service key corresponding to your Nagios service into the pager field. The service key is a 32-character string that can be found on the service’s detail page.
  5. Move the Nagios configuration file into place:
    mv pagerduty_nagios.cfg /etc/nagios
  6. Edit the Nagios config to load the PagerDuty config. To do this, open /etc/nagios/nagios.cfg and add this line to the file:
    cfg_file=/etc/nagios/pagerduty_nagios.cfg
  7. Add the contact “pagerduty” to your Nagios configuration’s main contact group. If you’re using the default configuration, open /etc/nagios/objects/contacts.cfg and look for the “admins” contact group. Then, simply add the “pagerduty” contact.
    define contactgroup{ 
         contactgroup_name admins 
         alias Nagios Administrators 
         members nagiosadmin,pagerduty ; Add pagerduty here
    }
  8. Restart Nagios.
    service nagios restart
  9. Download the pagerduty.cgi file for the two-way integration:
    wget https://raw.githubusercontent.com/mdcollins05/pd-nag-connector/master/pagerduty.cgi
  10. Edit the pagerduty.cgi file so that the command_file variable points to your Nagios command file. The path can be found with:
    grep "^command_file" /etc/nagios/nagios.cfg

    If you don’t see any information returned, make sure the command_file variable is uncommented (doesn’t start with a “#”).

  11. Move the pagerduty.cgi file to the Nagios cgi-bin:
    mv pagerduty.cgi /usr/lib64/nagios/cgi/
  12. Set the executable permission on the pagerduty.cgi file:
    chmod +x /usr/lib64/nagios/cgi/pagerduty.cgi
  13. Install the required Perl libraries to allow the script to work:
    yum install perl-JSON perl-libwww-perl
  14. Head back to your PagerDuty site, click on the Services link and click on the service you’d like to add the two-way integration to.
  15. Click on the Add Webhook button below the service settings and above the incidents list.
  16. Enter the Name of the webhook and the URL, then click Save. The URL should typically look like http://user:password@<ip-or-domain>/nagios/cgi-bin/pagerduty.cgi
    Please note: Unless you’ve disabled it, the Nagios web interface requires a username and password. We strongly suggest configuring a dedicated user, used only for the webhook, that is able to run Nagios commands.
  17. At this point, you should be all set. To test it out, you need an issue within Nagios that generates an incident. From there, acknowledging the incident in PagerDuty should add a comment in Nagios stating the incident has been “Acknowledged by PagerDuty”.
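
Before testing the webhook, it can help to confirm that the CGI script from steps 11 to 13 is in place, executable, and able to load its Perl dependencies. This is a rough sketch under assumptions: check_cgi_script and perl_module_available are hypothetical helper names, and the module names JSON and LWP::UserAgent are assumed to be the ones the script needs because they are what the perl-JSON and perl-libwww-perl packages provide.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with Nagios or PagerDuty).

# Report a problem with a CGI script path; print nothing and return 0 if it looks fine.
# Usage: check_cgi_script <path-to-script>
check_cgi_script() {
    if [ ! -f "$1" ]; then
        echo "missing: $1"
        return 1
    fi
    if [ ! -x "$1" ]; then
        echo "not executable: $1"
        return 1
    fi
    return 0
}

# Return 0 if the system perl can load the named module.
# Usage: perl_module_available <Module::Name>
perl_module_available() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}
```

A typical run would be `check_cgi_script /usr/lib64/nagios/cgi/pagerduty.cgi`, followed by `perl_module_available JSON || echo "JSON module missing"` and the same check for LWP::UserAgent.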

FAQ

Can I have the two-way integration without having my Nagios install be available over the internet?

One of our customers has made a tool that will periodically poll our API for any acknowledgements on a Nagios incident and will acknowledge the service or host in Nagios. The tool can be found at:
https://github.com/zoosk/pagerduty_ack_to_nagios/blob/master/pd_ack_to_nagios_ack_poller.pl
Please note that this code wasn’t created by PagerDuty, and we cannot guarantee its functionality.

How do I set up Nagios to work with multiple PagerDuty services?

This is easy to do with the current integration, as a Nagios Service in PagerDuty is directly mapped to a “contact” in Nagios. By default, this contact is named “pagerduty” and defined in the pagerduty_nagios.cfg file.

To set up multiple services, duplicate the existing contact definition and rename it (e.g. pagerduty_database, pagerduty_network, etc.). Then copy and paste the corresponding API key from PagerDuty into the “pager” field of each copy. Don’t forget to restart Nagios for the changes to take effect.
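
The duplication can also be scripted. The sketch below is one possible approach, not an official tool: clone_pagerduty_contact is a hypothetical helper that copies the “pagerduty” contact block out of a config file, renames it, and swaps in a new service key. It assumes the block starts with a “define contact” line, contains contact_name and pager fields, and ends with a line starting with “}”.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Nagios or PagerDuty).
# Print a copy of the "pagerduty" contact definition with a new name and service key.
# Usage: clone_pagerduty_contact <cfg-file> <new_contact_name> <service_key>
clone_pagerduty_contact() {
    awk -v name="$2" -v key="$3" '
        # Start buffering at each "define contact {" line (contactgroup does not match).
        /^[ \t]*define[ \t]+contact[ \t]*[{]/ { in_block = 1; block = ""; is_pd = 0 }
        in_block {
            line = $0
            # Rename the contact, and remember that this is the block we want.
            if ($1 == "contact_name" && $2 == "pagerduty") {
                is_pd = 1
                sub(/pagerduty/, name, line)
            }
            # Replace the value of the pager field with the new service key.
            if ($1 == "pager") {
                sub(/pager[ \t]+[^ \t;]+/, "pager " key, line)
            }
            block = block line "\n"
            # A line starting with "}" closes the definition.
            if ($0 ~ /^[ \t]*[}]/) {
                in_block = 0
                if (is_pd) printf "%s", block
            }
        }
    ' "$1"
}
```

For example, `clone_pagerduty_contact /etc/nagios/pagerduty_nagios.cfg pagerduty_database <service-key> > /tmp/pagerduty_database.cfg` writes the new definition to a scratch file. Review it, then append it to your PagerDuty config file and add the new contact to the relevant contact group.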

What if a Nagios event happens while my network is down?

If a PagerDuty server can’t be reached for any reason, events will be stored in an on-disk queue. The PagerDuty agent will attempt to re-send the events at one-minute intervals.

Since Nagios needs my external Internet connection to send failure reports to PagerDuty, how will I receive notification if our site loses external connectivity?

You should configure an external ping check service such as StatusCake or NodePing to monitor your site’s external connectivity. Of course, you can use PagerDuty to forward alerts from these services.

It doesn’t seem to be working. What’s going on?

Check your syslog to confirm that the pagerduty contact is receiving HOST or SERVICE NOTIFICATIONS. Here’s an example of grepping the syslog on an Ubuntu system (on RHEL-based systems, the log is /var/log/messages):

grep NOTIFICATION /var/log/syslog
May 28 18:20:57 ip-10-11-139-249 nagios3: SERVICE NOTIFICATION: pagerduty;localhost;Current Users;CRITICAL;notify-service-by-pagerduty;USERS CRITICAL - 3 users currently logged in

As you can see, the pagerduty contact was notified for this SERVICE NOTIFICATION.  If the pagerduty contact never shows up, that means that the pagerduty contact is not associated with notifications for the host/service in question.  If you’re using the default configuration, make sure that the pagerduty contact is a member of the admins contact group.
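
On a busy server the plain grep above returns notifications for every contact. A narrower filter can be sketched as follows; pagerduty_notifications is a hypothetical helper name, and the pattern assumes the log line layout shown in the example above (notification type, a colon, then the contact name followed by a semicolon).

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Nagios or PagerDuty).
# Print HOST/SERVICE NOTIFICATION log lines addressed to one contact.
# The contact defaults to "pagerduty" when no second argument is given.
# Usage: pagerduty_notifications <logfile> [contact]
pagerduty_notifications() {
    grep -E "(HOST|SERVICE) NOTIFICATION: ${2:-pagerduty};" "$1"
}
```

Running `pagerduty_notifications /var/log/syslog` prints only the lines for the pagerduty contact. No output (and a non-zero exit status) means the contact was never notified.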

If the pagerduty contact is getting notified, check the agent logs at /var/log/pdagent/pdagentd.log.

Please contact us if you’re unable to sort out the difficulty.

What sort of Nagios messages does PagerDuty understand?

PagerDuty can process PROBLEM, ACKNOWLEDGEMENT, and RECOVERY messages. All other messages, including FLAPPINGSTART and FLAPPINGSTOP, are ignored. If you’d like PagerDuty to process additional Nagios messages, please let us know!
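
As a quick illustration of the rule above, the following sketch classifies a Nagios notification type the same way. It is only a restatement of this FAQ answer in shell, not PagerDuty’s actual implementation, and notification_disposition is a hypothetical function name.

```shell
#!/bin/sh
# Hypothetical helper illustrating the FAQ answer; not PagerDuty's own code.
# Print "processed" for the notification types PagerDuty handles, else "ignored".
# Usage: notification_disposition <notification-type>
notification_disposition() {
    case "$1" in
        PROBLEM|ACKNOWLEDGEMENT|RECOVERY) echo processed ;;
        *) echo ignored ;;
    esac
}
```

For instance, `notification_disposition FLAPPINGSTART` prints `ignored`.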

How can I customize my Nagios alerts?

If you would like to customize your Nagios alerts, follow our guide here.
