Nagios Two-Way Integration Guide

Nagios is one of the leading providers of open source, enterprise-grade IT infrastructure monitoring tools. Used by hundreds of thousands of users worldwide, Nagios allows its users to monitor their entire IT infrastructure, spot problems before they occur, detect security breaches and plan/budget for IT upgrades.

By integrating PagerDuty into your existing Nagios monitoring solution, you can have Nagios alerts go directly to the person on-call in your PagerDuty schedule. The benefit of the two-way integration is that users can acknowledge an incident in PagerDuty and the acknowledgment will be passed on to the relevant service or host in Nagios, so both systems reflect the most current status of an issue.

The guide below describes how to integrate your Nagios 2, 3 or 4 installation with PagerDuty using our easy-to-install agent. Note that you must be logged in as root to complete the installation. You might need to slightly alter these instructions depending on your exact Linux distribution, Nagios configuration and Nagios version. Please contact our support team if you have any trouble completing the integration.

Note: If you are running Nagios on CentOS 5, you will need to use the Perl-based integration for Nagios instead of following this guide.

In PagerDuty

  1. Create a new Service by going to the Configuration menu and selecting Services, then click Add New Service.

  2. Enter a Name for your new service, select Nagios from the Integration Type menu, choose an Escalation Policy to use when the service receives an alert from Nagios, and select the desired Notification Urgency for incidents (available only with Basic or higher plans). Click Add Service when you are finished.

    Add a new PagerDuty Nagios service

  3. Once the service is created, you’ll be taken to the service page. On this page, you’ll see the Integration Key, which will be needed when you configure your Nagios server to send events to PagerDuty.

    PagerDuty Nagios service key

On Your Nagios Server

This guide includes steps for Debian-based (e.g. Ubuntu) and RHEL-based (e.g. CentOS, Fedora) Linux distributions. You do not need to execute all commands in this guide, only the ones for your type of system. Note that all commands provided are intended to be run as the root user.

  1. Install the PagerDuty Agent. The agent receives events from Nagios and sends them to PagerDuty using a queue, provides logging that helps troubleshoot any problems, and automatically retries sending alerts in the event of a connection failure (e.g. if your Nagios server temporarily loses connectivity).

    Note: The Agent does not run on CentOS 5 or lower, as it requires a newer version of Python than the version included with CentOS 5. Please use the Perl-based integration for Nagios on older operating systems.

  2. Download pagerduty_nagios.cfg from GitHub:
    wget https://raw.githubusercontent.com/PagerDuty/pdagent-integrations/master/pagerduty_nagios.cfg
  3. Open pagerduty_nagios.cfg in a text editor.
  4. Enter the integration key corresponding to your Nagios service into the pager field. The key is a 32-character string that can be found on the service’s detail page (step 3 in the PagerDuty section above).
  5. Move the Nagios configuration file into place.

    For Debian-based systems this is usually /etc/nagios3/conf.d:

    mv pagerduty_nagios.cfg /etc/nagios3/conf.d

    For RHEL-based systems this is usually /etc/nagios:

    mv pagerduty_nagios.cfg /etc/nagios
  6. Skip this step if you are using a Debian-based distribution. If you are using a RHEL-based distribution, you will need to edit the Nagios config to load the PagerDuty config. To do this, open /etc/nagios/nagios.cfg and add this line to the file:
    cfg_file=/etc/nagios/pagerduty_nagios.cfg
  7. Add the contact “pagerduty” to your Nagios configuration’s main contact group. If you’re using the default configuration, open /etc/nagios3/conf.d/contacts_nagios2.cfg (on Debian-based systems) or /etc/nagios/objects/contacts.cfg (on RHEL-based systems) and look for the “admins” contact group. Then, simply add the “pagerduty” contact.
    define contactgroup{
         contactgroup_name admins 
         alias Nagios Administrators 
         members root,pagerduty ; Add pagerduty here
    }
  8. Skip this step if you are using a Debian-based distribution. If you are using a RHEL-based distribution, restart Nagios:
    service nagios restart
  9. Download pagerduty.cgi for the two-way integration:
    wget https://raw.githubusercontent.com/mdcollins05/pd-nag-connector/master/pagerduty.cgi
  10. Edit the pagerduty.cgi file so that the command_file variable points to your Nagios command file. The path can be found by running the command grep "^command_file" /etc/nagios3/nagios.cfg (on Debian-based systems) or grep "^command_file" /etc/nagios/nagios.cfg (on RHEL-based systems).

    If you don’t see any information returned, make sure the command_file variable is uncommented (doesn’t start with a “#”).

  11. Move pagerduty.cgi to the Nagios cgi-bin.

    For Debian-based systems this is usually /usr/lib/cgi-bin/nagios3/:

    mv pagerduty.cgi /usr/lib/cgi-bin/nagios3/

    For RHEL-based systems this is usually /usr/lib64/nagios/cgi/:

    mv pagerduty.cgi /usr/lib64/nagios/cgi/
  12. Make pagerduty.cgi executable.

    For Debian-based systems:

    chmod +x /usr/lib/cgi-bin/nagios3/pagerduty.cgi

    For RHEL-based systems:

    chmod +x /usr/lib64/nagios/cgi/pagerduty.cgi
  13. Install the required Perl libraries for the script to work.

    For Debian-based systems:

    apt-get install libwww-perl libjson-perl

    For RHEL-based systems:

    yum install perl-JSON perl-libwww-perl
  14. Skip this step if you are using a RHEL-based distribution. If you are using a Debian-based distribution, you will need to make sure Nagios has external commands enabled. In /etc/nagios3/nagios.cfg, check that the variable check_external_commands is set to 1 and that the variable command_check_interval is set to a reasonable value for your environment. The command_check_interval variable determines how often Nagios checks for external commands to run.
  15. Skip this step if you are using a RHEL-based distribution. If you are using a Debian-based distribution, you will need to make sure that your web server user (usually www-data) is able to write to the Nagios command file. The following commands enable this for the default command file location:
    /etc/init.d/nagios3 stop ## Note: This will stop your Nagios service!
    dpkg-statoverride --update --add nagios www-data 2710 /var/lib/nagios3/rw
    dpkg-statoverride --update --add nagios nagios 751 /var/lib/nagios3
    /etc/init.d/nagios3 start
  16. In PagerDuty, go to your Nagios service and click the Add Webhook button below the service settings and above the incidents list.

  17. Enter a Name for your webhook (e.g. Nagios) and the URL, then click Save.

    The URL will look similar to this: http://user:password@ip-or-domain/nagios3/cgi-bin/pagerduty.cgi

    Note: Unless you’ve disabled it, the Nagios web interface requires a username and password. We highly recommend creating a dedicated user, used only for the webhook, that is able to run Nagios commands.

    If you go to this URL in your browser you should see 400 Requests must be POSTs. If you do not see this, check your web server logs for details on what happened when you tried to call this URL.

  18. At this point, you should be all set. To test the integration, you need an issue in Nagios that generates an incident in PagerDuty. Acknowledging that incident in PagerDuty should add a comment in Nagios stating that the incident has been “Acknowledged by PagerDuty.”

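Before testing, it can help to sanity-check the values edited in steps 4, 10 and 14. The sketch below is one possible way to do that, not part of the official integration. The helper names is_valid_key and get_setting are hypothetical, the key is a placeholder, and the sample nagios.cfg contents (including the command file path) are made up for the demonstration. Point nagios_cfg at your real file to use it.

```shell
#!/bin/sh
# Sketch of a pre-flight check for the values edited in steps 4, 10 and 14.
# is_valid_key and get_setting are hypothetical helpers, not part of
# Nagios or the PagerDuty Agent.

# is_valid_key KEY: succeeds if KEY is exactly 32 characters long.
is_valid_key() {
    [ "${#1}" -eq 32 ]
}

# get_setting FILE NAME: prints the value of the first uncommented
# NAME=value line in FILE (prints nothing if it is missing or commented out).
get_setting() {
    sed -n "s/^$2[[:space:]]*=[[:space:]]*//p" "$1" | head -n 1
}

# Demo against a throwaway file with made-up sample values.
nagios_cfg=$(mktemp)
cat > "$nagios_cfg" <<'EOF'
#command_check_interval=-1
check_external_commands=1
command_file=/var/lib/nagios3/rw/nagios.cmd
EOF

command_file=$(get_setting "$nagios_cfg" command_file)
external=$(get_setting "$nagios_cfg" check_external_commands)

echo "command_file: ${command_file:-not set or commented out}"
if [ "$external" = "1" ]; then
    echo "external commands: enabled"
else
    echo "external commands: disabled (set check_external_commands=1)"
fi

# Placeholder key; substitute the Integration Key from your PagerDuty service.
if is_valid_key "0123456789abcdef0123456789abcdef"; then
    echo "integration key: length OK"
else
    echo "integration key: expected 32 characters"
fi

rm -f "$nagios_cfg"
```

A failed key check may indicate stray whitespace or a truncated paste. An empty command_file value means the line is missing or still commented out.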

FAQ

Can I have the two-way integration without having my Nagios install be accessible over the internet?

The official two-way integration requires that your Nagios server be accessible over the internet to receive webhook calls from PagerDuty. If you wish to restrict access to your Nagios web server to specific IPs, you can find PagerDuty’s webhook IPs in our knowledge base: What are PagerDuty’s IPs for whitelisting and firewall purposes?

If whitelisting won’t work for you, Zoosk has created and shared a polling script you may be able to use to periodically check our REST API for any acknowledgements on a Nagios incident and acknowledge the service or host in Nagios.

Note: PagerDuty does not support the Zoosk polling script. This is a third-party script that was not created by PagerDuty and has not been tested by our team, so we cannot guarantee its functionality or compatibility in your environment.

How do I configure Nagios to work with multiple PagerDuty services?

This is easy to do with the current integration, as a Nagios service in PagerDuty is directly mapped to a “contact” in Nagios. By default, this contact is named pagerduty and defined in the pagerduty_nagios.cfg file. To configure multiple services, duplicate the existing contact definition and rename it (e.g. pagerduty_database, pagerduty_network, etc.). Then copy and paste the corresponding Integration Key from PagerDuty into the pager field. Don’t forget to restart Nagios for the changes to take effect.
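The duplication can be done by hand in a text editor, or roughly sketched in shell as below. The sample contact block is a trimmed, assumed version of what pagerduty_nagios.cfg contains, so compare it with your own copy first. The helper clone_contact is hypothetical, and the key is a placeholder rather than a real Integration Key.

```shell
#!/bin/sh
# Sketch: derive a second PagerDuty contact from the existing one.
# The block below is an ASSUMED, trimmed contact definition; your
# pagerduty_nagios.cfg will contain more fields.
sample_cfg=$(mktemp)
cat > "$sample_cfg" <<'EOF'
define contact {
       contact_name    pagerduty
       alias           PagerDuty Pseudo-Contact
       pager           YOUR-SERVICE-KEY-HERE
}
EOF

# clone_contact FILE NEW_NAME NEW_KEY (hypothetical helper)
# Prints a copy of the contact block with contact_name and pager replaced.
# Assumes FILE holds a single contact block whose closing brace starts a line.
clone_contact() {
    sed -n '/^define contact/,/^}/p' "$1" |
        sed -e "s/^\([[:space:]]*contact_name[[:space:]]\{1,\}\)pagerduty[[:space:]]*$/\1$2/" \
            -e "s/^\([[:space:]]*pager[[:space:]]\{1,\}\).*$/\1$3/"
}

# Placeholder key; use the Integration Key of the second PagerDuty service.
new_block=$(clone_contact "$sample_cfg" pagerduty_database 0123456789abcdef0123456789abcdef)
printf '%s\n' "$new_block"

rm -f "$sample_cfg"
```

The printed block can then be appended to pagerduty_nagios.cfg, after which Nagios needs to be restarted.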

What if a Nagios event happens while my network is down?

If a PagerDuty server can’t be reached for any reason, events will be stored to an on-disk queue. The PagerDuty agent will attempt to re-send the events when connectivity is restored.

Since Nagios needs my external Internet connection to send failure reports to PagerDuty, how will I receive notification if our site loses external connectivity?

You should configure an external ping check service such as StatusCake or NodePing to monitor your site’s external connectivity. Of course, you can use PagerDuty to receive alerts from these services as well.

The integration doesn’t seem to be working. What’s going on?

First, make sure you’ve installed the PagerDuty Agent, and that there were no errors from your package manager when attempting to install it. Failed installs (e.g. due to an incompatible distribution, such as CentOS 5) are the most common reason the integration does not work.

Check that the pagerduty contact is getting the HOST or SERVICE NOTIFICATIONS in syslog. You can grep your syslog to see if the pagerduty contact is being notified. Here’s an example from an Ubuntu system (on RHEL-based systems, syslog is at /var/log/messages):

grep NOTIFICATION /var/log/syslog
May 28 18:20:57 ip-10-11-139-249 nagios3: SERVICE NOTIFICATION: pagerduty;localhost;Current Users;CRITICAL;notify-service-by-pagerduty;USERS CRITICAL - 3 users currently logged in

As you can see, the pagerduty contact was notified for this SERVICE NOTIFICATION. If the pagerduty contact never shows up, that means that the pagerduty contact is not associated with notifications for the host/service in question. If you’re using the default configuration, make sure that the pagerduty contact is a member of the admins contact group. If the pagerduty contact is getting notified, check the agent log at /var/log/pdagent/pdagentd.log.
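On a busy system the plain grep can return many lines for other contacts. One way to narrow it down is to count only the notifications addressed to a given contact, as in the sketch below. The helper count_notifications is hypothetical. The first sample log line is the example from this guide, and the second, for the root contact, is invented to show the filtering.

```shell
#!/bin/sh
# count_notifications LOGFILE [CONTACT]: hypothetical helper that counts
# HOST/SERVICE NOTIFICATION lines addressed to CONTACT (default: pagerduty).
count_notifications() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c "NOTIFICATION: ${2:-pagerduty};" "$1" || true
}

# Demo log; replace with /var/log/syslog or /var/log/messages on a real system.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
May 28 18:20:57 ip-10-11-139-249 nagios3: SERVICE NOTIFICATION: pagerduty;localhost;Current Users;CRITICAL;notify-service-by-pagerduty;USERS CRITICAL - 3 users currently logged in
May 28 18:20:57 ip-10-11-139-249 nagios3: SERVICE NOTIFICATION: root;localhost;Current Users;CRITICAL;notify-service-by-email;USERS CRITICAL - 3 users currently logged in
EOF

pd_count=$(count_notifications "$sample_log")
echo "pagerduty notifications: $pd_count"

rm -f "$sample_log"
```

A count of 0 points to the contact group configuration. A non-zero count means Nagios is notifying the contact, so the agent log is the next place to look.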

More troubleshooting tips can be found in our Nagios Troubleshooting Guide.

What sort of Nagios messages does PagerDuty understand?

PagerDuty can process PROBLEM, ACKNOWLEDGEMENT, and RECOVERY messages. All other messages, including FLAPPINGSTART and FLAPPINGSTOP, are ignored.

How can I customize my Nagios alerts?

We have a guide for Customizing Notifications Sent to PagerDuty from Nagios to help you get started.