Chatting with PagerDuty's API
Guest blog post from Simon Westlake, the Chief Technical Officer for Powercode, a complete CRM, OSS and billing system designed for ISPs. Powercode is in use by over two hundred ISPs around the globe.
Since Powercode is used worldwide by a variety of different types of ISPs, we have integrations with all kinds of different third party services. These integrations encompass email account management, invoice printing, credit card processing, vehicle tracking, equipment provisioning and much more. However, most of these integrations only end up benefiting a handful of our customers, which is why it was a very pleasant surprise to see the heavy utilization of PagerDuty amongst our customers when we integrated it.
Considerations when integrating with 3rd party apps
I first heard about PagerDuty when we held a Powercode user meetup event in our hometown of Random Lake, Wisconsin back in 2012. One of our very passionate users, Steve, was talking to me about how he used Powercode for his ISP. I mentioned that one of the things we find very challenging is dealing with after-hours emergencies. Many of the smaller ISPs using Powercode don’t have the revenue to justify running a 24×7 network operations center (NOC), and it’s a real pain hoping your buzzing phone wakes you up when your network monitoring system emails you to say that half the network is down. He quickly pulled out his laptop and logged into his PagerDuty account to show me what we were missing.
After he walked me through the interface and the feature set, I decided on the spot that it was critical for us to integrate PagerDuty into Powercode. However, we have three hard requirements that we always adhere to when we integrate third party services into Powercode:
- Does the service have an API?
- Is the API well written and documented?
- Does the company provide a testing/integration environment for developers?
We’ve been down dark roads before where we’ve decided to skip one of these requirements, and it always turns into a long-term nightmare. Poorly written APIs (or no API at all) and little support from the third party mean we end up having to patch together a tenuous integration and, if customers come to really rely on the integration, it’s an ongoing headache trying to keep it running. Thankfully, PagerDuty delivered on all three counts. The API was solid and well documented, and they readily provided a testing environment for us to integrate the service into Powercode. When we look at new providers, I always cross my fingers for a consistent, RESTful, JSON-based API and thankfully, that’s what we got!
One of the things I really like when building an integration with a third party system is finding that the API exposed to us is the same API the original developers used to build the core system, and that certainly appeared to be the case with the PagerDuty API. We were able to tie in everything we needed easily, and the integration was smooth and painless.
Analyzing PagerDuty’s Integration API
There were a couple of decisions we had to make when working with the API. Prior to integrating PagerDuty, the only option for alerting in Powercode was to trigger an email. The triggering mechanism had a variety of configuration options such as:
- How long should this device be in an alert state prior to an alert being generated?
- How many times should Powercode repeat the notification?
- What are the frequency and number of repetitions?
We quickly realized that maintaining this configuration didn’t make sense given the ability to set up your alerting parameters in PagerDuty. We also wanted a way to maintain a history of alerts for devices in PagerDuty. Finally, we had to make some decisions about whether or not to set up two-way integration with PagerDuty: if an alert is opened or modified in PagerDuty, should it manipulate anything in Powercode?
After much deliberation, we decided not to integrate two-way communication. We wanted to keep Powercode as the ‘master’ as far as the status of incidents goes, and to encourage people to use the Powercode interface to manage their equipment. This left us with a problem to solve: what happens if someone resolves an incident inside PagerDuty while it is still alerting in Powercode?
To deal with this, we decided to trigger an incident creation or update in PagerDuty on every cycle of our notification engine, which runs once every minute. PagerDuty logs updates to an open incident without triggering another incident, and it automatically bundles open incidents that occur around the same time into one alert to reduce alerting noise. While this can create a long list of incident updates in PagerDuty, it gave us some benefits:
- If the status of a device changes, that change is reflected in the incident description in PagerDuty. For example, if a router is alerting because CPU usage is too high and it then begins alerting because the temperature is too high, re-triggering the incident allows us to populate this information into the description.
- If a user resolves an incident in PagerDuty that is not really resolved (the device is still in an alerting state), it will be re-opened automatically.
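To make the per-cycle re-trigger concrete, here is a minimal Python sketch. It is not the Powercode code (which is PHP); the `Device` class and the function names are hypothetical, and the event fields (`service_key`, `event_type`, `description`, `incident_key`) assume the shape of the older generic Events API that used an incident key, so check the current PagerDuty documentation before relying on them. The sketch only builds the payloads and leaves the HTTP call out.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical stand-in for a monitored piece of equipment."""
    device_id: int                      # unique ID from the local database
    name: str
    active_alerts: list = field(default_factory=list)


def build_trigger_event(service_key: str, device: Device) -> dict:
    """Build a trigger event whose description reflects the device's current state."""
    return {
        "service_key": service_key,                # assumed field name
        "event_type": "trigger",
        # Reusing the device ID means repeat triggers update one incident
        # instead of opening a new one each minute.
        "incident_key": str(device.device_id),
        "description": f"{device.name}: " + "; ".join(device.active_alerts),
    }


def events_for_cycle(service_key: str, devices: list) -> list:
    """One notification cycle: emit a trigger for every device still alerting."""
    return [build_trigger_event(service_key, d) for d in devices if d.active_alerts]


if __name__ == "__main__":
    router = Device(42, "core-router", ["CPU usage too high"])
    print(events_for_cycle("EXAMPLE_KEY", [router]))
    # A minute later the router has a second problem; the key stays the same.
    router.active_alerts.append("temperature too high")
    print(events_for_cycle("EXAMPLE_KEY", [router]))
```

Because the key never changes for a given device, each cycle's event carries the latest description, and a device that is still alerting keeps generating triggers even if someone resolved the incident by hand.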
One of the nice things about the PagerDuty API is that it allows you to submit an ‘incident key’ in order to track the incident in question. We decided to use the unique ID in our database associated with the piece of equipment that is alerting as the incident key – this simplified the deduplication process and allowed us to maintain a history within PagerDuty of incidents that had occurred with that piece of equipment. It also made it easy to resolve or acknowledge incidents due to changes within Powercode – we always knew how to reference the incident in question without having to store another identifier. This seemingly small feature in the PagerDuty API really expedited our ability to get it integrated quickly. See an example below for how simple this is for us to do in PHP:
This gives us a nice list of descriptive incidents in PagerDuty:
Keep everyone in the know with PagerDuty
Our initial integration only used the ‘Integration API’ of PagerDuty. We reasoned that most of the other functionality would be controlled and accessed by users directly through the PagerDuty application, and it didn’t serve much purpose to recreate it all within Powercode. However, over time, we slowly found uses for the data in other sections. For example, we deployed a system within our NOC that uses the schedules and escalation policies section of the API to display to our local technicians who the current on-call person is and who to call in the event of an escalation. Our next plan is to implement the webhooks section of the API to be able to store logs inside Powercode that show who is currently working an incident – this allows us to give our customers the ability to get better real time data without needing to create accounts in PagerDuty for every member of their organization.
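As a rough illustration of the on-call display, the Python sketch below turns a list of on-call entries into an ordered call list. It is not our NOC code: the function name is hypothetical, and the entry shape (an `escalation_level` number plus a `user` object with a `summary` name) is an assumption modeled on what the REST API's on-call listing returns, so verify it against the current API reference. Fetching the data is left out.

```python
def escalation_order(oncalls: list) -> list:
    """Return 'level N: name' lines, lowest escalation level first."""
    ordered = sorted(oncalls, key=lambda entry: entry["escalation_level"])
    return [
        f"level {entry['escalation_level']}: {entry['user']['summary']}"
        for entry in ordered
    ]


if __name__ == "__main__":
    # Hypothetical sample data in the assumed response shape.
    sample = [
        {"escalation_level": 2, "user": {"summary": "Bob"}},
        {"escalation_level": 1, "user": {"summary": "Alice"}},
    ]
    for line in escalation_order(sample):
        print(line)
```

The first line is the person currently on call, and the lines after it show who to call if the incident escalates.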
One thing I really like is finding that our customers respond positively and use the service. I believe that the PagerDuty integration in Powercode is the most heavily used of all the third party services we have integrated. Once we started to show people how it worked, the response was universally positive. We even began using PagerDuty for our after-hours support for Powercode itself: if you call into our emergency support line after hours and leave a voicemail, it opens an incident in PagerDuty to run through the escalation process!
We continue to recommend PagerDuty to our customers and I’m confident that their solid API and excellent support will mean ongoing integration in Powercode is a no-brainer. Check out this integration guide to see how easy it is to integrate Powercode with PagerDuty.