This is a guest blog post about setting up IT operations software for startups written by Anthony Gibbons, the Operations Manager at Airhead Education. Airhead Education is a UK-based company that helps schools harness the power of cloud-based learning.
I joined a small but ambitious startup called Airhead Education in February 2014 as their Operations Manager. Airhead provide an affordable, cloud-based learning environment that ‘plays well with others’, which is to say that we integrate with whichever technologies our customers wish to use.
I had spent the previous two years working as an Application Support Specialist for one of the largest firms in the financial sector. Whilst the work had been enjoyable, and the people brilliant, I had a craving to get back into an operations role, which I felt was where my true strengths lay.
At the start of 2014, Airhead were at a point where they needed to get serious about monitoring and supporting their growing infrastructure in Microsoft Azure. I was still in touch with a former colleague who was a founding employee of the company. An initial conversation over a few beers eventually progressed to a job offer that I gladly accepted.
What next? I must confess, I found the prospect of setting up our infrastructure monitoring and notification system from scratch a little daunting. Due to the company’s position as a startup, I also had a relatively small budget with which to do it. In the past, I had mainly tuned and tweaked existing infrastructure monitoring tools. My initial instinct was not to waste time reinventing the wheel.
At Airhead, we have a ‘cloud first’ attitude, always seeking to integrate with best-of-breed, cutting-edge technologies for our customers. I decided to carry this philosophy through to backend operations and support. I had thought that budgetary constraints might limit the quality of the tools and services I would be able to use. I was completely wrong! With the advent of cloud services and companies willing to integrate with each other, it is now entirely possible for a small startup to use the same monitoring tools as industry stars such as Airbnb, Pinterest and Path.
Within a week or so, I was up and running with Microsoft SCOM, Site24x7 for external monitoring and New Relic for application monitoring. We also set up a status page on StatusPage.io. Initially, alerts were generated and sent to our email addresses, and status updates were set manually on our status page if something went wrong. This was OK for a while, but eventually emails got missed, our status page wasn’t always updated quickly enough, and so on. We had monitoring down pretty well, but our notification solution fell well short. I wasn’t too keen on lugging a pager about again and I was even less keen on the associated costs. Then I found PagerDuty via a New Relic partner promotion. I signed up for a trial and all of my prayers were answered! PagerDuty would integrate with all of my monitoring solutions and alert the right people when things went wrong.
It probably took me an hour to integrate all of my services with PagerDuty. Very quickly, I was able to generate meaningful alerts to the iOS app that my colleague and I had installed on our existing phones. Escalation policies were flexible and easy to visualise. We went for something quite simple and effective: general alerts would go to the DevOps guys, whilst a full outage would escalate to all staff. On-call rotas were easy to configure, so we could share the pain of late-night wake-up calls. Speaking of wake-up calls, what better way to be alerted than with a sad trombone or a barbershop quartet style rendition of ‘The server’s on fire’? The push sounds for the iOS app keep getting better and better!
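The routing described above can be sketched in a few lines of Python. This is only an illustrative model of the idea (general alerts to DevOps, full outages to everyone, a shared rota), not PagerDuty's API or Airhead's real configuration; the names `Alert`, `recipients`, `on_call` and all of the staff identifiers are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """A hypothetical alert; `full_outage` marks a platform-wide failure."""
    summary: str
    full_outage: bool = False


# Hypothetical staff groups, invented purely for illustration.
DEVOPS = ["devops-a", "devops-b"]
ALL_STAFF = DEVOPS + ["support-a", "director-a"]


def recipients(alert: Alert) -> list[str]:
    """General alerts go to DevOps; a full outage escalates to all staff."""
    return list(ALL_STAFF) if alert.full_outage else list(DEVOPS)


def on_call(rota: list[str], week_index: int) -> str:
    """Rotate through the rota one person per week, wrapping around."""
    return rota[week_index % len(rota)]
```

For example, `recipients(Alert("Disk nearly full"))` yields only the DevOps group, while `recipients(Alert("Platform down", full_outage=True))` yields everyone, and `on_call(DEVOPS, 2)` wraps back round to the first engineer.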
After a couple of weeks of use, it was time to investigate some of the more advanced features. If an incident or outage occurred within our app, I was now confident that the right people would be notified. But what about our customers? As I mentioned previously, we use StatusPage.io for our custom status page. By integrating StatusPage.io with the PagerDuty API, we have been able to create rules that will change the public status of our service if certain events are triggered from PagerDuty. This lets our customers know as soon as we do if there is a major issue affecting our platform. In addition to this, we have integrated PagerDuty with HipChat so we can quickly and easily view a summary of all alerts. This can be extremely useful when trying to understand an incident timeline.
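Rules of this kind boil down to a lookup from an incident event to a public status. The sketch below assumes a simple rule table; the event names, service names and status strings are placeholders chosen for illustration, and neither the PagerDuty nor the StatusPage.io API is actually called.

```python
from __future__ import annotations

from typing import Optional

# Hypothetical rule table: (incident event, service) -> public status.
# The keys and values are illustrative assumptions, not real API values.
RULES: dict[tuple[str, str], str] = {
    ("triggered", "web-app"): "major_outage",
    ("resolved", "web-app"): "operational",
    ("triggered", "reporting"): "degraded_performance",
    ("resolved", "reporting"): "operational",
}


def public_status(event: str, service: str) -> Optional[str]:
    """Return the status to publish, or None if no rule matches.

    Returning None means "leave the status page unchanged", so minor
    or internal-only incidents never reach customers.
    """
    return RULES.get((event, service))
```

Keeping the rules in a plain table makes it easy to review which incidents are customer-facing: anything without an entry stays internal.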
One of the best things about PagerDuty is the rate at which the service continues to improve and evolve. Just one of the new things I will look at this month is ‘Rich Incidents’, which gives me more context by embedding links and images in alerts. Oh, and hopefully we will get even more push alert sounds for the app. Keep them coming!
The best thing about PagerDuty is that it, like Airhead, ‘plays well with others’. It occupies an important role in operations and is happy to integrate with other fantastic cloud services. With affordable, flexible and continuously improving services such as these, it is a great time to be involved in IT operations. What was I worried about?