When managing your ITOps team, it’s important to establish Key Performance Indicators (KPIs) based on real and actionable data. As the ITOps landscape evolves, your team’s responsibilities and potential size will grow with it. That means more resources and users to manage, and greater variability across compute environments, configurations, and security. Now more than ever, you need a platform that provides a clear picture of your team’s performance and overall effectiveness.
Organizations adopt an incident management platform to move incident response from a reactive process to a proactive one. The solution can tell you what breaks and deliver the data to support a fast resolution. This value is obvious. But when I started working with PagerDuty, I found that there was a hidden gem that took the platform beyond just incident management. I was able to leverage the built-in analytics to measure my team’s performance and effectiveness with a new level of transparency.
source: PagerDuty Analytics Dashboard
With the power of PagerDuty data, we were able to set up a system to reward those who responded to incidents.
From time to time, an on-call engineer skips calls or repeatedly misses calls for high-urgency incidents. This not only brings down the team’s effectiveness, it also forces the accountable members of the team to shoulder more of the burden. By analyzing user-centric incident management analytics, we were able to quickly discover not only which team members acknowledged and responded to incidents, but also the percentage of team members who participated and carried out their duties during a particular time period. The same data shows who is not responding, of course, but we choose to lead by positive example.
If you open up the data to your team, it can be used for self-policing as well. For example, if a user shows a high percentage of escalations caused by no activity (“timeout escalations”), this visibility can help the team proactively take the right measures to tighten effectiveness before it causes an incident response problem that could affect your SLA.
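As a rough illustration of the timeout-escalation check described above, here is a minimal sketch. It assumes you have exported assignment records into plain dictionaries; the field names (`user`, `outcome`), the `timeout_escalated` label, and the 25% threshold are all hypothetical and are not taken from PagerDuty's own export format.

```python
from collections import Counter

# Hypothetical assignment records. The field names and outcome labels are
# illustrative assumptions, not PagerDuty's actual export schema.
assignments = [
    {"user": "alice", "outcome": "acknowledged"},
    {"user": "alice", "outcome": "acknowledged"},
    {"user": "bob", "outcome": "timeout_escalated"},
    {"user": "bob", "outcome": "acknowledged"},
    {"user": "bob", "outcome": "timeout_escalated"},
    {"user": "carol", "outcome": "acknowledged"},
]


def timeout_escalation_rates(records):
    """Return {user: fraction of their assignments that escalated on timeout}."""
    total = Counter(r["user"] for r in records)
    timed_out = Counter(
        r["user"] for r in records if r["outcome"] == "timeout_escalated"
    )
    return {user: timed_out[user] / count for user, count in total.items()}


def flag_users(rates, threshold=0.25):
    """List users whose timeout-escalation rate exceeds the (assumed) threshold."""
    return sorted(user for user, rate in rates.items() if rate > threshold)


rates = timeout_escalation_rates(assignments)
flagged = flag_users(rates)
print(rates)
print("Needs a follow-up conversation:", flagged)
```

The threshold is a judgment call; a team would pick a value that reflects its own escalation policy and rotation size.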
Another problem we had was that incidents were acknowledged and resolved in a vacuum. Without analytics and reporting, engineers could handle incidents without the rest of the team knowing they had done so, or having any idea of what had happened. This creates a vicious cycle for ITOps teams: top performers become beleaguered, with no incentive to continue their excellent work, and in some cases this leads to engineer turnover. It also means losing critical opportunities to learn from historical issues.
Based on the analytics, we built an incentive program around who acknowledged and resolved the most incidents each month. This helped stimulate competition for engineers to be more productive.
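A monthly tally like the one behind this incentive program could be sketched as follows. This is only an illustration: the record layout (`month`, `acknowledged_by`, `resolved_by`) and the scoring rule (acknowledgements plus resolutions, ties broken by name) are assumptions, not a description of how PagerDuty reports the data or how we scored our own program.

```python
from collections import defaultdict

# Hypothetical incident records; field names are illustrative assumptions.
incident_log = [
    {"month": "2018-05", "acknowledged_by": "alice", "resolved_by": "alice"},
    {"month": "2018-05", "acknowledged_by": "bob", "resolved_by": "alice"},
    {"month": "2018-05", "acknowledged_by": "alice", "resolved_by": "bob"},
    {"month": "2018-06", "acknowledged_by": "carol", "resolved_by": "carol"},
]


def monthly_leaderboard(records, month):
    """Rank users by acknowledged + resolved incidents for one month."""
    scores = defaultdict(lambda: {"acknowledged": 0, "resolved": 0})
    for record in records:
        if record["month"] != month:
            continue
        scores[record["acknowledged_by"]]["acknowledged"] += 1
        scores[record["resolved_by"]]["resolved"] += 1
    # Highest combined count first; ties fall back to alphabetical order.
    return sorted(
        scores.items(),
        key=lambda item: (-(item[1]["acknowledged"] + item[1]["resolved"]), item[0]),
    )


board = monthly_leaderboard(incident_log, "2018-05")
for name, counts in board:
    print(name, counts)
```

Raw counts favor whoever happens to be on call during a noisy week, so a team might prefer to normalize by on-call hours before handing out rewards.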
Another example could be to reward your ITOps escalation team if they keep mean time to acknowledge (MTTA) under one minute and mean time to resolve (MTTR) under one hour (or whatever metrics make sense for your team). Not only do these incentive programs stimulate your engineers and entire escalation team, they also contribute to your effectiveness in maintaining your SLAs.
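To make the MTTA and MTTR targets concrete, here is a minimal sketch of the arithmetic. It assumes each incident carries three timestamps with the hypothetical field names `triggered`, `acknowledged`, and `resolved`, and it measures both metrics from the trigger time. Definitions vary between teams and tools, so treat this as one reasonable interpretation rather than the way PagerDuty computes them.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical incident timestamps; field names are illustrative assumptions.
timed_incidents = [
    {
        "triggered": datetime(2018, 5, 1, 9, 0, 0),
        "acknowledged": datetime(2018, 5, 1, 9, 0, 30),
        "resolved": datetime(2018, 5, 1, 9, 40, 0),
    },
    {
        "triggered": datetime(2018, 5, 2, 14, 0, 0),
        "acknowledged": datetime(2018, 5, 2, 14, 0, 50),
        "resolved": datetime(2018, 5, 2, 14, 50, 0),
    },
]


def mean_delta(records, start_key, end_key):
    """Average elapsed time between two timestamps across all records."""
    seconds = [(r[end_key] - r[start_key]).total_seconds() for r in records]
    return timedelta(seconds=mean(seconds))


mtta = mean_delta(timed_incidents, "triggered", "acknowledged")
mttr = mean_delta(timed_incidents, "triggered", "resolved")

# Targets taken from the example in the text: one minute and one hour.
meets_target = mtta < timedelta(minutes=1) and mttr < timedelta(hours=1)
print(f"MTTA={mtta}, MTTR={mttr}, reward earned: {meets_target}")
```

A mean is sensitive to a single long-running incident, so some teams may prefer to track a median or a percentile alongside it.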
Below are some ideas on how to start incentivizing your incident response team:
As the service level demands on ITOps become more and more stringent, management challenges grow along with operational ones. ITOps teams that use their existing tools to proactively learn from, measure, and motivate their people benefit in both operational efficiency and team productivity. Incident management analytics in platforms like PagerDuty have become an invaluable resource to us, not only in confronting these growing demands on IT, but also in streamlining the team’s effectiveness and increasing the satisfaction of its members. They have given us more transparency, better learning, and a great way to measure and motivate every member of our team.