PagerDuty’s July Hack Day produced another batch of amazing projects from our staff. One project in particular has a lot of potential to give our customers helpful insights into their own response times and those of others.
Our data guru Kyle Napierkowski analyzed the longest and shortest mean time to response (MTTR) and median time to response across our customer base, and visualized the results.
As PagerDuty is used by thousands of customers around the world, we’re in a pretty cool position to provide insights to our customers about trends in incident response times. This preliminary data is a starting point.
The graph below shows the median time to response, measured from the moment PagerDuty sends an alert to the moment the incident is resolved. As you can see, the majority of median times across our customer base are 20 minutes or less, with a fairly quick drop-off.
As a comparison, the graph below shows the distribution of mean time to response. It has a slower drop-off, with more customers in the tail, indicating that customers tend to have many short-resolution incidents (0-10 minutes) but also a handful of incidents with very long resolution times that skew the mean.
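To make the skew concrete, here is a minimal Python sketch. The resolution times are made up for illustration and are not PagerDuty customer data; the variable names are ours. It shows how a single very long incident pulls the mean far above the median.

```python
from statistics import mean, median

# Hypothetical resolution times in minutes for one account.
# Nine short incidents and one very long one (illustrative values only).
resolution_minutes = [2, 3, 4, 5, 5, 6, 7, 8, 9, 600]

mean_minutes = mean(resolution_minutes)      # pulled up by the 600-minute incident
median_minutes = median(resolution_minutes)  # middle of the sorted values

print(f"mean:   {mean_minutes} minutes")
print(f"median: {median_minutes} minutes")
```

With these sample values the median stays within the 0-10 minute range where most incidents fall, while the mean lands above an hour, which is the same pattern the two distributions show.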
Kyle also looked at the customers with the highest MTTR. The graphs below show the response time distribution for the accounts that had the highest median and mean times to resolve an incident (customer names have been removed). For each account, the median or mean value is flagged, and a heatmap bar shows the response time for individual incidents. The brighter the green, the more incidents took that amount of time to resolve.
The mean times to resolution are much higher, again as a result of a handful of incidents with exceptionally long resolution times.
Kyle’s project provides some interesting initial insights. It’s just a preliminary exploration of customer-wide response metrics, but it lays the foundation for some exciting future ideas. For example, one option could be providing mean response time segmented by industry, so you can better benchmark yourself against your peers.
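As a rough sketch of what the industry segmentation idea might look like, the Python below groups response times by industry and takes the mean of each group. The function name, the industry labels, and the sample records are all hypothetical; this is not how PagerDuty computes or plans to compute these numbers.

```python
from collections import defaultdict
from statistics import mean


def mean_response_by_industry(records):
    """Return the mean response time per industry.

    `records` is an iterable of (industry, minutes) pairs; both the shape
    and the labels are assumptions made for this example.
    """
    by_industry = defaultdict(list)
    for industry, minutes in records:
        by_industry[industry].append(minutes)
    return {industry: mean(values) for industry, values in by_industry.items()}


# Made-up sample data, one tuple per resolved incident.
sample = [
    ("retail", 10),
    ("retail", 20),
    ("finance", 5),
    ("finance", 7),
    ("finance", 9),
]

benchmarks = mean_response_by_industry(sample)
print(benchmarks)
```

Because a few very long incidents can skew the mean, a benchmark like this would probably be more useful with the median reported alongside it.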
What metrics or analysis do you use to evaluate your response times, and what are you curious about that you’ve never been able to dig into? Let us know. Perhaps at the next Hack Day we can use your input to dig a little deeper.