Go Beyond MTTA and MTTR With PagerDuty Analytics
Do any of these sound familiar? One of your best engineers just gave notice that they are taking a job elsewhere because the on-call load is destroying their personal life, but you honestly thought things were fine. Or you recently walked into your boss’s staff meeting, where they called for increased investment in new product development, but you’re sweating because you can’t get the resources you need to keep your existing services running smoothly. Or maybe a Sales VP reached out to say that customers are asking for refunds because your product has been down more than your SLA allows, but you thought the incident count had decreased enough for it not to be an issue.
If any of these scenarios sound familiar, your organization might lack alignment and visibility into the metrics and measures that define your digital operations. Sure, you pay attention to incident counts, mean time to acknowledge (MTTA), and mean time to resolve (MTTR), but they don’t tell the full story of how your people, process, and technology resources impact the business outcomes related to your teams, company, and customers.
Simply put, you don’t know the full impact of digital operations on your business. And when you lack this information, things start going in different directions: people leave, additional investments in technology don’t improve customer experiences, and costs of operations rise.
But this doesn’t have to be the case.
Gain Insight into Your Real-Time Data With PagerDuty Analytics
A platform for action like PagerDuty can offer actionable insights about real-time work that go far beyond MTTA and MTTR. PagerDuty stewards a unique combination of machine and human data for our customers: the signals from monitoring tools, the people-team-service relationships, and the actions people take in real time when issues arise. Together, these open up a new world for digital operations analytics. That data offers insight across the spectrum, from major incident response effectiveness, team health, and service reliability to the ROI of your DevOps tools, on-call pain, and, ultimately, business outcomes from operations.
At PagerDuty Summit 2018, we unveiled a new product to meet these needs: PagerDuty Analytics. Built on top of the unique and powerful data on the PagerDuty platform, PagerDuty Analytics surfaces modern digital metrics that go far beyond MTTA and MTTR, in both prescriptive and exploratory user experiences, so our customers can be aligned on the impact of operations on business outcomes.
Operational Review Metrics
Our first new offering as part of PagerDuty Analytics is a feature we call Operational Review Analytics. Mature digital organizations run regular operational reviews, and less mature organizations strive to. This new feature is a direct response to a pain we heard over and over again from our customers: running operational reviews with metrics is a difficult task that requires manual data extraction, summarization, and presentation—no matter how big or small the organization.
“The Operational Review feature of PagerDuty Analytics provides me with insight into my on-call process from multiple viewpoints, from the health of my team members to the health of my systems. PagerDuty Analytics also provides clarity into what’s driving our on-call pain by connecting those pain points with how the personal lives of our engineers are affected by unplanned interruptions.”
– Jeff Smith, Director of Production Operations, Centro
With Operational Review Analytics, we are making it dead simple to run operational reviews with data by offering prescriptive sets of metrics that align with the most common operational reviews: weekly on-call, monthly service, and quarterly business reviews. The prescriptive sets of metrics go beyond MTTA and MTTR, surfacing modern digital metrics and measures like:
- After Hours Interruptions
- Work Interruptions
- Time Without Major Incident
- Total Cost of Response
- Total Impact on Business Services
- Hours Spent in Response
- % of Total Time in Response
The metrics we surface in Operational Review Analytics are designed to offer insight into the impact of operations on your teams, your company, and your customers.
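To make a few of these measures concrete, here is a minimal Python sketch of how metrics like After Hours Interruptions, Hours Spent in Response, and % of Total Time in Response could be derived from a list of incidents. Everything in it is an assumption for illustration: the `Incident` record, the `review_metrics` helper, the 09:00 to 18:00 weekday definition of business hours, and the reading of Work Interruptions as interruptions during those hours are all hypothetical, and none of it reflects how PagerDuty Analytics actually defines or computes its metrics.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    # Hypothetical record shape; not PagerDuty's actual data model.
    triggered_at: datetime
    resolved_at: datetime


def is_after_hours(ts: datetime, start_hour: int = 9, end_hour: int = 18) -> bool:
    """Assumed definition: any weekend time, or outside 09:00-18:00 on weekdays."""
    return ts.weekday() >= 5 or not (start_hour <= ts.hour < end_hour)


def review_metrics(incidents: list[Incident], period: timedelta) -> dict[str, float]:
    """Summarize one review period using the assumed definitions above."""
    after_hours = sum(1 for i in incidents if is_after_hours(i.triggered_at))
    response_hours = sum(
        (i.resolved_at - i.triggered_at).total_seconds() for i in incidents
    ) / 3600
    period_hours = period.total_seconds() / 3600
    return {
        "after_hours_interruptions": after_hours,
        # Assumption: a "work interruption" is one triggered during business hours.
        "work_interruptions": len(incidents) - after_hours,
        "hours_spent_in_response": response_hours,
        "pct_total_time_in_response": 100 * response_hours / period_hours,
    }


# Made-up sample data for one weekly on-call review.
incidents = [
    Incident(datetime(2018, 9, 10, 2, 30), datetime(2018, 9, 10, 4, 0)),    # Monday, overnight
    Incident(datetime(2018, 9, 12, 14, 0), datetime(2018, 9, 12, 14, 30)),  # Wednesday afternoon
    Incident(datetime(2018, 9, 15, 11, 0), datetime(2018, 9, 15, 13, 0)),   # Saturday
]
metrics = review_metrics(incidents, period=timedelta(days=7))
print(metrics)
```

Even in this toy version, the count of incidents alone (three) hides the fact that two of them landed outside working hours, which is the kind of gap these metrics are meant to expose.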
Publishing an Open-Source Framework for Operational Reviews
We even went one step further: in the same spirit as our open-source documentation on how to run a major incident response, we are also open-sourcing documentation on how to run operational reviews. We took what we learned from interviewing dozens of operationally mature digital businesses, sprinkled in some of our own processes at PagerDuty, and published a framework for running operational reviews, including who to involve, what metrics you’ll need, and how to make them actionable. That documentation can be found at reviews.pagerduty.com, where you’ll also find links to the GitHub repo, which you can fork and customize for your own use.
We aren’t stopping with the highly prescriptive, targeted use case around operational reviews. We know you want to explore the metrics, measures, and trends that define digital operations, on your terms—and our next milestone will be to deliver a true exploratory analytics dashboarding experience.
Our curated dashboards will start you from a place of insights, whether you are looking to better understand your major incident response effectiveness, the reliability of your service operations, or the impact on business outcomes from adopting a new feature or process in PagerDuty. From there, you’ll be able to customize your view by filtering and grouping on any number of digital operations attributes, selecting the visualizations that best tell the story you want to tell, and building custom dashboards that act as a collection of the metrics you care about the most.
Additionally, we will draw further on our rich and unique data set from 10,000+ customers to provide advanced analytics and peer and industry benchmarking, so you can understand how your organization compares to others in your industry. These metrics help challenge your teams to improve efficiency and focus on the areas that need improvement, all while leveraging PagerDuty machine learning capabilities on top of real-time data to improve operational maturity over time.
Putting It All Together
Our early-access Operational Review Analytics feature and upcoming new features (such as dashboarding, advanced analytics, and industry benchmarking capabilities) are all focused on surfacing new metrics that define digital operations, connecting actions to outcomes, and offering insight into the impact on your teams, company, and customers.
One way PagerDuty Analytics focuses on impact is by integrating with another new offering announced at PagerDuty Summit 2018, PagerDuty Visibility. PagerDuty Visibility introduced two new concepts that we’ve pulled into the Analytics product: Business Services and Impact Metrics. With Business Services, PagerDuty Analytics closes the gap between technical service operations and customer experiences, while Impact Metrics bring external business metrics into the mix to help you understand the true impact of operations on your digital business. To learn more about these capabilities, see our blog post about how you can connect insights to real-time action with PagerDuty Visibility.
We’ve been lucky to have the guidance and input of a few customers serving as design partners for PagerDuty Analytics, and they see the value.
“PagerDuty Analytics will help us drive standard metrics across team and better understand incident priority based on customer impact”
– Andrew Hatch, Platform Engineering Manager, SEEK
Don’t just take our word for it, or our industry-leading customers’ words for it! If you’re ready to elevate your use of metrics for managing digital operations, head on over to the product page for PagerDuty Analytics to learn more. If you’re ready to start getting more insights from your PagerDuty data today, reach out to your PagerDuty Account Executive to request entry into the PagerDuty Analytics early-access program.