PagerDuty Blog

Less Is More With Intelligent Response and Automation

For better or for worse, we have become a society obsessed with efficiency. There are signs of it in every corner of our lives, from digital banking and e-commerce apps to smart thermostats and doorbells. Yet although we use automation in almost every aspect of our personal lives, a joint study conducted by PagerDuty and Dimensional Research found that 90% of companies have little to no automation for issue resolution. That raises the question: Since we use automation daily in our personal lives to make simple things easier, why aren’t we using it at work as well?

Work Smarter, Not Harder

IT professionals are often dubbed the “technology gurus” within their own companies, so you’d be surprised at just how many of them resort to cumbersome, manual processes to get work done.

Many IT teams aren’t making the best use of their time or resources. They’re still thumbing through a handbook to determine how to respond to an issue, or manually combing through past incidents to see what might be related. Yet one of the easiest ways to improve team efficiency and protect revenue is intelligent response and automation, which includes automating repeatable procedures so that you can reallocate valuable resources elsewhere.

Our latest enhancements from the Spring 2020 product launch approach intelligent response and automation in three ways: (1) coordinating people and processes, (2) bringing response teams the right context and information when they need it, and (3) putting easy-to-use automation at their fingertips so they can quickly diagnose and fix issues. Let’s take a closer look at how it works.

Innovation Deep-Dive

Intelligent Triage

As part of PagerDuty Event Intelligence and now available on mobile, Intelligent Triage prevents teams from duplicating work. You can now use the Related Incidents feature to establish a single source of truth for incidents that impact multiple teams. Related Incidents extends Event Intelligence’s machine learning capabilities beyond noise reduction and gives responders real-time contextual insights across services.

By examining concurrent incidents on other services that might be related to the issue at hand, responders can gain better insight into the breadth and scope of impact, avoid redundant communications, and ensure teams don’t step on each other’s toes while trying to solve problems. Intelligent Triage uses machine learning to make it possible to:

  • See what’s happening right now across the business
  • Understand whether an issue is local or impacts others
  • Recruit the right teammates and work together to fix the problem
  • Improve MTTR
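
To make the idea of “concurrent incidents on other services” concrete, here is a minimal sketch in Python. It is not how Intelligent Triage works internally: the feature relies on machine learning, while this example swaps in a plain time-window heuristic. The Incident class, its field names, and the 15-minute window are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Incident:
    # Hypothetical record shape, not a PagerDuty API object.
    id: str
    service: str
    opened_at: datetime
    resolved_at: Optional[datetime] = None  # None means still open


def concurrent_on_other_services(
    current: Incident,
    incidents: List[Incident],
    window: timedelta = timedelta(minutes=15),  # assumed window size
) -> List[Incident]:
    """Return incidents on other services that were opened close in time
    to `current` and were not already resolved when `current` opened."""
    related = []
    for other in incidents:
        if other.service == current.service:
            continue  # same service: not a cross-service signal
        if abs(other.opened_at - current.opened_at) > window:
            continue  # opened too far apart to count as concurrent
        if other.resolved_at is not None and other.resolved_at <= current.opened_at:
            continue  # already resolved before the current incident began
        related.append(other)
    return related
```

A real system would weigh many more signals than timestamps, but even this simple filter shows why a shared view across services helps a responder judge whether an issue is local.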

Reopen Incident

Occasionally, responders accidentally close an incident during a response even though the incident wasn’t actually resolved. Or they resolve a major incident under the assumption that it has concluded, but soon afterward notice, via other channels like monitoring and customer support, ongoing symptoms stemming from the same cause.

It’s OK, mistakes happen! We’re all human. That’s why PagerDuty released a new feature (available for early access) that allows responders to reopen incidents without triggering new alerts. This simple but important action increases responder flexibility, reduces duplication of incidents, and makes it faster and easier to re-mobilize response teams if symptoms of a major incident reappear. You can also mirror, in PagerDuty, concurrent reopen actions taken in adjacent tools like ServiceNow.

Runbook Automation Integrations

Another way to get some quick wins is to automate the manual or partially automated procedures you have captured in operational “runbooks” and connect them to your automated and intelligent incident response processes. Our newly released integrations with Rundeck, Ayehu, and Pliant for runbook automation, in addition to our existing integration with Amazon EventBridge, allow you to automate manual IT response play workflows and improve communication and resolution speed within response teams.

Let’s look at an example of how incident response using runbook automation from the Rundeck integration could play out:

It’s peak operating hours for your e-commerce business and your server goes down, preventing customers from checking out and purchasing the items in their carts. PagerDuty has alerted the right people that there is an incident.

Now what? How do you enable your responders to take quick action to diagnose and resolve the incident? If they have to escalate to a colleague or another team, you are losing time. If they have to navigate a wiki or dig through manual runbooks, you are losing time.

Instead, give your responders safe, self-service access to the automated operations procedures they need to take action. Rundeck’s runbook automation enables any of your responders to execute diagnostic or repair procedures safely, just like your subject matter experts would, so that your incidents are shorter and require fewer escalations.

Out of the box, the integration between Rundeck and PagerDuty lets you:

  • Automatically trigger Rundeck Jobs at the start of a PagerDuty incident (e.g., starting diagnostics or trying repair actions even before the first responder logs in to PagerDuty)
  • Trigger Rundeck Jobs during an incident using custom actions in the PagerDuty web or mobile UI
  • Have your Rundeck Jobs automatically update incident notes/timelines in PagerDuty
  • Trigger a PagerDuty incident if a Rundeck Job fails
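
For a sense of what the last item involves, here is a minimal sketch in Python of a trigger event sent to the PagerDuty Events API v2 when a job fails. It does not show how the Rundeck integration is implemented. The function names, the dedup key format, the chosen severity, and the example job values are assumptions, and the routing key is a placeholder you would replace with the integration key for your own service.

```python
import json
import urllib.request

# Public endpoint of the PagerDuty Events API v2.
EVENTS_API_URL = "https://events.pagerduty.com/v2/enqueue"


def build_job_failure_event(routing_key, job_name, execution_id, source):
    """Build an Events API v2 'trigger' event for a failed job.

    The dedup_key format is an assumption for this sketch; it lets repeated
    reports of the same failed execution collapse into one incident.
    """
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "dedup_key": f"rundeck-job-{job_name}-{execution_id}",
        "payload": {
            "summary": f"Rundeck job '{job_name}' failed (execution {execution_id})",
            "source": source,
            "severity": "error",  # assumed; the API also accepts critical, warning, info
        },
    }


def send_event(event):
    """POST the event as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        EVENTS_API_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

A failure handler could call send_event(build_job_failure_event("YOUR_ROUTING_KEY", "restart-checkout", "42", "rundeck.example.com")), where every argument is a hypothetical value. Keeping payload construction separate from the network call makes the event easy to inspect before anything is sent.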

In this scenario, runbook automation helps the e-commerce business deploy its resources more efficiently, improve MTTR, and limit the revenue lost as a result of the outage.

Make Intelligent Response and Automation a Reality

These innovations lighten the burden on responders by reducing duplication of work and letting machines take over manual tasks. By giving teams the ability to do more with less, PagerDuty helps organizations improve company resilience and protect customer relationships and hard-earned revenue.

If your organization could benefit from any of these tools, be sure to check out our free trial or reach out to your account manager to set up a personalized demo.