Generative AI for the PagerDuty Operations Cloud
When it comes to keeping your business’s lights on, you need to manage and orchestrate your operational activities, prioritize high-impact and urgent work, and maintain day-to-day precision. Trust is paramount during mission-critical, time-sensitive crisis response, and the narrow margin for error leaves little tolerance for generative AI hallucinations or false positives.
This is why our roadmap has always focused on innovation designed to make your job easier: innovation with a purpose. At PagerDuty, we have been working with AI and machine learning for years, and have become the industry leader in AIOps. And it’s through that lens that we’ve evaluated GenAI–not for its own sake, but by asking ourselves how it could unlock more value across the PagerDuty Operations Cloud.
From code co-pilots to incident response assistants, generative AI represents a tremendous opportunity. The ease and elegance of engaging with generative AI–its fundamental intuitiveness through a natural language interface–creates a step-function opportunity to unlock the full potential of automation. There’s no question that automation can save time and money while increasing productivity and capacity, but automation initiatives can die under the weight of their own abstraction.
GenAI brings a consumer-style simplicity to enterprise-grade automation and makes the realization of automation’s potential much more real. The pace of software development will only accelerate, and more software means more complexity–which makes DevOps more important than ever.
Today, I’m excited to share the first three generative AI-supported capabilities PagerDuty is bringing to the PagerDuty Operations Cloud:
AI Generated Status Updates
When unplanned, interrupt work strikes, communication and coordination are essential to resolution. Industry best practices recommend status updates to stakeholders and leadership at least every 30 minutes, so that the business responds with one voice. But crafting those updates takes time and carries its own cognitive load at a moment when your teams are already at surge capacity. We have customers who tell us that during major incidents they have three people dedicated solely to status updates.
This was a perfect place for us to kick off a generative AI deployment. With generative AI integrated into our Status Update feature, teams can save cycles on what to say and to whom–they can generate persona-based status update drafts with just a few clicks. The new capability leverages AI to process all data related to the current incident and auto-generate a summary, offering key insights on events, progress and challenges. This feature enhances incident management workflows and streamlines communication in addition to saving time, allowing your team to focus on the real work of resolution.
AI Generated Incident Postmortems
Postmortems are a staple of operational excellence and a best practice often driven by site reliability engineering (SRE)–it’s how you learn what went wrong, where you could improve, and most importantly, how to avoid making the same mistakes again and again.
Taking the time to document postmortems, however, can be challenging. It’s a drawn-out, manual (and occasionally emotional) process to collect all the relevant data points for review as a group.
But imagine you had a virtual set of team members shadowing the incident from start to finish, a team whose only job is to create a timely and unbiased draft of your postmortem report. That’s exactly what we can give you by applying generative AI to automate the generation of comprehensive post-incident draft reports.
As you’ll see in the video, once an incident is resolved, the user can elect to generate a postmortem review. This triggers the real-time collection of all available data about the incident (including logs, metrics, and relevant Slack or Microsoft Teams conversations), a step that is otherwise time-consuming to do by hand. PagerDuty then produces a detailed report that highlights key findings, root causes and areas of improvement. Additionally, PagerDuty generates a list of recommended action items tailored to prevent similar issues from occurring in the future.
Not only will this feature save time, but it will also provide a starting point for capturing crucial learnings, fostering a culture of continuous improvement and giving the team more time for future-proofing. That brings us back to why a human-in-the-loop approach is critical to unlocking the power of generative AI when you’re talking about mission-critical work.
Like the Status Update example above, automated incident postmortems require a person to provide expertise, judgment and oversight, validating and refining the report before releasing it for broader consumption.
AI Generated Process Automation
We’ve been using automation across the PagerDuty Operations Cloud platform since its inception, partnering with many of you to provide scripts and plugins to automate workflows that help you manage and resolve unplanned work more quickly. Our customers use us every day–whether in the cloud or on prem–for infrastructure automation as well as driving Ansible, Terraform and Power Automate. But if the scripts and tools don’t already exist, you have to do the heavy lifting yourself to actually code the script.
No longer. With generative AI, we’ve built a co-author for your automation needs. It’s like having an extra developer on your team whom you can task with researching how to do what you want and then creating the automation for you. And best of all, it’ll do it in your favorite scripting language, or easily transition from one language to another, so you ultimately have full control. We’re bringing low-code capabilities to what used to be a high-code experience, without losing any power or flexibility. For example, you can simply state, “Write an automated workflow that adds a specific user to a group within Okta. I should be able to specify the user by email and group at runtime,” hit the generate button, and watch the magic happen.
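To make the Okta prompt above concrete, here is a minimal sketch of the kind of script such a request might yield. It is an illustration only, not output from the PagerDuty feature. The environment variable names (OKTA_ORG_URL, OKTA_API_TOKEN) and the function names are hypothetical, and the Okta endpoints and the "SSWS" token header reflect our understanding of Okta's public REST API, so check them against Okta's documentation before relying on them.

```python
import json
import os
import urllib.parse
import urllib.request


def okta_request(method, path, params=None):
    """Send one request to the Okta API; return parsed JSON, or None if the body is empty."""
    base = os.environ["OKTA_ORG_URL"].rstrip("/")  # hypothetical variable, e.g. https://example.okta.com
    token = os.environ["OKTA_API_TOKEN"]           # hypothetical variable holding an API token
    url = base + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    req = urllib.request.Request(
        url,
        method=method,
        headers={"Authorization": f"SSWS {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
    return json.loads(body) if body else None


def add_user_to_group(email, group_name, request=okta_request):
    """Look up a user by email and a group by exact name, then add the user to the group.

    `request` is injectable so the logic can be exercised without calling Okta.
    """
    # Assumed endpoint: list users filtered on the profile email.
    users = request("GET", "/api/v1/users", {"filter": f'profile.email eq "{email}"'})
    if len(users) != 1:
        raise LookupError(f"expected exactly one user for {email}, found {len(users)}")

    # Assumed endpoint: group search by name; keep only exact name matches.
    groups = request("GET", "/api/v1/groups", {"q": group_name})
    matches = [g for g in groups if g["profile"]["name"] == group_name]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one group named {group_name}, found {len(matches)}")

    user_id, group_id = users[0]["id"], matches[0]["id"]
    # Assumed endpoint: add the user to the group.
    request("PUT", f"/api/v1/groups/{group_id}/users/{user_id}")
    return user_id, group_id
```

With both environment variables set, a runtime call would look like add_user_to_group("jane@example.com", "Engineering"). Refusing to proceed on zero or multiple matches keeps a person in the loop: an ambiguous email or group name stops the workflow for review rather than guessing.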
We are early in a journey where generative AI will accelerate learning, eliminate toil and increase our productivity, while freeing us up for more creativity.
With any new technology, there is risk. Managing that risk successfully is in our DNA, as our customers have seen each time we have introduced AI, machine learning and automation capabilities across the platform over the years. It’s why “human in the loop” is a central tenet of our AI work. And it’s why we’re moving quickly while keeping fidelity, security and accuracy in mind as we build.
We want your feedback and input–as the possibilities are endless. What matters to you the most? Join the waitlist for these features, and sign up to get development updates. We expect to begin releasing these capabilities over the coming months.