Let’s Talk AIOps: Part 2: Things to Think About & the PagerDuty Approach
This is the second in a two-part blog series about AIOps where I sit down with Julian Dunn, Director of Product Marketing at PagerDuty, to level-set on the hot DevOps topic.
The first post discussed whether AIOps was just marketing fluff and whether ITOps actually has an AIOps problem. Let’s continue…
Q: If I were in the market to buy an AIOps solution, what should I think about?
JD: I think it’s critical to back up and ask: What are you trying to solve? In other words, customers should really spend the time thinking about the problems they’re trying to solve and consider how the solution would work for their specific situation: their organization, their resources, their infrastructure. Ask the questions around what’s actually required for deployment and what’s realistically able to make a meaningful impact in the short term.
Additionally, as sci-fi author and science writer Arthur C. Clarke said, “Any sufficiently advanced technology is indistinguishable from magic.” This is especially true with a term like “AIOps” or “AI,” which often gets filed under “it’s cool technology,” along with the perception that this “magical” thing I don’t really understand is going to solve all my problems. Consider ELIZA, the “AI psychotherapist” from the 1960s. It was certainly seen as magic at the time, yet under the hood it was just a bunch of conditional logic statements. We risk the same thing happening today with AIOps.
That’s a big risk to customers, and they should watch out for that when evaluating tools in this category. As I said earlier, it’s important to remember that AIOps solutions are mathematical models that require training over time via data aggregation for them to be useful. Beware of vendors that won’t level with you and tell you that—and tell you how to structure your data and organization to be successful.
Q: What is PagerDuty’s approach to AIOps?
JD: When we think about AIOps both internally and externally, we think about it having four key criteria:
- It’s easy to get started. People looking to buy AIOps want something that will start alleviating their problems now, not five years from now. That means it shouldn’t require lengthy professional services engagements, and it should deliver immediate value.
- Decentralized configuration. With the way that ops teams are laid out at many organizations today, there is a true risk that efficacy and scale fall flat if the solution requires a central authority to train and release models or to make configuration changes. To truly help your teams, the solution needs to be configured for how your teams work—that’s the only way models will learn and adapt to be most impactful for your users.
- No data science required. An AIOps solution should allow teams to realize value from data without needing to hire data scientists or change the way teams already work.
- Bridges multiple operating models. Any solution must be bought with both centralized teams and decentralized teams in mind, without forcing either group into unfamiliar tools. Otherwise, it’ll just be more IT bloat and the true potential of AIOps won’t be met.
Q: What is PagerDuty’s answer to AIOps? Where is it going?
JD: At the core of our offering for AIOps is Event Intelligence, where we focus on combining two types of data.
The first is the data coming in from all of the different monitoring solutions an organization is using. The second is human behavioral data, which in PagerDuty’s world is how teams act and respond to real events in their systems, including what they do at what time and how they manually group or associate alerts into incidents. We run algorithms on the monitoring data and combine it with the behavioral data.
On top of that, we’re actively applying our machine-learning capabilities to help with noise reduction through deduplication and suppression, time-based and intelligent alert grouping, and false positive reduction. Our AIOps helps identify and surface related incidents and past similar incidents, as well as what action was taken last time to resolve them, so that teams handling the current incident can better understand the full context, leading to faster resolution.
Finally, integrations with Rundeck, Ayehu, Amazon EventBridge, and Pliant.io help bring things like auto-remediation to life.
This is all available to our customers today and reflects the philosophy behind where we’re investing our product and development efforts. We want to democratize intelligence and make sure our machine-learning capabilities are as impactful as possible—and that’s by helping individuals on the front lines get the right context, at the right time so they can ultimately fix critical issues faster and have a better experience doing so.
As for the vision for the future, stay tuned: you’ll see more of this at Summit!
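To make two of the noise-reduction ideas Julian mentions a bit more concrete, here is a minimal Python sketch of deduplication and time-based alert grouping. It is an illustration only and not how Event Intelligence is implemented: the `Alert` fields, the `deduplicate` and `group_by_time` helpers, and the five-minute window are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    # All fields are hypothetical, chosen only for this illustration.
    dedup_key: str    # identifies "the same problem" reported more than once
    service: str      # the service the alert belongs to
    timestamp: float  # seconds since some reference point


def deduplicate(alerts):
    """Keep only the earliest alert for each dedup_key."""
    seen = set()
    unique = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        if alert.dedup_key not in seen:
            seen.add(alert.dedup_key)
            unique.append(alert)
    return unique


def group_by_time(alerts, window_seconds=300.0):
    """Group alerts on the same service that arrive close together.

    An alert joins the service's open group if it arrives within
    window_seconds of the last alert in that group; otherwise it
    starts a new group (a candidate incident).
    """
    incidents = []
    open_group = {}  # service -> most recent group for that service
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        current = open_group.get(alert.service)
        if current and alert.timestamp - current[-1].timestamp <= window_seconds:
            current.append(alert)
        else:
            current = [alert]
            open_group[alert.service] = current
            incidents.append(current)
    return incidents


if __name__ == "__main__":
    raw = [
        Alert("cpu-high-web1", "web", 0),
        Alert("cpu-high-web1", "web", 30),    # duplicate of the first alert
        Alert("disk-full-web1", "web", 120),
        Alert("db-latency", "db", 130),
        Alert("cpu-high-web2", "web", 1000),  # outside the 5-minute window
    ]
    for group in group_by_time(deduplicate(raw)):
        print([a.dedup_key for a in group])
```

A fixed window like this is the simplest possible rule. The “intelligent” grouping described in the interview also draws on how responders have grouped alerts in the past, which a sketch this small does not attempt to model.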
To further help reduce the noise around AIOps, Julian and I have collaborated to put together a webinar featuring our SVP of Product and Product Marketing, Jonathan Rende, addressing this very topic: “AIOps Explained: What It Is and How It Can Boost Real-Time Operations.” You can watch the on-demand recording on your own time to get a taste for how we at PagerDuty navigate the many different philosophies around what AIOps can be and how we arrived at our perspective on the topic. In the webinar, Jonathan also shares ways to evaluate technologies that promise AIOps so you can decide where they fit in your broader strategy.
This is just the tip of the iceberg for what PagerDuty has to offer when helping our customers leverage intelligence and automation to alleviate real-time work so teams can focus their time innovating instead of fighting fires. We also have some big product announcements coming at our annual conference, PagerDuty Summit, in just a few weeks, so if you haven’t already, register today for free to save your spot!