As operational complexity accelerates, our customers are realizing that it’s impossible to manage their services or innovate for their business without a mechanism to make sense of that complexity.
That’s why our March product update focuses on Event Intelligence, which is all about turning chaotic monitoring data into actionable insights so that teams can work smarter and focus on the things that matter.
In this post, we’ll share all the enhancements we’ve made to our Event Intelligence product, as well as some best-practice tips on how you can start minimizing noise today.
Last June, we launched Event Intelligence to help organizations tackle existing event management problems with a modern approach. As a quick overview, here are the core pieces of value that Event Intelligence provides:
Unique context at your fingertips. It’s hard to make sense of all the events and alerts impacting your service. By intelligently grouping related alerts into a single incident, as well as surfacing information on how similar incidents were resolved in the past, Event Intelligence provides responders with all the context they need to quickly triage and resolve the problem at hand.
Powerful noise reduction. By suppressing and filtering non-actionable alerts, and grouping related ones, Event Intelligence prevents unnecessary wake-ups and distracting interruptions when responders are trying to fix an issue.
Automation to scale your team. We empower teams with a flexible and team-oriented model to manage all of their events and alerts. Rather than manually managing prioritization, assignment, routing, and filtering, teams can use automation rules, machine learning, and our powerful APIs to drive the perfect workflow for any situation.
With unique telemetry across machine signals and human response data, Event Intelligence shortens the path from signal to resolution, and helps teams improve their operational maturity. And with this month’s updates, this product has only gotten more powerful.
Intelligent Alert Grouping uses machine learning to automatically group incoming alerts into incidents, which keeps phones from blowing up with duplicate pages and lets responders focus on the problem at hand. We’ve tweaked the algorithm to better understand brand-new alerts and group them together more frequently, reducing even more noise. This change is automatically active for all Event Intelligence and Enterprise accounts.
Before enabling Intelligent Alert Grouping on a service, your team may want to see how its alerts would be grouped into incidents. Alert grouping previews show service owners the expected grouping behavior and potential noise reduction before they activate the feature on a particular service. Previews are now available to all Event Intelligence and Enterprise accounts and trials.
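To make the idea of a grouping preview concrete, here is a minimal sketch of how one might estimate noise reduction from a batch of historical alerts. It is not the Intelligent Alert Grouping algorithm: it swaps the machine learning model for a simple word-overlap (Jaccard) heuristic, and the names `Alert`, `Incident`, `similarity`, and `preview_grouping`, along with the threshold and time window, are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Alert:
    summary: str
    created_at: datetime


@dataclass
class Incident:
    alerts: list = field(default_factory=list)


def similarity(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase word sets of two summaries."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def preview_grouping(alerts, threshold=0.5, window=timedelta(minutes=10)):
    """Group alerts into incidents and return (incidents, noise_reduction).

    Illustrative heuristic only: an alert joins the first incident whose
    most recent alert is similar enough and recent enough; otherwise it
    opens a new incident.
    """
    incidents = []
    for alert in sorted(alerts, key=lambda a: a.created_at):
        for incident in incidents:
            latest = incident.alerts[-1]
            if (alert.created_at - latest.created_at <= window
                    and similarity(alert.summary, latest.summary) >= threshold):
                incident.alerts.append(alert)
                break
        else:
            # No existing incident matched, so this alert starts a new one.
            incidents.append(Incident(alerts=[alert]))
    # Fraction of pages avoided if each incident notifies only once.
    reduction = 1 - len(incidents) / len(alerts) if alerts else 0.0
    return incidents, reduction
```

Running this over a sample of past alerts for one service gives a rough sense of how many separate notifications grouping could collapse, which is the kind of question a preview is meant to answer before the feature is switched on.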
Several of the improvements we’ve made to Event Intelligence are in the area of event automation. We’ve added new capabilities to make it easier than ever for teams to manage and automate their monitoring events.
With the new Recurring Event Rules feature, admins can now handle planned maintenance or known downtime by scheduling the specific future times at which a rule will be active. This saves admins from manually creating and re-creating rules that need to apply on a recurring schedule.
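The logic behind a recurring rule can be sketched in a few lines. The example below is an assumption-laden illustration, not the product's implementation or API: `RecurringWindow` and `should_suppress` are hypothetical names, times are naive datetimes in a single time zone, and a window is assumed not to cross midnight.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class RecurringWindow:
    """A weekly recurring window (hypothetical model for illustration)."""
    weekdays: frozenset  # 0 = Monday ... 6 = Sunday, as in datetime.weekday()
    start: time          # inclusive
    end: time            # exclusive; assumed to be later than start

    def is_active(self, moment: datetime) -> bool:
        # Active only on the listed weekdays, between start and end.
        return (moment.weekday() in self.weekdays
                and self.start <= moment.time() < self.end)


def should_suppress(moment: datetime, windows) -> bool:
    """Return True if any recurring window covers the given moment."""
    return any(w.is_active(moment) for w in windows)


# Example: suppress events during a Saturday 02:00-04:00 maintenance window.
saturday_maintenance = RecurringWindow(
    weekdays=frozenset({5}), start=time(2, 0), end=time(4, 0)
)
```

Defining the schedule once and evaluating it against each incoming event's timestamp is what removes the need to re-create a one-off rule before every maintenance window.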
The Disable Event Rule feature also helps admins more easily configure event rules, especially in larger accounts, by allowing a rule to be temporarily disabled with one simple click. One-click rule copying is also now Generally Available.
Finally, the Global Event Rules API enables admins and developers to automate the management of their global event rules, including actions that are also available in the UI, such as disabling and copying an event rule. Check out the API documentation.
Using PagerDuty Event Intelligence, customers have been able to reduce hundreds or thousands of alerts into a handful of actionable incidents, saving time and frustration while mitigating business disruption.
Here’s a quick checklist of additional easy ways to reduce alert noise, and the PagerDuty capabilities that can help with that.
If you’re interested in the benefits of intelligent alert grouping, advanced event automation, similar incidents, and more, contact us here to trial Event Intelligence for free today.
Additionally, we regularly recap everything that’s new with product, integrations, and more in our platform release notes, so be sure to check them out. If you have any product feedback, we’d love to hear it! Shoot us a note at firstname.lastname@example.org or check out our Knowledge Base to learn more.