This is a guest post by Ilan Rabinovitch, Director of Product Management at Datadog.
August 24, 2017
Many organizations depend on PagerDuty, the hub that centralizes DevOps and IT Operations tool data, to notify them whenever any component of their IT infrastructure behaves unexpectedly. If you’ve used PagerDuty before, you’ve likely had to deal with multiple incidents related to a single problem, and been notified for each one. This typically happens when redundant monitoring systems are configured, or when a single point of failure or degradation sets off a domino effect in which multiple tools fire alerts at the same time.
To address this, we’ve made significant changes to our data model by redefining an alert within PagerDuty as an object that tracks the state reported by a monitoring tool. Alerts are the foundation for two new capabilities: Alert Triage and Suppression.
With the new Alert Triage capability, you can group related alerts into a single incident object that enables true end-to-end incident management. Responders no longer get paged on individual, siloed symptoms. Instead, resolution workflows are centered on an incident object that represents a real, service-impacting problem or outage. This changes how customers triage and interact with the data from their systems, helping to reduce noise, improve cross-functional collaboration, and drive down resolution times.
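The relationship described above can be pictured with a minimal sketch. This is not PagerDuty's actual data model or API; the `Alert`, `Incident`, and `AlertState` names, their fields, and the sample monitoring sources are all hypothetical and chosen only to illustrate the idea that each alert tracks the state of one monitoring tool signal while a single incident groups the related alerts.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class AlertState(Enum):
    # Hypothetical states mirroring what a monitoring tool reports.
    TRIGGERED = "triggered"
    RESOLVED = "resolved"


@dataclass
class Alert:
    """One signal from one monitoring tool (illustrative model only)."""
    source: str
    summary: str
    state: AlertState = AlertState.TRIGGERED


@dataclass
class Incident:
    """The single object responders work on; it groups related alerts."""
    title: str
    alerts: List[Alert] = field(default_factory=list)

    def open_alerts(self) -> List[Alert]:
        # Alerts whose monitoring tool still reports a problem.
        return [a for a in self.alerts if a.state is AlertState.TRIGGERED]


# Two tools report symptoms of the same underlying problem.
incident = Incident(title="Checkout API degraded")
incident.alerts.append(Alert("datadog", "p99 latency above threshold"))
incident.alerts.append(Alert("nagios", "HTTP check failing on web-03"))

# One tool later reports recovery; the alert tracks that state change.
incident.alerts[1].state = AlertState.RESOLVED

print(len(incident.alerts), len(incident.open_alerts()))
```

Responders here deal with one incident while still being able to see which underlying signals remain open.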
Alerts are automatically enabled on new PagerDuty services, so you can begin using the new Alert Triage features immediately. For existing services where it makes sense to have this configured, click “Edit Service” and toggle on the option to “Create alerts & incidents”.
When a service is configured to “Create alerts & incidents”, every actionable alert creates a parent incident. To consolidate related alerts into a single incident, select two or more incidents on the incident list, press Merge, and choose the incident that the others should be merged into.
When you’re merging multiple incidents together, you can easily edit the incident summary to accurately reflect the issue in question, so responders can quickly get up to speed.
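The merge workflow can be sketched in a few lines of Python. This is an in-memory illustration under stated assumptions, not PagerDuty's implementation: the `Incident` class, the `merge_incidents` function, the `merged_into` field, and the sample IDs are all hypothetical, and how source incidents are treated after a merge is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Incident:
    # Hypothetical, simplified incident: an ID, a summary, and alert labels.
    id: str
    summary: str
    alerts: List[str] = field(default_factory=list)
    merged_into: Optional[str] = None  # assumption: sources point at the target


def merge_incidents(selected: List[Incident], target_id: str,
                    new_summary: Optional[str] = None) -> Incident:
    """Move the alerts of every selected incident into the chosen target."""
    if len(selected) < 2:
        raise ValueError("select two or more incidents to merge")
    matches = [inc for inc in selected if inc.id == target_id]
    if not matches:
        raise ValueError("the target must be one of the selected incidents")
    target = matches[0]

    for inc in selected:
        if inc is target:
            continue
        target.alerts.extend(inc.alerts)  # alerts now belong to the target
        inc.alerts = []
        inc.merged_into = target.id

    # Optionally rewrite the summary so it describes the real problem.
    if new_summary:
        target.summary = new_summary
    return target


a = Incident("INC-1", "High latency on checkout", ["latency alert"])
b = Incident("INC-2", "HTTP check failing", ["http alert"])
c = Incident("INC-3", "Queue depth growing", ["queue alert"])

merged = merge_incidents([a, b, c], target_id="INC-1",
                         new_summary="Checkout service outage")
print(merged.summary, merged.alerts)
```

Requiring the target to be one of the selected incidents mirrors the flow described above, where you pick the destination from the incidents you selected.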
There are many great benefits of Alert Triage when it comes to enabling a more seamless incident resolution workflow.
The use of alerts and the new Alert Triage capability is a critical building block for unlocking enhanced value within PagerDuty and is available to all customers at no additional cost. We highly encourage you to learn more by reading the following support articles:
Don’t hesitate to reach out to firstname.lastname@example.org with any questions or feedback; we look forward to hearing from you. We hope that with Alert Triage, you and your teams enjoy the benefits of optimized incident response.