August 24, 2017
Operations teams are receiving more telemetry data from monitoring systems than ever before. But they are struggling to sift through this data to find what really matters – resulting in alert fatigue and missed alerts. For this reason, we’re proud to announce that long-time partner Event Enrichment HQ is joining the PagerDuty family to deliver the industry’s first integrated event management and incident resolution platform. Adding Event Enrichment HQ and its keystone product, the Event Enrichment Platform (EEP), to PagerDuty helps you quiet your noisy monitoring systems, reduce alert fatigue, and slash your incident resolution times.
As the toolset for monitoring apps and infrastructure grows, teams can create a highly specialized and finely tuned monitoring landscape. However, these systems can also create chaos – noise drowns out the important notifications, and on-call becomes even more stressful as teams are continually interrupted by notifications that aren’t important. Missed alerts can cause unexpected downtime, and burnt-out engineers may become less productive or leave entirely. It’s no secret that retaining top employees makes financial sense, with studies showing that replacing an engineer can cost 150% of their salary.
With EEP, Operations teams can intelligently control incoming monitoring notifications. They can apply custom classification rules to filter out non-urgent events, dramatically reducing the number of events their team has to manage. EEP customers have already reduced noisy incidents by up to 94%. Those same classification rules assign user-defined severities to the remaining events and route them, creating PagerDuty incidents and alerting the appropriate on-call teams through PagerDuty’s schedules and escalation rules. The suppressed events are stored for reporting and future analysis.
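To make the idea of classification rules concrete, here is a minimal Python sketch of the filter-and-route flow described above. It is not EEP's actual rule syntax or API: the `Rule` and `Pipeline` names, the event fields (`host`, `check`, `state`), and the choice to forward unmatched events with a default severity are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Rule:
    """Hypothetical classification rule: a predicate plus an outcome."""
    name: str
    matches: Callable[[dict], bool]
    severity: Optional[str]  # None means "suppress this event"


@dataclass
class Pipeline:
    """Applies rules in order; the first matching rule wins."""
    rules: List[Rule]
    default_severity: str = "warning"  # assumption: unmatched events are still forwarded
    suppressed: List[dict] = field(default_factory=list)

    def classify(self, event: dict) -> Optional[dict]:
        for rule in self.rules:
            if rule.matches(event):
                if rule.severity is None:
                    # Suppressed events are kept for reporting, not dropped.
                    self.suppressed.append(event)
                    return None
                return {**event, "severity": rule.severity, "rule": rule.name}
        return {**event, "severity": self.default_severity, "rule": None}


# Illustrative rules only; real rules would reflect your own environment.
RULES = [
    Rule("ignore-test-hosts",
         lambda e: e.get("host", "").startswith("test-"),
         None),
    Rule("db-down",
         lambda e: e.get("check") == "db_connect" and e.get("state") == "critical",
         "critical"),
]
```

In this sketch, an event that `classify` returns would go on to create an incident and page the on-call team, while an event that returns `None` stays in `suppressed` for later analysis.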
If a quieter event pipeline weren’t enough, PagerDuty can now tell on-call responders how to fix what’s wrong directly in incident notifications. Quick access to runbooks is a must for rapid incident resolution. Unfortunately, this information often lives in several places – knowledge bases, wikis, spreadsheets – which might be disorganized and downright hard to search. In the middle of a critical incident, it’s not uncommon for responders to have to wait for teammates who have the information, which can significantly extend resolution times. EEP gives PagerDuty the power to include remediation steps and other contextual information directly in the incident. Can you say “sliced resolution times”? Because we sure can. Tasty.
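As a rough illustration of what enrichment means in practice, the Python sketch below attaches remediation steps to an event before it becomes an incident. It does not show how EEP or PagerDuty implement this: the `RUNBOOKS` table, the `enrich` function, the `check` field, and the example URL are hypothetical.

```python
# Hypothetical lookup table: check name -> remediation context.
RUNBOOKS = {
    "db_connect": {
        "runbook_url": "https://wiki.example.com/runbooks/db-connect",
        "steps": [
            "Confirm the database process is running",
            "Check connection pool saturation",
            "Fail over to the replica if the primary is unreachable",
        ],
    },
}


def enrich(event: dict, runbooks: dict = RUNBOOKS) -> dict:
    """Return a copy of the event with remediation context attached, if known."""
    enriched = dict(event)
    entry = runbooks.get(event.get("check"))
    if entry is not None:
        enriched["remediation"] = entry
    return enriched
```

The point of the design is that the responder sees the steps in the notification itself, so nobody has to hunt through a wiki or wait for a teammate mid-incident.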
We’re launching the new integrated EEP + PagerDuty as a closed beta, starting today. To be included in the beta, sign up here to request access.