by Dave Bresci
December 5, 2018
At the latest PagerDuty Connect event in Toronto, DevOps expert Arthur Maltson shared a recent story about chaperoning his daughter’s school field trip to a firehouse, where the alarm sounded right when all the kids had just climbed onto the fire truck. The amazing thing? Even though they were surrounded by 20 noisy, excited kids, the firefighters knew exactly what to do and didn’t waste any time: the kids moved off the truck and stood against the walls as instructed, and the engine was gone in minutes.
That example alone shows how operations teams can learn from emergency responders because, to some extent, DevOps and IT Operations teams are tasked with doing the same thing—overcoming the noise (digital signals in this case) to respond to alerts at the real-time pace consumers expect. However, enterprises can’t hire fast enough to keep up with the ever-increasing flow of signals across an ever-increasing range of digital services that consumers rely on. So companies are now looking for ways to reduce the roar of digital noise—including alerts that need immediate action, duplicate alerts, and non-actionable noise (it’s akin to having a firehouse bell constantly ringing!)—so responders can effectively triage and prioritize across P1 incidents that impact the business.
We’re seeing a growing movement to control digital noise, not just in the world of firefighters or DevOps, but also in other public domains. For example, Jack White, lead singer of the White Stripes, and a growing number of celebrities also seem to have learned from emergency services and are finding new ways to manage digital signals. One way they’re doing that is by creating phone-free spaces.
You heard me right: They’re banning phone use in certain places and for certain events.
Jack White, Donald Glover, and Dave Chappelle are among the celebrities who have found creative solutions to dealing with digital noise, such as partnering with Yondr, a company that creates phone-free spaces. With Yondr, when ticket holders arrive at a venue, they are given a Yondr case and instructed to put their phones into it, lock it, and keep it in their pockets or purses. Attendees have their phones with them, but can’t take photos, videos, or selfies, or post on social media. The idea is that, with their phones locked away, people can focus on and enjoy the experience with fewer distractions. This desire to block or limit phone use has extended into many other spaces as well, including schools, courthouses, and wedding halls.
But while concert attendees can shut off their phones and lock them away with (usually) little or no consequences, PagerDuty knows that operations teams can’t just block out all signals—after all, those same concert attendees rely on their mobile devices every day for a number of different services. What operations teams can do, however, is limit the signals to only the alerts that matter and page only the people who need to act on those alerts. IT and DevOps teams who can shut off alerts when they aren’t on call (so they can sleep or enjoy the kids’ soccer games—“DadOps” as Maltson calls it) or limit the alerts to just the important ones are better positioned to quickly fix critical incidents without getting overburdened.
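To make that idea concrete, here is a minimal Python sketch of the filtering described above: drop non-actionable and duplicate alerts, then page only whoever is on call for the affected service. It is an illustration under stated assumptions, not how PagerDuty works internally. The `Alert` fields, the `alerts_to_page` function, and the `on_call` mapping are all hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    # Hypothetical alert shape, invented for this sketch.
    service: str
    summary: str
    actionable: bool


def alerts_to_page(alerts, on_call):
    """Group the alerts worth paging about by on-call responder.

    `on_call` maps a service name to the person currently on call for it.
    This toy model assumes every service has exactly one on-call responder.
    """
    seen = set()   # (service, summary) pairs already kept
    pages = {}     # responder -> list of alerts to page them about
    for alert in alerts:
        key = (alert.service, alert.summary)
        if not alert.actionable or key in seen:
            continue  # skip non-actionable noise and exact duplicates
        seen.add(key)
        pages.setdefault(on_call[alert.service], []).append(alert)
    return pages


alerts = [
    Alert("checkout", "high latency", True),
    Alert("checkout", "high latency", True),   # duplicate
    Alert("search", "disk 60% full", False),   # non-actionable
    Alert("search", "index down", True),
]
on_call = {"checkout": "dana", "search": "lee"}
pages = alerts_to_page(alerts, on_call)
```

In this example four incoming alerts shrink to two pages, one per responder, and anyone who is not on call hears nothing. Real systems need fuzzier matching than an exact `(service, summary)` key, which is where grouping based on learned behavior comes in.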
PagerDuty Event Intelligence helps organizations to do just that. Developed to help IT and DevOps teams cut through digital noise to fix what matters fast, this brand-new product uses machine learning algorithms that combine machine data with human context to automatically group related incidents, prioritize what’s actionable, surface remediation information to accelerate resolution, and adaptively learn based on real human behavior to continuously provide more accurate recommendations out-of-the-box.
The result? Digital noise is managed, the lives of operations teams are improved, and organizations benefit from the time saved—time that can be re-invested into employees, innovative projects, or other activities. Meanwhile, companies also benefit from increased employee retention and productivity, less business disruption and downtime, and faster time-to-market and innovation.
So we’ve talked a lot about what DevOps teams can learn from first responders and Jack White, but what about the best practices that the rest of an enterprise’s employees can learn from DevOps? Can this notion of reducing noise, responding in real time, working healthier, and focusing on the signals that count help other teams, too?
Check out our blogs on how Marketing, Customer Support, Security teams, and more can adopt DevOps best practices and join the growing movement of people who are taking real-time action on what’s important—and at the right time rather than being on-call all the time—and elevating work to the outcomes that matter.