| In DevOps, PagerDuty Life, Tech Talk

On June 28th, 2017, we marked four years of performing “Failure Fridays” at PagerDuty. As a quick recap, Failure Fridays are a weekly practice at PagerDuty in which we inject faults into our production environment in a controlled way, without customer impact. They’ve been foundational to verifying our resiliency engineering efforts. Over […]

| In DevOps, PagerDuty Life, Reliability

“Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production.” (Principles of Chaos Engineering)

Netflix, Dropbox, and Twilio are all examples of companies that perform this kind of engineering. It’s essential to have confidence in large, robust, distributed […]