PagerDuty Blog

PagerDuty: We Are Always On

With the rapid spread of COVID-19, many companies are shifting to an entirely remote workforce. During this time, being online and available to customers, vendors, and partners is more important than ever for businesses. At PagerDuty, while our primary focus is on the health and safety of our employees, their families, and the broader communities we are part of, one of our other top priorities is our commitment to our customers, especially in difficult times like these.

As you may have heard, all of our employees globally are now working remotely and have stopped all business travel. This may be the new normal for now, but working remotely is not new for us—we were built for this scenario from the very beginning.

Our employees are highly distributed around the world and are used to developing for and operating our platform in a distributed and remote environment. That means that despite this change, we’re able to keep PagerDuty up and running so our customers’ digital businesses stay up and running 24/7.

Our Commitment to Our Customers

As the market leader in digital operations management, we provide the largest, most reliable, and most resilient platform in our space. Our customers rely on us to help them orchestrate appropriate responses in real time when their systems are having trouble, at any time, day or night. How?

Just as our team members are distributed, so is our platform architecture. We are deployed across geographically separate cloud regions, which comprise multiple physical data centers. Our architecture expects surges in traffic from our customers. Unlike the hospitality or e-commerce industries, for example, we don’t have the benefit of seasonality to make our traffic patterns predictable. To handle unexpected increases in traffic volume from our 12,700+ customers, we are prepared to scale dynamically as needed.

We are known for practicing reliability and resilience through chaos engineering in our “Failure Fridays” series. Over time, we have built up our confidence in simulating failure scenarios to the point where we now have Failure Anydays. Yes, at any time of any day, one (or more) of our teams may be injecting controlled failure tests to quickly identify and mitigate issues that could impact the quality of service we offer you. This investment in learning from failure is not new for us; we’ve been sharing our process and practices with you since 2013. We are confident that we have the right platform architecture, best practices, and team in place, and that team will continue to work hard and diligently to uphold our commitments to our customers.

Speaking of downtime, we don’t have scheduled downtime at PagerDuty. Your clock never stops, so why should ours? Our Service Level Agreements (SLAs) cover both the availability and the performance of our offerings to our customers, again with no time carved out for scheduled maintenance downtime of any kind. We have built redundancy into our platform so that customers will receive notifications within the set delivery period when issues arise.

Many companies are making or have made the shift to remote work, and now, more than ever, it’s vital that their digital businesses are up and running. And PagerDuty is here to help.