Preparedness as a Competitive Advantage: Building Resilience Year Round
The recent global IT outage is a stark reminder that even the most advanced organizations can have bad days. Major disruptions can have significant downstream...
5 min read
One of the first key tenets of cloud computing was that “you own your own availability”, the idea being that the public cloud providers were...
Software is not perfect. And ultimately, it’s not a matter of if you will have an outage, but of when. With the increasing complexity and...
4 min read
Incidents can happen anywhere at any time. They can be small, well-defined, and easily contained. They can be large, messy, and complex, like the major...
5 min read
While they are untimely, stressful, and likely to highlight communication breakdowns within an organization, incidents can be a powerful tool for learning and growth in...
4 min read
Incidents impacting your customer and user-facing services can be stressful, both for the responders on your team who are working on a resolution, and for...
Mitigating business risk is a key enterprise priority. To avoid unnecessary exposure to the business, technical teams need a proactive approach to managing incidents. While...
4 min read
Organizations looking to win the market and drive great customer experiences need to deliver on the promise of exceptional service, meaning fewer interruptions and faster...
6 min read
As we reach the end of our blog series on the occurrences in 2023 from the fourth installment of our blog series, Restore: Repair vs....
6 min read
We live in an always-on world, where things move fast and break often. Building stronger resilience is critical for operational efficiency and delivering great customer...
5 min read
Improving Beyond MTTR
We’ve posted a bit about the ambiguity around MTTR before, but we want to get deeper into the confusion and maybe false...
8 min read
Data has become the lifeblood of businesses, empowering organizations to make more informed decisions, drive innovation, and gain a competitive edge. McKinsey touts the benefits...
7 min read
The more that automation can remove toil and take care of rote tasks in incident response, the faster teams can focus on problem identification and...
In order to respond in real-time to urgent, critical digital incidents, on-call responders must be able to take action from anywhere. But when on-call responders...
If there’s one essential thing we’ve learned from being in the business of digital operations for more than 13 years, it’s that every business has...
Co-authored by Chris Bonnell, PagerDuty Data Scientist VI. Hello and welcome to the fourth post in our EI Architecture series focusing on Intelligent Alert Grouping....
6 min read
In The Hitchhiker’s Guide to the Galaxy, a group of scientist mice built a mega-computer named “Deep Thought” to answer “The Ultimate Question of Life,...
5 min read
For many of our customers, reducing alert noise is a difficult, yet rewarding task. Cleaning up your alerting means fewer late-night pages and happier...