Lessons from the June 12 Outage: Your Operations Are Only as Reliable as Your Incident Management Platform

By Cristina Dias | June 26, 2025 | 5 min read

As digital operations grow more complex, resilience is no longer optional; it’s essential. The next major outage isn’t a question of if, but when. And when it hits, the gap between true enterprise platforms and brittle point tools will become impossible to ignore.

During the June 12 global digital disruption, many of us saw core services collapse and redundancy plans spring into action to maintain business continuity. Even some incident management tools went dark, crashing alongside the very systems they were built to protect. This put a spotlight on the criticality of platform reliability: what happens if the platform you rely on to coordinate and mobilize time-sensitive, mission-critical operations goes down with the ship? Do you have a backup plan? And what does that cost your business?

PagerDuty didn’t just stay online; we led from the front. Our platform remained rock-solid, handling a 172% surge in incident volume and a staggering 433% spike in notifications without missing a beat.

This wasn’t just another outage. It was a stress test, one that drew a hard line between industry leaders and everyone else. We’ve been through this before. And we stayed reliable, just like we always do.

When reliability claims meet reality

If you can’t trust your incident management platform to stay online, what exactly are you even paying for? 

The June 12 outage proved that in digital operations, impact isn’t just about how long an incident lasts; it’s about how fast and effectively you respond. According to PagerDuty platform data, organizations that were more operationally mature recovered more quickly and experienced 27% less business impact than their peers.

PagerDuty data found that operationally mature customers responded to the outage both more quickly and more efficiently, with mean time to acknowledge (MTTA) up to 31% faster. By leaning more heavily into the PagerDuty Operations Cloud, teams were able to get to resolution more than 52% faster than their peers, empowering them to quickly return to their normal course of work. This translates to millions in potential savings from just one incident.

And though downtime may be measured in minutes, its consequences are measured in trust and long-term damage alongside revenue. The average cost of IT downtime keeps rising. But the cost of choosing the wrong incident management platform? That’s immeasurable, with a direct connection to your ability to serve your customers. The difference between market leaders and laggards often comes down to one thing: reliability when everything’s on the line.

Because when your incident management tool goes down with everything else, how do you coordinate a response? When your “reliable” platform fails, how do you maintain customer trust? These aren’t hypothetical questions anymore. They’re this month’s headlines. And the platforms promising always-on operations went dark exactly when their customers needed them most.

Reliability by design, not by chance

For over a decade, we’ve taken a fundamentally different approach to reliability. While others chase features and shiny interfaces, we’ve invested in the foundation that keeps your business running, even when the internet breaks.

During the peak of the June 12 crisis, PagerDuty delivered a median notification delivery time of just 12 seconds while handling a 172% surge in incident volume and a 433% spike in notifications. That kind of performance isn’t luck; it’s PagerDuty architecture:

  • Zero scheduled maintenance windows.
  • 99.9% web availability and notification delivery SLAs.
  • Battle-tested infrastructure that handles notification spikes without flinching.
  • Multi-channel communication options that keep working when others may be down.
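
To put the 99.9% figure in concrete terms, here is a minimal sketch that converts an availability target into the downtime it would allow over a period. The function name `downtime_budget_minutes` and the 30-day and 365-day windows are assumptions chosen for illustration; how availability is actually measured, and what counts against it, is defined by the SLA itself rather than by this arithmetic.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float = 30.0) -> float:
    """Return the minutes of downtime an availability target allows in a period.

    Hypothetical helper for illustration only: the measurement window and any
    exclusions in a real SLA are set by the provider's agreement.
    """
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    # The unavailable fraction is whatever the target leaves over.
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Assumed 30-day month and 365-day year.
    print(round(downtime_budget_minutes(99.9, 30), 1))   # 43.2 minutes
    print(round(downtime_budget_minutes(99.9, 365), 1))  # 525.6 minutes
```

Under these assumptions, a 99.9% target leaves roughly 43 minutes of allowable downtime in a 30-day month, or a little under nine hours across a year.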

And it’s not just the infrastructure. That reliability powers everything built on top, especially our automation and intelligence. Even during major outages, PagerDuty continues to operate at full strength because it’s designed to: 

  • Continuously process millions of signals to maintain operational awareness.
  • Adapt response patterns based on past incidents and know when to involve humans.
  • Orchestrate responses, route alerts, and manage workflows, even when some other tools can’t.

When reliability, intelligence, and automation work as one, you get more than peace of mind; you get performance you can count on.

Choose proven performance over empty promises

When it comes to time-sensitive, mission-critical operations, you can’t afford an incident management platform that collapses under pressure. When things go sideways, your platform should be rock-solid during the storm, not another system to worry about.

PagerDuty is built for moments like this. And we’re not standing still. Earlier this year, we expanded our platform capabilities and rolled out advanced features across all paid plans, so every customer gets access to enterprise-grade incident management that nearly two-thirds of the Fortune 100 rely on.

Here’s what you get with PagerDuty:

  • Industry-leading platform reliability with 99.9% web availability and notification delivery SLAs.
  • 15+ years of enterprise experience solving real operational challenges.
  • A platform that pairs true reliability with AI-powered automation and human-centric workflows.
  • Modern end-to-end incident management with External Status Pages, Post-Incident Reviews, and a chat-first experience.
  • An open ecosystem with 700+ integrations and stable, trusted APIs.
  • A platform built to work where you work, whether that’s Slack, Microsoft Teams, or our web interface.
  • Automation that handles common incidents without human intervention.

Want to see what real enterprise reliability looks like? Take this tour to learn more. We’ll help you prepare for what’s next and turn every challenge into an opportunity to get stronger. Start a free trial today.