What the NFL Taught Us About Human and AI Coordination to Build Resilient Operations
As the world turned its attention to Super Bowl LX, PagerDuty joined Amazon Web Services (AWS) and the National Football League (NFL) for a timely...
5 min read
One key takeaway from AWS re:Invent 2025 was that a clear gap has emerged between teams still experimenting with AI and those seeing measurable value...
9 min read
Today’s higher education institutions operate complex digital ecosystems that were unimaginable a decade ago. Behind every college lies a portal of interconnected systems for registration,...
3 min read
We didn’t try to build a clever agent. We built one that shows up pre‑armed. The lesson arrived earlier this year, as we began developing...
Modern systems generate enormous volumes of operational data. Yet, most incident workflows still treat every outage like a one‑off fire drill: an alert fires, responders...
4 min read
We are on the ground with AWS and announcing innovations that give customers more powerful AI agents for incident management. These new and improved integrations...
4 min read
Even the best site reliability engineers (SREs) spend too much time doing reactive work—triaging incidents, gathering context, escalating to the right teams, and documenting what...
6 min read
The energy at Microsoft Ignite this year was electric. AI was everywhere, and the possibilities are limitless. As developers and operations teams explore what AI...
Most operations teams are stuck in a reactive loop: resolving incidents as they happen, then moving on to fight the next fire. This approach keeps...
4 min read
Having just returned from the 2025 EDUCAUSE Annual Conference in Nashville, I want to share some insights on the future of campus IT from the...
4 min read
The holidays amplify an inherent risk to businesses: lighter staffing, heavier traffic, and zero appetite for surprises. In addition to locking in your coverage crew...
5 min read
Across Europe, the cautious optimism business leaders held towards AI agents has evolved into more widespread enthusiasm. What was once a curiosity is now core...
5 min read
The AI SRE landscape has exploded over the past year, with vendors racing to add artificial intelligence capabilities to their platforms. For engineering leaders evaluating...
6 min read
Modern operations happen in Slack, where teams spend their days collaborating, troubleshooting, and resolving incidents. And while many incident management tools offer Slack-friendly experiences, they...
5 min read
An effective post-mortem can turn a security breach into a blueprint for lasting resilience. But too often, in the stress of an incident, documenting what...
5 min read
If you feel like your incidents are multiplying while your stack gets more complex by the week, you’re not alone. Event volumes keep climbing, signals...
7 min read
The engineer you pay $200,000 a year just spent an hour copy-pasting data between dashboards. Again. Software engineers have critical skills that are in the...
5 min read
The best way to minimize the impact of an incident is to catch it early, before small issues snowball into major disruptions. That requires maintaining...
4 min read