Continuous learning
How to learn from past incidents
Good vs better vs best practices for learning from past incidents in PagerDuty.
When Does This Matter/Problem Scenario
Every incident is an opportunity to learn and to improve systems, processes, or infrastructure.
Why You Should Care
Learning from past incidents helps teams identify root causes, strengthen processes, and prevent repeat failures, improving reliability over time. It also builds a culture of continuous improvement and shared knowledge, so everyone responds faster and smarter when the next incident occurs.
PagerDuty Practices
There are several approaches to learning from past incidents, ranging from simple documentation to comprehensive post-incident reviews and AI-assisted learning.
Description of Practices
Good
Add resolution notes to incidents so that future responders can see which actions resolved a similar incident in the past.
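Notes can also be added programmatically, for example from a runbook script. The following is a minimal sketch, not an official client. It assumes the PagerDuty REST API v2 exposes POST /incidents/{id}/notes with a "note" object, a token-based Authorization header, and a From header carrying the acting user's email; verify these against the current API reference before relying on them. The function name build_note_request and all sample values are hypothetical. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

# Assumed REST API v2 base URL; confirm against the API reference.
API_BASE = "https://api.pagerduty.com"


def build_note_request(incident_id, note_text, api_token, from_email):
    """Build (but do not send) a request that adds a note to an incident.

    Hypothetical helper: the endpoint path, body shape, and header names
    reflect an assumed API contract and should be checked before use.
    """
    body = json.dumps({"note": {"content": note_text}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/incidents/{incident_id}/notes",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Token token={api_token}",
            "Content-Type": "application/json",
            # Email of the user the note is attributed to (assumed requirement).
            "From": from_email,
        },
    )


if __name__ == "__main__":
    # Placeholder values for illustration only.
    req = build_note_request(
        "PABC123",
        "Restarted the worker pool; root cause was an exhausted connection pool.",
        "example-token",
        "responder@example.com",
    )
    print(req.get_method(), req.full_url)
```

To send the request, pass the returned object to urllib.request.urlopen and handle HTTP errors as appropriate for your environment.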
Better
Use PagerDuty Analytics (and/or PagerDuty’s Insights Agent) to review the past week's incidents as input for weekly on-call handoff reviews. Use incident workflows to log post-incident action items in a ticketing system.
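To illustrate the kind of summary a weekly handoff review might start from, here is a minimal sketch that groups a week's incidents by service and computes the incident count and mean time to resolve. It does not use PagerDuty Analytics itself. The record fields (service, created_at, resolved_at), the function name summarize_week, and the sample data are all hypothetical and do not represent any PagerDuty export schema.

```python
from collections import defaultdict
from datetime import datetime


def summarize_week(incidents):
    """Group incidents by service; return count and mean minutes to resolve.

    Each incident is a dict with hypothetical keys 'service', 'created_at',
    and 'resolved_at' (ISO 8601 timestamps with an explicit UTC offset).
    """
    durations = defaultdict(list)
    for inc in incidents:
        created = datetime.fromisoformat(inc["created_at"])
        resolved = datetime.fromisoformat(inc["resolved_at"])
        durations[inc["service"]].append((resolved - created).total_seconds() / 60)
    return {
        service: {
            "count": len(minutes),
            "mean_minutes_to_resolve": sum(minutes) / len(minutes),
        }
        for service, minutes in durations.items()
    }


if __name__ == "__main__":
    # Made-up sample records for illustration only.
    sample = [
        {"service": "checkout", "created_at": "2024-05-01T10:00:00+00:00",
         "resolved_at": "2024-05-01T10:30:00+00:00"},
        {"service": "checkout", "created_at": "2024-05-02T09:00:00+00:00",
         "resolved_at": "2024-05-02T10:00:00+00:00"},
        {"service": "search", "created_at": "2024-05-03T14:00:00+00:00",
         "resolved_at": "2024-05-03T14:15:00+00:00"},
    ]
    for service, stats in summarize_week(sample).items():
        print(service, stats)
```

Services with a rising count or a long mean time to resolve are natural candidates for discussion during the handoff.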
Best
For major incidents, conduct a full post-incident review and link post-incident action items to the review. For day-to-day incidents, teach the SRE Agent what you did to troubleshoot and resolve the incident so that it can share those learnings with the next on-call.