PagerDuty’s Ops Guides Get a Fresh New Look
The Community and Advocacy Team here at PagerDuty recently spruced up our library of ops guides, and we’re excited to share them with you.
If you’re not familiar with the ops guides, they are an open-source collection of long-form documents that cover a variety of topics related to real-time operations and incident management. We’ve given them some spiffy new headers, cleaned up some sneaky errors, and added a new section titled “Next Steps.”
While the main content of the ops guides is tool agnostic, we wanted to give you some hints on how to apply the ideas in the guides to your team’s PagerDuty setup. We’ve added “Next Steps” sections to a number of the guides with concrete examples of how the topics discussed in each guide would work in practice, so now you can read the theory and then see it in action.
Tour the Ops Guides
The ops guides span a wide variety of topics, covering both technical and cultural aspects of running modern technical systems. Let’s briefly walk through some of the guides available:
- Best Practices for On Call Teams is our newest guide. Folks asked us how to approach on call for teams that are new to on-call rotations. We collected some tips and practices and put them all here for you, from equipment checklists to handling on call in a humane way for your team.
- Our guide on Automated Remediation will give you some ideas on what to focus on if your team is drowning in repetitive tasks and spending too much time on toil.
- Not sure where to begin? Full-Service Ownership lays out some of the new expectations for technical teams in a world that is increasingly global and always available.
- Incident Response lays out PagerDuty’s framework for creating strong incident response practices with your team, including roles like Incident Commanders. This guide also includes a walkthrough of our Incident Commander Training.
- Internal Stakeholder Communications describes an approach to keeping everyone updated during an incident. Make your plan before anything happens so you won’t get stuck in the moment.
- Looking for some guidance on applying metrics to your digital business? Take a look at Operational Reviews.
- Your team should be holding Postmortems when things go wrong. How you learn from those incidents is important.
- Our Agile Leadership Team contributed to the Retrospectives guide. Give this one a read if you’re looking for some guidance around holding regular review meetings for your projects (not just when something goes wrong).
- Last, but not least, is PagerDuty’s internal Security Training. Take a look at our approach to topics ranging from social engineering to compliance.
We’d love to hear how you’re using these guides to help your teams! Let us know in the Community Forums or drop us an email at email@example.com. If you have any suggestions for improvements or new topics that you’d find useful, let us know that too.