At Velocity Santa Clara this year, I gave a talk about the interplay between the DevOps movement and product management. I think it’s important for product managers to understand the impact their choices have on software operability, and for engineering teams — whether DevOps or other — to have an understanding of how their choices affect product.
I want to call out three practices we use in building, shipping, and operating software at PagerDuty, and how they help our product team ship more value to customers.
Fast, Frequent, and Stable Deploys
Most of our GitHub repos have continuous integration or continuous deployment set up. When you open a pull request, your code is magically slurped up to the cloud, a battery of tests is run on it, and if they all pass, you get a big green “safe to deploy” check mark.
We also have ChatOps deployment. There’s a channel in Slack where you can ask a bot to throw any branch of any repo on any environment, including production. This also makes it really easy to investigate and roll back bad deploys, and generally helps with visibility about what’s going on in our infrastructure.
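To make the ChatOps idea concrete, here is a minimal sketch of how a bot might turn a chat message into a deploy request. It is not PagerDuty's actual bot: the command grammar (`deploy <repo>/<branch> to <environment>`), the environment names, and the `DeployRequest` and `parse_deploy` names are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical allow-list; the real environment names are not given in this post.
ENVIRONMENTS = {"staging", "production"}

# Hypothetical command grammar: "deploy <repo>/<branch> to <environment>".
COMMAND = re.compile(
    r"^deploy\s+(?P<repo>[\w.-]+)/(?P<branch>[\w./-]+)\s+to\s+(?P<env>\w+)$"
)


@dataclass(frozen=True)
class DeployRequest:
    repo: str
    branch: str
    environment: str
    requested_by: str  # kept so the channel history shows who asked for what


def parse_deploy(message: str, user: str) -> Optional[DeployRequest]:
    """Return a DeployRequest for a well-formed command, otherwise None."""
    match = COMMAND.match(message.strip())
    if match is None or match["env"] not in ENVIRONMENTS:
        return None
    return DeployRequest(match["repo"], match["branch"], match["env"], user)


if __name__ == "__main__":
    print(parse_deploy("deploy web/feature/login to production", "alice"))
    print(parse_deploy("deploy web/main to the-moon", "bob"))
```

Because every request passes through one parser and lands in a shared channel, the record of who deployed what, and where, comes for free. That is the visibility benefit described above.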
Having experienced this magic, I cannot imagine going back to a world where development and deployment require taking the site down for maintenance, or going through a separate team. Product managers live and die by the feedback loop: the longer you go without seeing live product and learning from it, the bigger the risk you incur.
Inspired by Netflix’s Simian Army, we regularly inject controlled and supervised failure into PagerDuty’s systems to ensure that we’re always ready for the unexpected.
Every Friday at 11am Pacific, development, operations, and anybody else who wants to join get in a room, and we take down a part of our infrastructure. The goal is threefold:
- Set an expectation that when you stand up a service at PagerDuty, you should create it with failure in mind. You should expect that at any time, some crazy stuff might happen, and you need to stay up.
- Identify places our systems aren’t robust in a controlled environment, so we can fix these issues way ahead of any customers being impacted.
- Practice our incident management techniques, so that when there is a black swan event, our response is practiced and automatic.
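The shape of such a drill can be sketched in a few lines. This is a toy model, not PagerDuty's tooling: the in-memory `cluster` dictionary, the `quorum` threshold, and the `healthy` and `run_drill` functions are hypothetical stand-ins for real infrastructure and real health checks.

```python
import random
from typing import Dict, Optional


def healthy(cluster: Dict[str, bool], quorum: int) -> bool:
    """The toy service counts as up while at least `quorum` nodes are up."""
    return sum(cluster.values()) >= quorum


def run_drill(
    cluster: Dict[str, bool], quorum: int, rng: Optional[random.Random] = None
) -> dict:
    """Take one running node down, check the service, then restore the node."""
    rng = rng or random.Random()
    running = sorted(name for name, up in cluster.items() if up)
    if not running:
        raise ValueError("no running nodes to fail")
    victim = rng.choice(running)
    cluster[victim] = False  # inject the controlled failure
    try:
        survived = healthy(cluster, quorum)
    finally:
        cluster[victim] = True  # supervised drill: always restore the node
    return {"victim": victim, "survived": survived}


if __name__ == "__main__":
    nodes = {"node-a": True, "node-b": True, "node-c": True}
    print(run_drill(nodes, quorum=2))  # tolerates the loss of one node
    print(run_drill(nodes, quorum=3))  # a design with no slack fails the drill
```

A service that needs every node to stay up fails this drill immediately, which is exactly the kind of weakness the exercise is meant to surface before customers ever see it.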
I love being a product manager in a culture that embraces failure as a learning opportunity. Product management involves a lot of risk and failure on the way to success, and having this embedded in our culture makes it easy to have cross-team conversations about these topics.
I’ve also learned a lot from the practice of engineered failure, blameless postmortems, and systems thinking. I’ve learned to look at both failure and success not as attributable to a single decision, or a single person, but as interactions of multiple beliefs and circumstances. It’s taught me to build a backlog of work that doesn’t require a waterfall-style domino reaction to be successful, but instead recognizes uncertainty and stacks the odds in our favor.
Build it, Ship it, Own it
My favorite part about DevOps at PagerDuty, though, is the culture of software ownership. Our teams deploy the code they write, and take responsibility for the software they put out into the world.
Sometimes that’s a little frustrating as a product manager, because it means that your roadmap of new features can be randomized by technical debt or operational issues. It’s more to keep in mind when planning work and running a team.
But in the long term, the culture of ownership helps product make not just correct choices for the project they’re currently working on, but correct long-term choices for their team and the greater organization.
With DevOps, your development budget and operations budget are the same budget. Your people are the same people. If you ship shaky code or push your team past their sustainable limits, it’s your own future velocity that gets hurt. In the words of Jez Humble, “Bad behavior arises when you abstract people away from the consequences of their actions.” I think that the tight relationship we have between product and engineering drives the right behavior and tradeoffs.
Getting product and engineering on the same page isn’t always easy. People have to learn to speak the same language, build shared context and understanding, and think about how different work streams affect a shared goal. It can be a little bumpy, and every release we look back and see places we could have done better.
But I think it’s worth it. Every time I see us shrink the gap between choices and consequences, every time we’re able to give teams more ownership of their code, we ship better software, more frequently, and make our customers happier. It can be a lot to get used to initially, but I can’t imagine a better model than DevOps for doing great product management.