by Lisa Yang
December 6, 2018
Once every four years, an event of global proportions comes around. Once every four years, an event leaves people around the world crying tears of happiness or disappointment. Once every four years, as Gary Lineker said, we get to watch 22 men run back and forth, chasing a ball for 90 minutes, and in the end, the Germans always win (but not this year, to the disappointment of this Die Mannschaft fan).
Yes, I’m talking about soccer. Or football, as most of the world calls it. And as I watched some players getting injured (or flopping), I started thinking about how overall team health and communication are two major elements that can make or break a football team—and how that’s also true in the workplace.
When watching a match, I’m always in awe of how fast the players can run while keeping an eye on where to pass the ball to set up their teammates to score. I’m also in awe of their tip-top health and endurance. For people who love running and do marathons for “fun,” I’m sure 90 minutes of running back and forth is child’s play. But for someone like me who avoids all forms of cardio, it’s amazing they’re not all bent over double, gasping for breath after 30 minutes. Which brings me to my first point: In football, speed is key, but so are endurance and a healthy team.
Duh, you’re probably thinking. Doesn’t take a genius to figure that one out. But then, how does this apply in the workplace and, more specifically, to ITOps and DevOps teams?
It’s all about knowing your teammates. For instance, when an issue happens, responders aim to fix it as quickly as possible to reduce the impact on the business and on customer experience. But in today’s always-on digital world, this means responders are expected to be on call 24/7 and to work overtime when they shouldn’t. A recent global study commissioned by PagerDuty shows that responders won’t (and don’t) endure such working conditions for a sustained period of time.
Why does this matter? After all, it’s just one person, right? In fact, it matters a lot and, as I pointed out earlier, it comes down to how well you know your team. During the World Cup, it matters because you only get three substitutions per match. Once a team is out of subs, that’s it: everyone needs to keep playing regardless of how tired they feel, even if their performance is subpar. And if a player then gets injured and has to leave the match, the remaining 10 players are stuck doing extra work to cover for that teammate.
In the world of IT Operations, the same idea applies: just one person burning out and quitting will affect the entire team’s health since everyone else will have to work harder to cover for the lost teammate until a replacement is found. Having insight into the health of each person on the team can be invaluable, especially when the cost of replacing one experienced IT responder can reach $300,000.
In football, teams study their opponents before each match to figure out starting formations. But what happens when an opponent doesn’t play as expected? Or in the world of IT, what happens when something breaks unexpectedly? One thing is for sure: You don’t want to penalize your team for missing the goal (or missing the root of the issue).
This is when the ability to change tactics in real time comes into play. On the pitch, the manager and team captain absorb all the action as it happens and give instructions on what needs to change, be it switching from a 3-3-4 to a 4-4-2 formation or deciding to just park the bus. The goal is to, well, score a goal or maintain the lead. When things don’t go as planned in IT, determining a plan of action falls to the Incident Commander, who directs what needs to be done (be it shutting down a server or having someone run a response play) based on all the inputs provided by the team and what’s happening in real time, with the goal of having everything running smoothly as soon as possible.
Remember the infamous vuvuzelas at the 2010 World Cup? I do. Those of us watching at home had problems hearing the TV commentary, and I can’t even imagine how loud it must have been at the stadiums. Not only that, but can you imagine how difficult it must have been for teammates to communicate with each other on the pitch, which is vital when they have to change their strategy on the fly?
The same goes for operational noise. The number of systems and services that teams run grows every year, and the number of alerts grows with them, which means real issues that need immediate attention can easily get buried in the flood of noise. Much like how broadcasters needed to figure out a way to filter out the sound of the vuvuzelas so commentators could be heard, responders also need a way to filter out operational noise so they can concentrate on the issues that matter.