March 26, 2019
It’s safe to say that the reasons for and benefits of migrating to the cloud are already well known. Organizations recognize that the cloud offers vastly improved agility and scalability, and they are making the move either by adopting a hybrid model or by shifting even critical workloads and infrastructure to the cloud.
But cloud migration is far from simple. It brings many challenges, and these pitfalls can have a huge impact on your organization if they are not addressed early on. It’s essential to future-proof your cloud strategy so that it can adapt to change and growth.
In this post, we’ll go over the most common pitfalls associated with cloud migration and share actionable best practices on how to overcome them.
One of the biggest challenges of moving to the cloud is getting your people and processes to make the transition with you. It’s one thing to move your systems, services, and applications to the cloud, but it’s another to make sure your teams embrace a new agile process along the way.
As you modernize and transform your applications and infrastructure, it’s important to ensure that your teams are well equipped to take advantage of this newfound agility and flexibility. Modernize your teams and build an agile DevOps culture that embraces shared visibility, automation, ownership, and continuous learning. Automate the manual tasks of engaging the right people and orchestrating the appropriate response, so that developers are empowered to own their code in production. Transparent, blameless postmortems then let them learn from their mistakes and continuously improve.
A team that leverages the automation, agility, and shared visibility enabled by cloud environments, modern toolchains, and DevOps culture can focus on what matters, delivering awesome customer experiences, while minimizing the impact and duration of incidents. By adopting DevOps best practices that remove silos between developers and IT operations teams, teams can set themselves up for cloud migration success. Tools may change, but people and processes are in it for the long haul.
Once you’ve got your people and processes in order, getting insight into team health is critical. Visibility into the health of your responders and teams is crucial to retaining employees and minimizing the business impact of your incidents. In the long run, it’ll help you determine what’s working and what’s not, where most of your time is being spent, and just how much a major incident is costing you, both in lost opportunities and in wasted productivity. That’s why understanding operational metrics such as MTTA (mean time to acknowledge) and MTTR (mean time to resolve), as well as the operational health of your people, teams, and services, is essential to keeping your team happy and healthy while ensuring your business runs smoothly.
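To make the two metrics concrete, here is a minimal sketch of how MTTA and MTTR could be computed from incident timestamps. The `Incident` record and its field names are assumptions made for illustration, not the schema of any particular tool, and MTTR is measured here from trigger to resolution (definitions vary between teams).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean


@dataclass
class Incident:
    # Hypothetical record shape; the field names are illustrative assumptions.
    triggered_at: datetime
    acknowledged_at: datetime
    resolved_at: datetime


def mtta(incidents: list[Incident]) -> timedelta:
    """Mean time to acknowledge: average of (acknowledged - triggered)."""
    seconds = mean(
        (i.acknowledged_at - i.triggered_at).total_seconds() for i in incidents
    )
    return timedelta(seconds=seconds)


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean time to resolve: average of (resolved - triggered)."""
    seconds = mean(
        (i.resolved_at - i.triggered_at).total_seconds() for i in incidents
    )
    return timedelta(seconds=seconds)
```

Tracking these averages per team or per service over time, rather than as a single global number, is usually what reveals where response is slowing down.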
An overworked team isn’t any help to anyone. Alert fatigue and overextended on-call shifts can tire out any team, leading to missed alerts and potential errors. By leveraging granular analytics to understand system and team efficiency, you can analyze trends to recognize where your gaps are and to gain insight into employee productivity and burnout.
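One simple signal of on-call strain is how many pages each responder receives outside working hours. The sketch below counts those pages per person. The `(responder, timestamp)` input shape, the function names, and the 9:00 to 18:00 weekday window are all assumptions for illustration.

```python
from collections import Counter
from datetime import datetime


def is_off_hours(ts: datetime, start_hour: int = 9, end_hour: int = 18) -> bool:
    """True if ts falls on a weekend or outside [start_hour, end_hour)."""
    return ts.weekday() >= 5 or not (start_hour <= ts.hour < end_hour)


def off_hours_pages(pages: list[tuple[str, datetime]]) -> Counter:
    """Count off-hours pages per responder.

    `pages` is a list of (responder, timestamp) pairs; this shape is an
    assumption, so adapt it to whatever your alerting data exports.
    """
    return Counter(responder for responder, ts in pages if is_off_hours(ts))
```

A responder whose count keeps climbing week over week is a candidate for a lighter rotation or for noise reduction on the services they cover.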
Operational insights also allow you to measurably improve the health of your infrastructure, so you can minimize downtime, customer impact, and revenue loss.
The health and efficiency of your people and processes are critical, but let’s switch gears and talk about the other end of the spectrum: your tools. As you make your journey to the cloud, there will be new cloud-native applications you adopt along the way, some legacy applications you leave behind, and some situations that will require a hybrid model.
It’s important to make sure your monitoring systems, services, and tools, both new and old, can be integrated and made visible within a single pane of glass. Finding patterns across data and applying centralized logic to get it into the hands of the right people is key to truly understanding the health of all your business and technical services, both on-premises and in the cloud. By collecting signals from any data source (like hybrid cloud environments, IoT devices, social media activity, and more), businesses can harness valuable data that can then be automatically grouped and correlated. And when incidents become the container for all relevant machine and human context, teams benefit from significantly less noise, easier cross-functional handoffs, and much faster resolution times.
Getting visibility into all your systems requires that you have integrations with all your monitoring systems and tools, whether they’re in the cloud or on-premises.
It’s not enough to just have your monitoring systems connected, as this can result in a firehose of irrelevant, non-actionable notifications that’ll just overwhelm your teams. What’s more important is understanding service dependencies across your environment and correlating the data to intelligently surface the right information from all of those systems. That way, the right teams get the right context to take appropriate action. By extracting signal from the noise, teams can immediately understand the business impact of issues. Moreover, by modeling services and business logic within a modern incident management platform like PagerDuty, instead of triaging events from siloed tools, teams can reduce downtime and determine root cause.
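As a rough illustration of dependency-aware correlation, the sketch below groups alerting services under the upstream service most likely to be the cause. It is a toy model and not how PagerDuty or any specific platform works. The function name, the `depends_on` map, and the service names in the usage note are hypothetical, and the dependency graph is assumed to be acyclic.

```python
def group_by_likely_root(
    alerting: set[str], depends_on: dict[str, list[str]]
) -> dict[str, set[str]]:
    """Group alerting services under their likely root-cause service.

    A service is treated as a likely root if it is alerting and none of its
    direct dependencies are alerting. Assumes `depends_on` has no cycles.
    """

    def roots_of(service: str) -> set[str]:
        # Follow only the dependencies that are themselves alerting.
        upstream = [d for d in depends_on.get(service, []) if d in alerting]
        if not upstream:
            return {service}
        return set().union(*(roots_of(d) for d in upstream))

    groups: dict[str, set[str]] = {}
    for service in sorted(alerting):
        for root in roots_of(service):
            groups.setdefault(root, set()).add(service)
    return groups
```

For example, if `checkout` depends on `payments`, `payments` depends on `db`, and all three are alerting, the three alerts collapse into one group keyed by `db`, so a single team is engaged with the full context instead of three teams each being paged separately.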
With complete service health views, teams can detect patterns and anomalies to proactively triage and even prevent critical issues. And if human intervention is required, responders can orchestrate the right response (ideally automatically, as every second of service degradation is incredibly costly to the business).
With customer expectations on the rise, it’s more important than ever to position your organization to deliver amazing customer experiences. Migrating to the cloud allows organizations to be far more flexible and agile, and to address changing customer needs more easily. The first step is making the move, but it is just as important to make sure teams can realize lasting benefits from the migration by having the right operational framework in place. After all, fewer than 40% of organizations meet goals related to migration projects.
A successful cloud migration is hard, but it’s far from impossible. Prepare yourself to overcome common pitfalls and set yourself up for success before, during, and after migration so you can deliver great results for both your end users and technical teams.