PagerDuty Blog

Monitoring in the Microservices Age

Managing Increased Complexity Against Greater Agility

Thanks to Docker and the DevOps revolution, microservices have emerged as the new way to build and deploy applications — and there are plenty of great reasons to embrace the microservices trend.

If you are going to adopt microservices, you also have to understand that microservice architectures have many moving parts. When it comes to incident management, this presents an important difference between microservices and monolithic architectures. More moving parts mean more complexity to monitor and manage in order to keep applications and infrastructure healthy and running.

Let’s take a look at why microservices increase IT monitoring challenges and explore how organizations can handle the added complexity.

Defining Microservices

A microservice is a small application component that, when combined with other microservices, forms a complete application. If you deploy an app using Docker, it is likely composed of multiple containers. Each container typically runs a single microservice, or one instance of it.

Microservices have become popular over the last several years as the DevOps movement has encouraged continuous software delivery processes. An application that is deployed as microservices is easier to manage, because admins can trace issues to particular microservices within the application. It is also easier to update, since an update to a particular component of the application requires admins to restart only that microservice, rather than the entire app. For these reasons and more, microservices help to facilitate continuous delivery.

The introduction of Docker in 2013 helped fuel interest in microservices. Docker containers provide convenient building blocks for microservices and enable an easy migration path for organizations seeking to port legacy applications (designed according to a monolithic architecture) over to a microservices model.

Complexity: The Microservices Trade-Off

Organizations adopting microservices need to consider the additional complexity that microservices add to infrastructure. Transforming a monolithic application into a microservices application introduces more moving parts for admins to monitor and manage.

For example, consider a monolithic web app whose front-end and database run as a single application on a virtual server dedicated solely to that application. Monitoring this application is relatively simple. When one part of it goes down, the entire app goes down. There is only a single host to monitor and a single alert to contend with. To be sure, you could take a more nuanced approach to monitoring an app like this, if desired. You could monitor connections across different ports, for example, or monitor the server and database processes separately. Even with this approach, though, the number of moving parts you have to monitor would be relatively small.

Now, consider the same app deployed as a set of containers. Instead of a single virtual server with the application running on it as a single process, you’ll have the front-end and database layers running as separate processes. Docker, typically directed by an orchestration tool, may run dozens or, in a scale-out deployment, perhaps even hundreds of containers to support each of those processes. The number of containers would change constantly in response to application demand. In addition, you might have other containers in the mix devoted to tasks like collecting statistics about your application. To ensure application availability and performance, you would have to monitor all of these components, not to mention the Docker daemon itself. That’s a lot more complex.
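To make the extra moving parts concrete, here is a minimal Python sketch of rolling per-container health up into a per-service view. It assumes a hypothetical snapshot format: the `ContainerStatus` class, the service names, and the 50 percent threshold are illustrative choices, not part of Docker or any monitoring product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerStatus:
    """One container as seen in a single monitoring snapshot (hypothetical shape)."""
    container_id: str
    service: str   # e.g. "frontend" or "database"
    healthy: bool


def summarize_by_service(snapshot):
    """Roll a per-container snapshot up into per-service health counts."""
    summary = defaultdict(lambda: {"healthy": 0, "unhealthy": 0})
    for status in snapshot:
        key = "healthy" if status.healthy else "unhealthy"
        summary[status.service][key] += 1
    return dict(summary)


def services_below_threshold(summary, min_healthy_ratio=0.5):
    """Return services whose share of healthy containers is under the threshold."""
    degraded = []
    for service, counts in summary.items():
        total = counts["healthy"] + counts["unhealthy"]
        if total and counts["healthy"] / total < min_healthy_ratio:
            degraded.append(service)
    return sorted(degraded)
```

Because the set of containers changes constantly, a roll-up like this would be recomputed on every snapshot rather than keyed to a fixed list of hosts, which is the main practical difference from monitoring the single-server monolith.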

To be clear, I do not mean to suggest that microservices are a bad idea. In the example above, the microservices-based version of the web application will be much more scalable and agile than the monolithic version. This additional agility is well worth the extra monitoring effort.

How to Monitor Microservices Effectively

An effective microservices monitoring strategy requires attention to two facts.

  • The first and most obvious is that microservices mean there are more components to monitor. This is not a hugely challenging issue to contend with; you simply need to ensure that your incident management platform is robust enough to handle a large number of alerts, assist you in triaging them, route them to the right people, and so on. Additionally, as microservices introduce a much higher volume of alerts, your incident management platform should also reduce alert noise when possible. Non-actionable alerts should be suppressed, while related alerts can be grouped or correlated into issues that require a response.
  • The second, more complicated fact to bear in mind is that by increasing complexity, microservices also increase the amount of information that admins have available to help manage incidents, and that’s a good thing. While having more components to monitor means that there is more data to contend with, that extra data can be leveraged to pinpoint problems. An alert related to a monolithic app simply tells you that something is wrong somewhere in that app, and it’s up to you to figure out exactly what the issue is. With microservices, however, an alert related to an individual Docker container allows admins to home in on the exact microservice within the app that caused the incident. They can then resolve the incident on that container without disrupting the other containers on which the app relies.
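The two points above can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the `Alert` fields, the `triage` function, the routing table, and the `"default-on-call"` fallback are hypothetical names, not the API of any incident management platform.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """A single alert raised for one container (hypothetical shape)."""
    container_id: str
    service: str      # the microservice the container belongs to
    summary: str
    actionable: bool  # False for noise such as expected restarts


def triage(alerts, routing):
    """Suppress non-actionable alerts, group the rest by service, and route each group."""
    grouped = defaultdict(list)
    for alert in alerts:
        if not alert.actionable:
            continue  # noise reduction: drop alerts nobody needs to act on
        grouped[alert.service].append(alert)

    incidents = []
    for service, related in sorted(grouped.items()):
        incidents.append({
            "service": service,
            # Hypothetical fallback team when no route is configured.
            "team": routing.get(service, "default-on-call"),
            # The affected containers pinpoint where the problem lives.
            "containers": sorted({a.container_id for a in related}),
            "alert_count": len(related),
        })
    return incidents
```

Grouping by service is just one possible correlation key. The point of the sketch is that each resulting incident names the microservice and the specific containers involved, which is the extra information a monolith's single alert would not carry.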

Microservices create both a challenge and an opportunity for incident management teams. They make infrastructure more complicated, but they can facilitate more effective and targeted incident response. The key to monitoring microservices effectively is understanding the differences between monolithic and microservices monitoring, and having microservices-ready incident management solutions and workflows in place.