PagerDuty Blog

The IoT and Next-Generation Monitoring Challenges

We’re living in the future.

Thanks to the Internet of Things (IoT), our world is more automated and connected than ever before. Just about everything, from cars to refrigerators to coffee machines, has the ability to connect to the Internet, allowing us to micromanage each individual device. This type of automation is an exciting direction for the world to be moving in, and the benefits it brings to businesses are huge.

Optimizing Processes

A great example of a traditionally non-networked company using IoT to make smarter decisions is Rolls-Royce. By integrating networked sensors into their aircraft engines, Rolls-Royce can send engine data back to monitoring stations on the ground and stay ahead of the curve on everything from performance to malfunctions. In a nutshell, they’ve created their own version of the Automatic driving sensor for airplane engines.

Photo: businessinsider.com

This type of IoT monitoring didn’t start in the air for Rolls-Royce, though. They’ve integrated sensors into the manufacturing of fan blades in order to better automate measurement schemes and monitor quality control processes. These monitors generate upwards of half a terabyte of manufacturing data per fan blade, which gives Rolls-Royce a more detailed view of their manufacturing process than ever thought possible.

By automating and integrating the Internet of Things into their workflows, businesses are more capable than ever of optimizing their processes in real time. Unfortunately, despite the huge benefits of IoT, the depth of some of these integrations makes companies more dependent on infrastructure than ever before.

A recent example of how the Internet of Things can go wrong is the DDoS attack initiated by an IoT botnet that took down the DNS provider Dyn. What happened in this attack? Networked IoT devices (think security cameras, DVRs, thermostats, etc.) were infected with malware, turning each device into a member of a botnet. This botnet was then used to direct a massive DDoS attack at Dyn, resulting in disruption of services for countless cloud services that developer and ops teams rely on, including PagerDuty, GitHub, Heroku, and Amazon Web Services.

While an attack like this may not affect Rolls-Royce’s monitoring solutions (they’ve built a system that can theoretically operate without the Internet at all), the same can’t be said for the countless organizations that have built up a dependency on their connectivity. This type of bottleneck can mean the difference between coordinating warehouse and shipping operations across multiple locations and everything grinding to a halt.

So, how do we balance the efficiency boost that IoT provides against the connectivity bottleneck of IoT-enhanced business operations?

While the obvious answer to this question is to establish redundancy to reduce risk, organizations that are relatively new to a connected infrastructure also need a proper incident management workflow so they can respond to incidents efficiently. Connected devices are like employees, in a way: their productivity needs to be monitored and managed in order to resolve issues before they become big problems.

A documented, established process is crucial to reducing the diagnostic and repair time of most issues, which will in turn reduce the risks associated with new technology. This process can be created in many different ways, but the fastest way to get intelligent workflows in place is through third-party services like PagerDuty.

Incident Management for IoT

Because one of the primary uses of IoT in the enterprise is to analyze metrics, identifying incidents early is the same as identifying outliers early. Whether the issue lies with the devices themselves or with the metrics they are gathering, an outlier in the data is an immediate sign that something is wrong. It is important to remember, though, that data is only as useful as what is done with it, so setting up an incident management system with full-stack visibility is key to staying ahead of problems.
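To make the idea of identifying outliers early a little more concrete, here is a minimal sketch in Python. It is only one of many possible approaches: it flags readings using a modified z-score based on the median absolute deviation, and the function name, the 3.5 threshold, and the sample temperature readings are all assumptions made for illustration, not part of any particular monitoring product.

```python
from statistics import median


def find_outliers(readings, threshold=3.5):
    """Return the indices of readings that look like outliers.

    Uses a modified z-score built on the median absolute deviation (MAD),
    which is less distorted by the outliers themselves than a mean-based
    score. The 3.5 threshold is a common rule of thumb, assumed here.
    """
    med = median(readings)
    deviations = [abs(x - med) for x in readings]
    mad = median(deviations)

    if mad == 0:
        # At least half the readings equal the median, so flag
        # anything that deviates from it at all.
        return [i for i, d in enumerate(deviations) if d > 0]

    return [
        i for i, d in enumerate(deviations)
        if 0.6745 * d / mad > threshold
    ]


# Hypothetical temperature readings from a single sensor.
readings = [70.1, 70.3, 69.9, 70.2, 70.0, 95.4, 70.1]
suspect = find_outliers(readings)
```

In a real setup, each flagged index would be a candidate for opening an incident rather than a confirmed failure, since the cause could be either the device or the thing it is measuring.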

On the flip side, IoT devices that do more than simply spit out data are significantly more critical to the operation of a business, which means that their health should be monitored very closely. At a high level, the act of monitoring these devices is the same as for others, but because metrics might not be their core functionality, the actual method of monitoring will likely be more involved.
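For devices like these, one simple health signal is whether the device has checked in recently. The sketch below assumes each device reports a periodic heartbeat and that the last-seen timestamps are stored in a dictionary; the device names, the 120-second silence limit, and the function itself are hypothetical and only meant to show the shape of such a check.

```python
def stale_devices(last_seen, now, max_silence_s=120.0):
    """Return the sorted IDs of devices that have been silent too long.

    last_seen maps a device ID to the timestamp (in seconds) of its most
    recent heartbeat. A device counts as stale when more than
    max_silence_s seconds have passed since then.
    """
    return sorted(
        device_id
        for device_id, timestamp in last_seen.items()
        if now - timestamp > max_silence_s
    )


# Hypothetical heartbeat timestamps, in seconds.
last_seen = {
    "door-lock-1": 1000.0,
    "conveyor-7": 880.0,
    "hvac-2": 700.0,
}
silent = stale_devices(last_seen, now=1010.0)
```

A heartbeat only shows that a device is reachable, not that it is doing its job correctly, so a check like this would normally sit alongside device-specific checks.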

While the Internet of Things can take a business to the next level, it also brings with it new challenges that should be planned for accordingly. In the end, the key to effective IoT incident management is visibility for immediate response orchestration. After all, it’s hard to manage incidents you know nothing about.