Smart devices require smart monitoring. That’s not a platitude. It’s an imperative. In fact, the smarter the device, the smarter you need to be about monitoring it.
As headlines have shown, unmonitored, unprotected smart devices may be a disaster (or a DDoS attack) just waiting to happen. Consider the following:
Last year’s wave of DDoS attacks was a wake-up call. Many smart devices have little or no built-in security. Combined with wireless communication and the sophisticated features of their built-in operating systems, that makes them particularly tempting targets for an attack.
It’s worth noting that the DDoS attacks simply made use of smart devices as nodes in a botnet, without specifically exploiting their hardware control or monitoring capabilities. It is not difficult to imagine a more targeted attack aimed at specific classes of smart devices and designed to make use of those hardware control capabilities, possibly with catastrophic consequences.
Even an attack on a single device could cause considerable damage, depending on its function. Given the potential danger and lack of built-in security, adequate monitoring is a necessity for detecting and preventing future attacks.
Building so much intelligence into smart devices gives them the capability to handle tasks that would be too complicated for traditional, non-smart devices. A smart device may control complex mechanical actions, balance power loads, or adjust environmental conditions based on sophisticated algorithms processing inputs from a variety of sensors. The more complex a system is, the greater the chances of significant errors. Monitoring smart devices serves as a safeguard against such errors.
It’s one thing if a smart toaster chars your English muffin, but when a smart medical device is controlling a suite of life-support systems, there’s no margin for error.
Smart devices may run expensive factory-floor equipment, traffic lights, power production and distribution facilities, and other important (sometimes crucial) resources. Even a smart system for monitoring and balancing home electrical power may cause significant problems and economic losses if it fails. And when the failure of a system could endanger life or public safety, a major malfunction is not acceptable. It would be irresponsible and dangerous not to consistently monitor smart devices that control crucial functions, and that negligence could lead to significant legal repercussions or worse.
What is the best approach to monitoring a smart device? To a considerable degree, that depends on the device itself — what it does, what kind of built-in or bundled control and monitoring software it includes, and what kind of monitoring data it makes available. These are the key points to keep in mind:
A device may come with software that includes monitoring functions, or it may simply produce raw data that can be used for monitoring. Does it have an API? If it is compliant with the Open Connectivity Foundation (OCF)’s specifications for smart devices, it should include well-defined methods for querying and monitoring the state of the system. If it has a proprietary API, what kind of monitoring functions are included? Even without any kind of real API, however, it may be possible to extract some kind of monitoring output.
What monitoring information is available, and what information is important to you?
A complex OCF-compliant device may be able to provide you with a wide range of data about its operational state. When you have an abundance of monitoring data, to optimize response you need to decide what information should be monitored on an ongoing basis (to detect malfunctions and generate alerts, for example), and what information can simply be logged.
A simple non-compliant device, on the other hand, may produce nothing more than a cryptic (and non-specific) error code when it detects a failure or an out-of-range value. In that case, you simply have to take what you can get, and make the best of it.
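As a rough illustration of that split, here is a minimal Python sketch. The metric names, limits, and error codes are all hypothetical, made up for this example; a real device defines its own. Watched metrics outside their limits produce an alert, everything else is logged, and a bare error code always alerts with whatever context is available.

```python
from dataclasses import dataclass

# Hypothetical metric names and limits; a real device defines its own.
ALERT_LIMITS = {
    "temperature_c": (5.0, 80.0),
    "supply_voltage": (11.0, 13.0),
}

# Hypothetical lookup for a device that reports only numeric error codes.
ERROR_CODES = {17: "sensor out of range", 42: "actuator stalled"}


@dataclass
class Decision:
    action: str  # "alert" or "log"
    reason: str


def triage_reading(metric: str, value: float) -> Decision:
    """Alert on watched metrics outside their limits; log everything else."""
    limits = ALERT_LIMITS.get(metric)
    if limits is None:
        return Decision("log", f"{metric} is not a watched metric")
    low, high = limits
    if low <= value <= high:
        return Decision("log", f"{metric}={value} within [{low}, {high}]")
    return Decision("alert", f"{metric}={value} outside [{low}, {high}]")


def triage_error_code(code: int) -> Decision:
    """A bare error code is all the device offers, so always alert."""
    meaning = ERROR_CODES.get(code, "unknown error code")
    return Decision("alert", f"device error {code}: {meaning}")
```

Keeping the limits in a plain table means the decision about what to watch stays in one place and can change without touching the triage logic.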
Even if you have decided what information needs to be included during the specification/design stages of setting up your monitoring system, you may still need to filter out background noise (transient, slightly-out-of-range values, for example) on an ongoing basis, so that you can more clearly identify conditions that require an alert.
Alert noise is more than just an annoyance. If there is too much of it, it may mask signals that do require attention. Worse, it can even cause alert fatigue in response teams, so that they fail to recognize alerts that do require immediate attention.
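One simple way to suppress transient, slightly-out-of-range values is to require several consecutive out-of-range readings before raising an alert. The sketch below is only one possible approach, and the class name and the default of three readings are assumptions chosen for illustration.

```python
from collections import deque


class SustainedRangeFilter:
    """Signal an alert only after `required` consecutive out-of-range readings."""

    def __init__(self, low: float, high: float, required: int = 3):
        self.low = low
        self.high = high
        self.required = required
        # Holds only the most recent `required` in-range/out-of-range flags.
        self._recent = deque(maxlen=required)

    def update(self, value: float) -> bool:
        """Record a reading and return True if an alert should be raised."""
        out_of_range = not (self.low <= value <= self.high)
        self._recent.append(out_of_range)
        return len(self._recent) == self.required and all(self._recent)


# A single spike is ignored; a sustained excursion is reported.
noise_filter = SustainedRangeFilter(low=0.0, high=100.0, required=3)
readings = [50, 105, 50, 105, 106, 107, 50]
results = [noise_filter.update(v) for v in readings]
```

The trade-off is latency: a larger `required` value filters more noise but delays detection of a genuine fault, so the right setting depends on how quickly the device can cause harm.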
As with filtering, the analysis, sorting, and dispatching of alerts are all key elements of any issue resolution system. If an alert produced by a smart device is important enough to require a response, it should be correlated with related symptoms (to reduce responder noise), automatically routed to the appropriate responders, and enriched with relevant context or troubleshooting information to streamline the response.
Registering an alert in your system is only the beginning of the resolution process. Without effective, real-time information consolidation and dispatching, even the highest-urgency alerts may be in danger of getting lost.
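To make correlation and routing concrete, here is a small Python sketch. It groups alerts from the same device that arrive within a time window into a single incident and assigns each incident to a responder. The routing table, device classes, team names, and the five-minute window are hypothetical; it illustrates the idea and does not represent any particular product's behavior.

```python
from dataclasses import dataclass, field

# Hypothetical routing table: device class -> responding team.
ROUTES = {"hvac": "facilities-oncall", "infusion-pump": "clinical-engineering"}
DEFAULT_ROUTE = "operations-oncall"


@dataclass
class Alert:
    device_id: str
    device_class: str
    message: str
    timestamp: float  # seconds


@dataclass
class Incident:
    device_id: str
    responder: str
    opened_at: float
    last_seen: float
    messages: list = field(default_factory=list)


def correlate_and_route(alerts, window_seconds=300.0):
    """Group alerts per device within the window, then route each group."""
    open_incidents = {}
    incidents = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        incident = open_incidents.get(alert.device_id)
        # Start a new incident if none is open or the last alert is too old.
        if incident is None or alert.timestamp - incident.last_seen > window_seconds:
            incident = Incident(
                device_id=alert.device_id,
                responder=ROUTES.get(alert.device_class, DEFAULT_ROUTE),
                opened_at=alert.timestamp,
                last_seen=alert.timestamp,
            )
            open_incidents[alert.device_id] = incident
            incidents.append(incident)
        # Attach the symptom so responders see the full context in one place.
        incident.messages.append(alert.message)
        incident.last_seen = alert.timestamp
    return incidents
```

With this grouping, a responder receives one incident carrying every related symptom instead of a separate page for each alert.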
Smart monitoring for smart devices isn’t an option — it’s a necessity. Knowing this, how can you get started in implementing the right solution? The key to monitoring smart devices lies in making optimal use of the data that each device provides. In the case of complex, OCF-compliant devices, this could mean sending monitoring data to customized control software in order to automatically respond to out-of-bounds conditions. The control software could then decide whether or not to generate an alert based on the device’s response to the adjustments made by the control software.
In effect, this adds an extra layer of smartness to the device and the monitoring system, taking advantage of the rich set of features included in the OCF specifications. Even in the case of a relatively simple, non-compliant device, it’s possible to use the data that’s provided to set up genuinely smart monitoring and produce a genuinely smart response. In a world with ever more unknowns, implementing such a solution can help your teams minimize the potential financial and security repercussions of smart devices gone rogue.
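The control-software idea described above can be sketched in a few lines of Python. Here `read_value` and `apply_adjustment` are hypothetical callables standing in for whatever interface a given device exposes, and the limit of two correction attempts is an arbitrary assumption. The function tries to correct an out-of-bounds reading and asks for an alert only if the device does not recover.

```python
def respond_to_reading(read_value, apply_adjustment, low, high, max_attempts=2):
    """Try to correct an out-of-bounds reading; alert only if it persists.

    `read_value` and `apply_adjustment` are hypothetical callables standing in
    for the device's real interface.
    """
    value = read_value()
    attempts = 0
    while not (low <= value <= high) and attempts < max_attempts:
        apply_adjustment(value)  # ask the device to correct itself
        attempts += 1
        value = read_value()     # check how the device responded
    recovered = low <= value <= high
    return {"alert": not recovered, "attempts": attempts, "final_value": value}


# Simulated device whose reading drops by 15 after each adjustment.
state = {"value": 110.0}


def read_simulated():
    return state["value"]


def adjust_simulated(current):
    state["value"] = current - 15.0


outcome = respond_to_reading(read_simulated, adjust_simulated, low=0.0, high=100.0)
```

In this simulation the device recovers after one adjustment, so no alert is requested; a device that ignores the adjustments would exhaust its attempts and trigger one.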
Ready to take your monitoring to the next level? Sign up for a free PagerDuty trial today!