Upskilling your Network Operations Center

by Hannah Culver May 1, 2024 | 4 min read

Many organizations are investing heavily in AI and automation to remove the burden of manual work and improve operational efficiency. However, to drive wide-scale adoption, they also need employees who can collaborate effectively with the technology. To bridge that gap, companies can use upskilling to retain talent, mitigate risks to the business, and allow employees to grow their careers.

One place where IT leaders have been eliminating toil is the Network Operations Center (NOC). With eyes-on-glass and catch-and-dispatch modes of operation proving less than ideal for this fast-paced, distributed environment, there are opportunities to free up NOC engineers’ time for higher-value work and to repurpose that time in ways that leverage their skill sets. Let’s talk about why upskilling is good for both employees and the business.

To make the NOC more efficient, leaders should free up and repurpose engineers’ time in ways that build on their current skill sets. Here are three excellent ways to do just that.

From eyes-on-glass to noise reduction at scale: As the name implies, eyes-on-glass methods involve NOC engineers sitting and watching monitors to look for abnormalities in the way the system functions. Despite advances in monitoring and incident detection, many organizations still run this way. That’s because the sheer volume of incoming data has reached a point where it’s impossible to simply trust that monitoring will surface the right information all the time. Unfortunately, it’s also impossible to watch everything monitoring sends to the NOC. In this space between a rock and a hard place, the best way to free up NOC capacity is to reduce noise at scale. This includes silencing noisy, informational, and inactionable alerts by auto-pausing notifications for them. Machine learning (ML) helps NOCs separate the signal from the noise immediately and eliminates the need to sit and watch screens all day.
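To make the idea of silencing noisy, informational, and inactionable alerts concrete, here is a minimal sketch. It is a rule-based stand-in, not the ML approach mentioned above, and the `Alert` fields, severity labels, and `flap_threshold` value are all hypothetical; real monitoring tools define their own.

```python
from dataclasses import dataclass

# Hypothetical severity labels that never warrant a notification.
SUPPRESSED_SEVERITIES = {"info"}


@dataclass
class Alert:
    source: str
    summary: str
    severity: str     # e.g. "info", "warning", "critical" (assumed labels)
    actionable: bool  # whether a responder can actually do something about it


def should_notify(alert, recent_counts, flap_threshold=5):
    """Return True only for alerts that deserve a human's attention."""
    if alert.severity in SUPPRESSED_SEVERITIES:
        return False  # informational
    if not alert.actionable:
        return False  # inactionable
    # An alert repeated many times in the current window is treated as noisy.
    key = (alert.source, alert.summary)
    if recent_counts.get(key, 0) >= flap_threshold:
        return False
    return True


alerts = [
    Alert("disk-monitor", "Disk usage report", "info", True),
    Alert("db", "Replica lag high", "critical", True),
    Alert("cache", "Cache miss rate elevated", "warning", False),
    Alert("lb", "Health check flapping", "warning", True),
]
# How often each (source, summary) pair fired recently (made-up numbers).
recent_counts = {("lb", "Health check flapping"): 12}

to_page = [a for a in alerts if should_notify(a, recent_counts)]
```

In this example only the database alert would reach an engineer. A production system would more likely learn which alerts are noisy from historical data rather than rely on fixed rules.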

From catch-and-dispatch to event-driven automation: Another common NOC method is catch-and-dispatch escalation. Basically, issues come into the NOC, an engineer looks at each one and decides which team it should go to, then routes the incident. However, this is highly manual, prone to error, and a waste of the NOC’s time and sometimes the time of the subject matter expert (SME) receiving the escalation.

With event-driven automation, NOCs can reduce the sheer number of incidents that need human attention via auto-remediation. For other incidents that do require humans in the loop, event-driven automation can normalize and enrich event data so it’s easily consumable by all responders and run diagnostics immediately. This can help NOCs (and SMEs if necessary) resolve issues quickly with fewer escalations.
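The flow described above (normalize, enrich, then either auto-remediate or hand off to a human) can be sketched roughly as follows. Every name here is hypothetical: the payload fields, the `RUNBOOKS` registry, the keyword-based `classify` rule, and the `restart_service` stub are assumptions for illustration, not any vendor's API.

```python
def normalize(raw):
    """Map differently shaped payloads onto one common schema."""
    return {
        "service": raw.get("service") or raw.get("svc") or "unknown",
        "summary": raw.get("summary") or raw.get("message") or "",
        "severity": str(raw.get("severity") or raw.get("level") or "unknown").lower(),
    }


def restart_service(event):
    # Stub: a real runbook would call out to an automation tool here.
    return f"restarted {event['service']}"


# Hypothetical registry mapping an event type to an auto-remediation step.
RUNBOOKS = {"service_down": restart_service}


def classify(event):
    """Very rough keyword rule; returns None when no type is recognized."""
    if "down" in event["summary"].lower():
        return "service_down"
    return None


def handle(raw, owners):
    event = normalize(raw)
    # Enrichment: attach the owning team so responders don't have to look it up.
    event["owner"] = owners.get(event["service"], "noc")
    kind = classify(event)
    if kind in RUNBOOKS:
        event["action"] = RUNBOOKS[kind](event)
        event["needs_human"] = False
    else:
        event["needs_human"] = True
    return event


owners = {"checkout": "payments-team"}
auto = handle({"svc": "checkout", "message": "Service DOWN", "level": "CRITICAL"}, owners)
manual = handle({"service": "search", "summary": "Latency above threshold", "severity": "warning"}, owners)
```

Here the first event is remediated without anyone being paged, while the second arrives in front of a responder already normalized and tagged with an owner.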

With this freed-up time, NOC engineers can be trained on more value-add initiatives. Depending on the engineer’s particular talents and aspirations, there are two common paths: development and automation. Some engineers can easily pivot to development work and help drive innovation. For others who want a more operations-centric position, upskilling their automation abilities can work wonders for your operational efficiency. Some organizations even pivot NOCs to become Automation Centers of Excellence, where the NOC builds, scales, and maintains automation across the entire organization. Not only does this make the NOC more efficient, it makes all other teams more efficient, too.

From block-and-tackle to self-service IT: For some organizations, NOCs also handle issues that aren’t necessarily incidents, such as IT helpdesk tickets. However, this combined workload just contributes to the noise in the NOC and slows down organizational processes. For many of these issues, such as VM deployment, device restarts, authentication privileges, and more, automation can take the lead.

SRE teams can build and scale these processes for NOCs and reduce the manual workload. The biggest benefit of this? Beyond the time and capacity saved for NOCs, it also dramatically reduces the amount of time it takes to complete these actions. Previously, these actions may have changed hands within an organization two or more times, with each person having their own prioritization, queue, and SLAs. That means simple tasks could wait days just because they had fallen to the bottom of someone’s to-do list. But with automation, this idle time is eliminated. All that’s left is the processing time it takes for the automation to actually run.
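The arithmetic behind that claim is simple: total lead time is the time a request spends waiting in each person's queue plus the time the work itself takes. A small sketch, using made-up numbers purely for illustration:

```python
def lead_time_hours(queue_waits_hours, processing_hours):
    """Total time to complete a request: idle time in queues plus actual work."""
    return sum(queue_waits_hours) + processing_hours


# Hypothetical manual path: two handoffs, each with its own queue delay.
manual = lead_time_hours([24.0, 16.0], 0.25)

# Hypothetical automated path: no handoffs, so only processing time remains.
automated = lead_time_hours([], 0.25)
```

With these assumed values, the manual path takes 40.25 hours and the automated one 0.25 hours, even though the work itself is identical in both cases.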

Upskilling NOC engineers can lead to happier teams and customers, increased innovation, and more efficient operations as a whole. If you’re looking to make your NOC more efficient, PagerDuty can help. Learn more by watching this executive fireside chat with PagerDuty Field CTO Heath Newburn and IDC analyst Nancy Gohring.