In 1991, Packy Hyland Jr. convinced a Wisconsin bank it could save printing costs by storing reports on optical disks. That early innovation became OnBase, the now $5 billion global company’s flagship product—and set Hyland Software on its way to being a leading provider of data processing, storage, and management.
A universal enterprise information platform, OnBase centralizes business content in one secure location. It then delivers relevant information whenever and wherever it’s needed, increasing productivity, supporting excellent customer service, and reducing risk.
Because Hyland serves over half of the Fortune 100, its infrastructure team must ensure the uptime of these cloud-based technologies, solutions, and services.
The infrastructure team struggled to get actionable information to the right responders. “Prior to PagerDuty, we had multiple monitoring solutions that would deliver alerts in various ways,” explained Brian Long, Observability Engineer. “We had difficulty getting the correct information to the correct team, or alerts were delivered in fixed formats that didn’t necessarily give pertinent information front and center.”
For example, when the team needed to be notified about AWS instance retirements, alerts came in as a giant block of text with no formatting. The information was hard to consume and lacked clear details about which instance or endpoint was being retired and what work needed to be done on it. Even experienced responders needed extra time and effort to dive in and understand the problem.
In addition, triage and cross-team escalations were inconsistent and at times ineffective, resulting in slow or clunky collaboration. “Many of the processes that worked during the normal workday schedule, such as reaching out to those teams through Slack, weren’t reliable if those teams were off hours, or if the response was handled by a 24/7 team that then needed to escalate to a non-24/7 team,” said Long.
Hyland needed to improve the user experience for engineers, as well as drive faster resolution for its nearly 20,000 customers.
The company turned to PagerDuty Event Orchestration, a feature set within the Event Intelligence portfolio. Event Orchestration uses custom logic and rules nesting to enrich and control routing, or to trigger webhook actions based on event conditions.
Event Orchestration cuts down on manual work by connecting real-time event processing with intelligent automation. “Event Orchestration allows us to set multiple service delivery rules to classify if a payload comes in with certain detailed information,” Long shared. Because Event Orchestration processes rules “top down,” the team puts specific and strict rules toward the top, and more generic rules toward the bottom as catch-all functionalities.
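The "top down" ordering Long describes can be illustrated with a minimal sketch. This is not PagerDuty's rule syntax or API; the `Rule` class, the `route` function, the event fields, and the service names are all hypothetical, chosen only to show why strict rules sit above generic catch-alls when the first matching rule wins.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Event = dict  # hypothetical event payload: a plain dictionary


@dataclass
class Rule:
    label: str
    matches: Callable[[Event], bool]  # condition evaluated against the event
    service: str                      # hypothetical destination service name


def route(event: Event, rules: list) -> Optional[str]:
    """Evaluate rules in order and return the service of the first match."""
    for rule in rules:
        if rule.matches(event):
            return rule.service
    return None


# Strict rules first, generic rules last (all names are illustrative).
RULES = [
    Rule(
        "AWS instance retirement",
        lambda e: e.get("source") == "aws.health"
        and "retirement" in e.get("summary", "").lower(),
        "cloud-infrastructure",
    ),
    Rule(
        "Any AWS event",
        lambda e: e.get("source", "").startswith("aws."),
        "cloud-general",
    ),
    Rule("Catch-all", lambda e: True, "triage"),
]
```

If the catch-all rule were listed first, every event would land in the triage service and the more specific rules would never be reached, which is why the ordering matters.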
Event Orchestration helped Hyland address the issue of poorly formatted alerts like AWS instance retirements. Based on the metadata, the alert is intelligently delivered to the correct service. By adding Transformations and defining Custom Variables, difficult machine terms and code are translated into helpful context that lets responders address the problem effectively. “Using custom variables, we are able to write pieces of text that make the alert information more human and easier to understand,” Long explained. “Now we know it’s an AWS instance retirement, what account it’s on, and the instance or machine that requires action. The alert responder can then quickly mobilize, identify any additional pieces of information that don’t get sent as part of the payload, and resolve the issue much faster.”
Hyland also leveraged PagerDuty to assemble and mobilize cross-functional teams, escalating to additional subject matter experts when assistance is needed and further speeding up resolution times. Using Response Plays, responders can run incident actions at the push of a button, and each action escalates directly to the appropriate team based on the escalation policies preconfigured in PagerDuty. The name of each Response Play is actionable, so the user knows exactly what will happen by clicking it. “All actions are tracked on the incident so the person reaching out knows what is going on,” Long said.
PagerDuty has made a significant impact on Hyland’s infrastructure team, helping to ensure an always-on cloud environment for customers.
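The idea of extracting variables from an unformatted alert and using them to write a readable summary can be sketched roughly as follows. This is an illustration only, not PagerDuty's Custom Variables or Transformations syntax; the field names (`details`, `summary`, `custom_variables`), the regular expressions, and the sample alert text are assumptions made for the example.

```python
import re

# Hypothetical extraction patterns; real alert text may differ.
EXTRACTORS = {
    "instance_id": r"\b(i-[0-9a-f]{8,17})\b",
    "account_id": r"\baccount[:\s]+(\d{12})\b",
}


def extract_variables(text: str) -> dict:
    """Pull named values out of a raw text block, or 'unknown' if absent."""
    variables = {}
    for name, pattern in EXTRACTORS.items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        variables[name] = match.group(1) if match else "unknown"
    return variables


def transform(event: dict) -> dict:
    """Return a copy of the event with a human-readable summary added."""
    variables = extract_variables(event.get("details", ""))
    enriched = dict(event)
    enriched["summary"] = (
        f"AWS instance retirement: {variables['instance_id']} "
        f"in account {variables['account_id']} requires action"
    )
    enriched["custom_variables"] = variables
    return enriched


# Hypothetical raw alert, standing in for the unformatted block of text.
raw_event = {
    "details": "Notice for account: 123456789012 the instance "
               "i-0abc123def4567890 is scheduled for retirement",
}
print(transform(raw_event)["summary"])
```

The responder sees the account and instance up front in the summary, while the original text block stays attached to the event for anyone who needs the full details.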
“When we looked at our problems, we saw that we had alerts that potentially needed to go to different teams, the alerts were poorly formatted, and we had hurdles and issues reaching out to other teams,” Long said. “PagerDuty solved all of that for us.”
Watch Brian’s Summit ’22 session, “Intelligent Delivery and SME Mobilization: Ensuring Effective Alert Distribution and Resolution.”