Jeppesen delivers transformative information and optimization solutions to improve the efficiency of air operations around the globe. As the company grew, finding a solution that could aggregate all the alerts across their infrastructure, scale within the company, and escalate critical incidents became a priority. Pablo Castillo, Service Manager at Jeppesen, and his team members took the initiative to find a solution that would enable better operational agility for their IT teams and reliability for their environment.
Overcoming challenges around on-call and incident management
Jeppesen didn’t have a solution that supported on-call automation or incident management. A customer would call in to notify the company of an issue, which would then fire off an alert. Additionally, their internal phone system required manually updating the on-call contact information. As a result, the on-call contact list was never up to date, so it wasn’t a reliable way to reach the right person about a problem, and at times calls would get forwarded to the wrong person. “There wasn’t any proactive detection of problems or incidents. As we got bigger, implementing a solution to manage this became a requirement,” said Castillo. As Jeppesen continued to expand the company and their customer base, the performance and availability of their applications became increasingly critical. With so many moving parts, getting an incident management solution in place to effectively manage their digital operations was top of mind for Castillo, his team, and the company as a whole.
Exceeding SLA expectations and decreasing downtime
Jeppesen selected PagerDuty to overcome the challenges they faced around incident management, on-call automation, and incident triage and escalation. Since implementing PagerDuty, the company has gained full-stack visibility of critical applications and can now aggregate and manage alerts across their infrastructure, prioritize critical incidents requiring immediate response, and stop business-impacting situations. “PagerDuty provides us with a clear timeline as to when the problem started, when it was acknowledged, and when it’s been resolved,” stated Castillo.
To deliver a faster response, Jeppesen implemented a ChatOps support model with the help of PagerDuty. Using the PagerDuty and Slack bi-directional workflow extension, the Jeppesen team can acknowledge and resolve PagerDuty incidents from Slack with the click of a button. PagerDuty updates the Slack timeline, so the team can always see who is actively working on the issue, as well as when and what actions were taken. This also enables seamless collaboration and resolution on mobile.
Jeppesen has different SLAs that are tied to specific applications, the most important and impactful being the ones related to tracking. One of these SLAs limits Jeppesen to no more than 15 to 30 minutes of downtime per month, so in the event of downtime, they need to act quickly. “We have 100% delivery for every product thanks to the support of PagerDuty. We got a call from PagerDuty when one of our website applications went down. When we received notice from our customer, the problem was already resolved. You look good to a customer when something like this happens,” said Castillo.
24/7 website availability and seamless digital operations management
Jeppesen relies on PagerDuty to keep their site running at all times and to notify the right on-call resources to take effective and immediate action whenever an incident arises. “PagerDuty enables us to make our site available 24/7. With the PagerDuty platform, we are able to address incidents right away which in turn allows us to act more proactively,” said Castillo. Jeppesen also recently implemented, and intends to make heavy use of, PagerDuty’s Live Call Routing capability, which lets anyone reach a live on-call engineer immediately, or leave a voicemail that is attached to an incident, simply by calling a number. With PagerDuty, Jeppesen has gained the full-stack visibility and response orchestration required to manage the end-to-end digital experience for their customers, which helps them optimize product delivery and meet SLA expectations.
“PagerDuty enables us to deliver 24/7 website availability. With the platform, we are able to address incidents immediately, which enables our IT teams to act proactively in resolving issues”