PagerDuty Drives Auto Trader UK’s Incident Response

Size: 500 – 1,000+ Employees

Industry: Internet

Location: Manchester, United Kingdom

Customer Since: 2016

Key Integrations:

SolarWinds

Auto Trader UK is the largest digital automotive marketplace in the UK and Ireland, attracting an average of 55 million platform visits every month from consumers searching and viewing car, van, and bike advertisements from almost 14,000 UK retailers. “We’re a business that’s based on the web, so we need to make sure our shop is open 24/7,” said Ryan O’Gorman, Senior Operations Engineer at Auto Trader UK.

Through the continuous evolution of its digital platforms and innovation of its data products, Auto Trader UK makes the car buying process easier for its customers. But maintaining a reliable platform while simultaneously undergoing a public cloud migration is difficult, which makes it more crucial than ever for Auto Trader UK’s operations team to be responsive and proactive when issues arise. As the migration continues, PagerDuty gives the company flexibility in how it manages incident response, so the team can take action and resolve incidents as soon as they arise.

No Alert Lost or Left Behind

The operations team manage and monitor the infrastructure for the entire enterprise. They are the first responders for alerts regarding Auto Trader UK’s systems, engaging with product development teams to resolve issues as needed. “As soon as the developers deploy, we do a lot of the maintenance to ensure application health,” Ryan explained. “If something goes wrong, we communicate with the dev teams and provide diagnostics to help them resolve it.”

One of the challenges the team faced was that email alert notifications were sometimes delayed or never received. “Occasionally email alerts would arrive 10 or 20 minutes after the incident actually started,” Ryan shared. “Worse, sometimes we wouldn’t get the email at all, which would result in a delayed response to an incident.”

By using the SolarWinds integration, one of more than 300 integrations available with PagerDuty, the team could receive alerts directly within PagerDuty, entirely eliminating email alerts from SolarWinds. As a result, the team reduced the risk of delayed or lost alerts. Additionally, with PagerDuty as the main alerting and notification platform, the team can respond faster and with more confidence than before. “All of our monitoring ties into PagerDuty,” Ryan explained. “We have a good idea of what the incident is, based off the context embedded in the message of that alert. When we fix it, it’ll auto-resolve.”
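The alert flow Ryan describes, where an alert carries its own context and closes itself once the problem is fixed, can be illustrated with a minimal sketch. It assumes the monitoring tool reports through PagerDuty’s Events API v2; the case study does not say how the SolarWinds integration is wired internally, so the helper name `build_event`, the host `web-01`, the dedup key, and the routing key placeholder are all hypothetical. The sketch only builds the request bodies and does not send anything.

```python
import json

# PagerDuty Events API v2 endpoint (bodies below would be POSTed here as JSON).
EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def build_event(routing_key, action, dedup_key,
                summary=None, source=None, severity="critical"):
    """Build the JSON body for one Events API v2 request (hypothetical helper)."""
    if action not in ("trigger", "acknowledge", "resolve"):
        raise ValueError(f"unsupported event_action: {action}")
    event = {
        "routing_key": routing_key,   # integration key of the PagerDuty service
        "event_action": action,
        "dedup_key": dedup_key,       # ties follow-up events to the same alert
    }
    if action == "trigger":
        # A trigger needs a payload; this is the context responders see.
        if not summary or not source:
            raise ValueError("trigger events need a summary and a source")
        event["payload"] = {
            "summary": summary,
            "source": source,
            "severity": severity,     # critical, error, warning, or info
        }
    return event


ROUTING_KEY = "YOUR_INTEGRATION_KEY"  # placeholder, not a real key

# The monitoring check fails: open an alert with context in the summary.
trigger = build_event(ROUTING_KEY, "trigger", "web-01/disk-usage",
                      summary="Disk usage above 90% on web-01",
                      source="web-01")

# The check recovers: a resolve with the same dedup_key closes that alert.
resolve = build_event(ROUTING_KEY, "resolve", "web-01/disk-usage")

print(json.dumps(trigger, indent=2))
print(json.dumps(resolve, indent=2))
```

Because both events share one `dedup_key`, the resolve event closes the alert that the trigger opened, which is the behaviour behind “When we fix it, it’ll auto-resolve.”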

“We love PagerDuty because it works for us now, and it’ll work for us in the future as we deploy it to different squads.”

– Ryan O’Gorman, Senior Operations Engineer, Auto Trader UK

Driving More Accountability on the Public Cloud Migration Journey

For Auto Trader UK, using PagerDuty is one step in a larger movement to embrace a DevOps culture. “The plan is to move away from a centralized management model and instead distribute alerts to the appropriate development teams, meaning that in the future we have the option to bring them on call so they have more ownership of their product, especially when it goes live in production,” Ryan said.

The shift to a more decentralized incident response model is especially relevant to Auto Trader UK’s public cloud migration. The company is moving from a traditional on-premises environment to Google Cloud for more flexibility and scalability. “As we migrate to a multi-public cloud environment, primarily on Google’s Cloud, a whole new set of tools and monitoring systems will spring up, and we can integrate those with PagerDuty,” stated Ryan.

Since the PagerDuty platform is so versatile, the team has the flexibility to add more development teams when the organization is ready. “We love PagerDuty because it works for us now, and it’ll work for us in the future as we deploy it to different squads. If we decide to change the company structure in the future, PagerDuty will help facilitate that,” he said.

Improving Work-Life Balance

Because PagerDuty captures every alert, the team can now easily stay on top of incidents and respond directly from the PagerDuty mobile app. Work-life balance has improved since the team can manage their own schedules and take action without having to disturb anyone else. “If we need someone to cover an overnight shift for maintenance, we can reroute alerts automatically with PagerDuty and make that transition silently,” explained Ryan. “That’s much better than waking someone at night, just to have them turn off alerts.”

Ensuring Customers Have a Seamless Experience

Aside from mitigating downtime risk and improving team health, the team also uses PagerDuty to proactively communicate with Auto Trader UK customers during incidents. The team uses PagerDuty’s StatusPage.io integration to automatically share updates when issues arise, providing more transparency to end users. “We wanted a more intuitive way of informing our customers whenever an outage occurs,” Ryan shared. “Alerts that go to our StatusPage service in PagerDuty will automatically generate an incident on our status page with relevant information, so our customers know we are already working on the issue. And once the incident is resolved, PagerDuty will resolve the incident within our StatusPage.” So, instead of the team having to manually create StatusPage incidents as part of their investigation, PagerDuty makes end-user notifications an automatic and simple process for Auto Trader UK.

To learn more about what PagerDuty can do for your organization and sign up for a free trial, visit www.pagerduty.com.