Flixbus Drives Operational Efficiency With PagerDuty

Size: 1,000 – 5,000 Employees

Industry: Transportation

Location: Munich, Germany

Customer Since: 2017

Key Integrations:

New Relic
AWS CloudWatch

Founded in 2013, Flixbus is Europe’s largest long-distance bus service and a unique combination of tech startup, e-commerce platform, and sustainable transportation company. It quickly became the leading long-distance travel provider in Germany before expanding to other European countries in 2015 and to the United States in 2018. Every day, more than 300,000 passengers travel to over 1,700 destinations across 28 countries.

Flixbus is revolutionizing traditional bus travel by providing user-friendly features, including the Flixbus App, e-ticketing, GPS live tracking, and the automated Delay-Management System, all of which run in real time. And with so many transportation options in Europe, the developers at Flixbus are constantly delivering more features to remain competitive. “The DevOps team is probably the most important team when it comes to business continuity because they are the first line of defense when it comes to maintaining the customer experience,” said Jasper Spruytte, Engineering People Lead at Flixbus. “If something isn’t working, then we don’t have a platform to sell tickets or check in passengers.”

For Spruytte, this means that his Ops and DevOps teams must have 24/7 coverage across all digital channels, with real-time visibility into system performance. But with regulations governing overtime compensation, Flixbus must manage on-call schedules and escalations efficiently and effectively to keep costs low.

Accelerating Response Helps Flixbus Go the Distance

Before PagerDuty, the on-call process was manual and time-consuming. Even once the appropriate people were notified, the response was further delayed by a lack of visibility into application performance. “The modus operandi before PagerDuty went like this: Somebody knows something doesn’t work, people get pinged on Slack, and someone will eventually respond and fix it. This was a horrible process,” Spruytte shared.

Flixbus envisioned a more modern approach that could empower teams to respond in real time and fix issues much faster so that its passengers could continue to travel relaxed and stress-free. “One of our board members proposed PagerDuty,” explained Spruytte. “PagerDuty has a good resume of customers, so we opted for that. We did a small pilot in the beginning and then quickly adopted it into our system.”

The teams Spruytte oversees make good use of PagerDuty’s 300+ integrations and custom APIs to monitor their applications, most notably New Relic and AWS CloudWatch. Flixbus also monitors Kubernetes clusters and Adyen payment processing. After PagerDuty was deployed, mean time to respond improved significantly, and from 2016 to 2017 the teams saw a 60 percent decrease in high-urgency incidents.

Taking On-Call Compensation in a New Direction

Beyond incident response, Flixbus will be using the PagerDuty platform to automate its revamped on-call compensation program. Typically, organizations pay overtime as a set percentage on top of salaries for their on-call resources. But the teams at Flixbus wanted more options in terms of compensation for on-call rotations, and Spruytte wanted to provide incentives for teams to respond faster.

So he created a program that converted on-call compensation into points, where responders earned points for different scenarios. For instance, a person on call would automatically receive 200 points and would then earn a varying number of points depending on whether an incident occurred off hours during the week or on the weekend. If other people were added to the incident response, they would also earn points, thereby encouraging collaboration and resolving the incident faster. These points could then be used in a marketplace where people could choose additional paid time off, cash payouts, gift cards, or other options.
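The points scheme described above can be sketched in a few lines of Python. This is only an illustration, not the Flixbus implementation: the 200-point base award is the one figure taken from the text, while the off-hours, weekend, and added-responder values, the business-hours window, and the names Incident, incident_points, and tally_points are all hypothetical placeholders.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, Tuple

BASE_ON_CALL_POINTS = 200        # from the case study
OFF_HOURS_WEEKDAY_POINTS = 50    # assumed value, not from the source
WEEKEND_POINTS = 100             # assumed value, not from the source
ADDED_RESPONDER_POINTS = 25      # assumed value, not from the source
BUSINESS_HOURS = range(9, 18)    # assumed window: 09:00 to 17:59 local time


@dataclass
class Incident:
    """Hypothetical record of one incident and who worked on it."""
    started_at: datetime
    primary: str
    added_responders: Tuple[str, ...] = ()


def incident_points(started_at: datetime) -> int:
    """Points for the on-call responder, based on when the incident began."""
    if started_at.weekday() >= 5:            # Saturday or Sunday
        return WEEKEND_POINTS
    if started_at.hour not in BUSINESS_HOURS:
        return OFF_HOURS_WEEKDAY_POINTS
    return 0                                 # weekday, inside business hours


def tally_points(on_call: Iterable[str],
                 incidents: Iterable[Incident]) -> Dict[str, int]:
    """Total points per person for one on-call rotation."""
    # Everyone on call starts with the base award.
    totals = {person: BASE_ON_CALL_POINTS for person in on_call}
    for inc in incidents:
        totals[inc.primary] = (
            totals.get(inc.primary, 0) + incident_points(inc.started_at)
        )
        # People pulled in to help earn points too, rewarding collaboration.
        for helper in inc.added_responders:
            totals[helper] = totals.get(helper, 0) + ADDED_RESPONDER_POINTS
    return totals
```

With these placeholder values, a responder who is on call for a rotation and handles one weekend incident and one weekday off-hours incident would total 350 points, and a colleague added to one of those incidents would receive 25.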

Currently, the points-based system is manually tracked and updated in a spreadsheet for each on-call person. Since PagerDuty’s scheduling capabilities automatically track when resources are on call and actively working incidents, Spruytte plans to integrate the points-based system with PagerDuty for easier tracking and management.

The Road Ahead

In addition to the teams that Spruytte manages, the internal IT and payment teams at Flixbus are also using PagerDuty, with plans to add seven more teams. With the time saved from faster incident response and improved collaboration across teams, Flixbus developers can devote more time to innovation.

“PagerDuty has proven its worth already. Clarity has been improved, so people know what they need to do and how fast they have to respond before customers are impacted. The philosophy we want to have is that every team should feel responsible for ensuring the uptime or continuity of their component or product within the framework that we create,” said Spruytte. With PagerDuty behind the wheel of this cultural shift, Flixbus plans to maintain its leading position in long-distance bus travel.

To learn more about what PagerDuty can do for your organization and sign up for a free trial, visit www.pagerduty.com.