Xero is a global small business platform for accountants, bookkeepers, and small businesses. Founded in 2006, the platform offers small business owners and their advisors automatic bank and credit card account feeds, invoicing, accounts payable, and standard business and management reporting.
Xero has an easy-to-use, intuitive interface so that even small business owners with little bookkeeping experience can accurately account for their transactions. A comprehensive education portal and award-winning customer service further support small business owners if they have questions. For its active community of accounting partners, Xero offers additional functionality, such as a practice manager, advisory tools, and an app marketplace.
With offices in the U.S., U.K., Asia, Australia, and New Zealand, Xero has more than 1.2 million subscribers in over 180 countries who rely on its software to help run their businesses. It’s therefore very important for Xero’s platform to be dependable—a responsibility that falls on the company’s developers and site reliability engineers.
Anthony Angell, one of the Site Reliability Engineer Team Leads, explained that when he joined the company a few years ago, Xero was already using PagerDuty to manage two schedules. The production environment was supported by Operations teams located in Auckland, New Zealand, and Denver, Colorado. However, as Xero continued to grow rapidly, it became increasingly challenging for the Operations teams to scale and coordinate schedules and escalation policies across the two sites.
In 2016, Xero implemented a DevOps approach incorporating Site Reliability Engineering (SRE) to manage the production environment and overhauled its incident management processes. Rather than having the operations teams oversee the entire production environment, this new incident management framework relied on the teams that built the software to be available and on-call in the event of an incident—regardless of whether they were a developer or a QA engineer.
This meant many more people and teams were added to on-call schedules, and Xero needed a way to manage and scale the on-call groups, which is where PagerDuty came in. “[PagerDuty] helped us to be able to scale the on-call groups within the business quite easily,” Angell shared. “It has also given us and the business a better support structure.”
With PagerDuty, the Site Reliability Engineering team was also able to educate many other teams about incident management and how alerting works on the platform. The result? Customers are seeing quicker resolution times because the people who developed, built, and continue to maintain the code are also the first responders should something go wrong. “The ability to get a hold of our responders in a timely fashion via different methods adds a lot of business value,” said Angell.
To further automate and scale the incident management process, Xero’s Site Reliability Engineering team leverages ChatOps to support hundreds of employees around the world. Xero’s homegrown chatbot, “Multivac,” is integrated into the company’s Slack account and leverages PagerDuty’s API to automate several critical activities within Xero’s incident management framework.
Using Multivac, Xero can onboard a new team and on-call schedule into PagerDuty by sending a request to Xero’s GitHub repository to automatically enable the configuration. Incident managers can use Multivac to notify the right team members to initiate the incident response process within PagerDuty and create a unique Slack channel for the incident. Users can also request status updates on recent production releases or active alerts from Multivac, which provides the context needed to troubleshoot incidents more quickly. By offloading many activities to Multivac and PagerDuty, Xero has been able to respond to and resolve incidents much faster.
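To make the "notify responders and open a channel" step more concrete, here is a minimal sketch of what a chatbot might prepare before calling out to PagerDuty and Slack. Multivac's internals are not described here, so everything in the sketch is an assumption: the function names, the routing key, the incident ID, and the `inc-` channel prefix are hypothetical, and the request body follows the shape of PagerDuty's public Events API v2, which may not be the API Multivac actually uses. The code only builds the data and sends nothing over the network.

```python
import json
import re


def build_trigger_event(routing_key: str, summary: str, source: str,
                        severity: str = "critical") -> dict:
    """Build a request body in the shape of an Events API v2 'trigger' event."""
    allowed = {"critical", "error", "warning", "info"}
    if severity not in allowed:
        raise ValueError(f"severity must be one of {sorted(allowed)}")
    return {
        "routing_key": routing_key,   # identifies the target service integration
        "event_action": "trigger",    # opens a new alert
        "payload": {"summary": summary, "source": source, "severity": severity},
    }


def incident_channel_name(incident_id: str, summary: str) -> str:
    """Derive a Slack-friendly channel name: lowercase, hyphenated, max 80 chars."""
    slug = re.sub(r"[^a-z0-9]+", "-", summary.lower()).strip("-")
    return f"inc-{incident_id.lower()}-{slug}"[:80]


# Hypothetical values, for illustration only.
event = build_trigger_event(
    "EXAMPLE_ROUTING_KEY", "Checkout latency above threshold", "multivac-sketch"
)
body = json.dumps(event)  # what a bot would POST to the events endpoint
channel = incident_channel_name("P123ABC", "Checkout latency above threshold")
print(body)
print(channel)
```

Keeping payload construction separate from the network call means the bot's logic can be tested without touching a live PagerDuty or Slack account.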
“In a one year span, from January 2017 to January 2018, PagerDuty analytics showed us that we saw a 40 percent reduction in high-urgency alerts. Not only that, but MTTR for high-urgency alerts, the highest urgency level, is down 74 percent.”
#PeopleFirst: Improved Work-Life Balance With PagerDuty
One of Xero’s core values is “human,” which puts a strong emphasis on people, and the company expanded its use of the PagerDuty platform by leveraging analytics capabilities to gain insight into team health. “The analytics insight is helpful for our managers, particularly those on other teams, because they can see from the data how many alerts their team received over a specific time period,” explained Angell. “This is useful for when we need to take a closer look at the reasons for engineer fatigue. For example, we want to know if on-call responders received an unusually high number of alerts in a short time period, as that could put them at risk of burnout.”
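The kind of check Angell describes, spotting responders who received an unusually high number of alerts in a short period, can be sketched with a sliding window over alert timestamps. This is not PagerDuty's analytics implementation. The function name, the eight-hour window, the threshold of five alerts, and the sample responders are all hypothetical values chosen for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def flag_alert_bursts(alerts, window=timedelta(hours=8), threshold=5):
    """Return {responder: peak_count} for responders who received more than
    `threshold` alerts within any span of length `window`."""
    by_responder = defaultdict(list)
    for responder, ts in alerts:
        by_responder[responder].append(ts)

    flagged = {}
    for responder, times in by_responder.items():
        times.sort()
        start = 0
        peak = 0
        for end, ts in enumerate(times):
            # Advance the left edge until the span fits inside `window`.
            while ts - times[start] > window:
                start += 1
            peak = max(peak, end - start + 1)
        if peak > threshold:
            flagged[responder] = peak
    return flagged


# Invented sample data: six alerts in under three hours for one responder,
# two alerts a day apart for another.
base = datetime(2018, 1, 15, 22, 0)
alerts = [("alice", base + timedelta(minutes=30 * i)) for i in range(6)]
alerts += [("bob", base), ("bob", base + timedelta(days=1))]

flagged = flag_alert_bursts(alerts)
print(flagged)
```

In practice the window and threshold would be tuned per team, since a noisy service and a quiet one have very different baselines.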
Additionally, Angell’s favorite part about PagerDuty is how it gives teams flexibility and ownership when it comes to on-call scheduling. Instead of having one team overseeing everything as before, Xero now has a number of distributed teams empowered to manage their own on-call schedules. “We’ve educated a lot of teams around incident management and how alerting and PagerDuty works, and it’s actually given the business a better MTTR,” said Angell.
Xero is expanding its use of the PagerDuty Digital Operations Management platform across a broader range of users and use cases. The company has already taken some steps to evaluate team health on its own, and it hopes to gain more in-depth insight into how its teams are performing by adopting PagerDuty’s Operational Health Management Service (OHMS).