On Wednesday night, PagerDuty hosted an event where long-time PagerDuty customers Dropbox, Flipboard, and Splunk spoke about their hard-won experience, shared war stories, and discussed what they’ve learned about operations at scale. They also offered advice on how other teams can apply those lessons. We were delighted to talk with customers, partners, and the extended community about what it means to be operationally mature. Here is what the speakers said about Operational Maturity:
Andrew Fong, Infrastructure Manager at Dropbox:
Operationally mature cultures are ones that are able to understand the tradeoffs that they are making in a production environment and the impact those tradeoffs have on the business.
Joey Parsons, Head of Platform & Operations at Flipboard:
Operationally mature, from our standpoint, is understanding the ramifications of incidents from both a business impact and employee well-being perspective. Being on-call can be either a rewarding or a negative experience for the person responding. Having the operational tools and processes in place to be able to make smart, informed decisions for your business is key.
Sean Jacobs, Infrastructure and Datacenter Operations Lead at Splunk:
Operational Maturity at Splunk is often measured by the effectiveness of our response during a crisis. Being a big data company, we collect information on nearly every facet of our infrastructure, but having the data and having meaningful data are vastly different challenges.
Tim Armandpour, Vice President of Engineering at PagerDuty:
Operational Maturity means being part of a test-driven environment, where high-severity incidents resulting from bugs are very uncommon, and measured. It also means being part of an organization where every team is part of an on-call rotation and uses the same incident management system and methodology for maximum transparency and collaboration. At an Operationally Mature company, reliability and accountability are seen as key factors for a successful business. The more mature you are, the easier it is for your business to be agile, and adapt quickly and change with the market.
Our SEV process (Incident Response) at Dropbox used to be ad hoc with no clear owners other than Senior Engineers. Over the last year we’ve built out a process that identifies a clear owner for coordination and resolution. We built well-defined criteria and tooling so that we could support 350+ Engineers, as well as Product Management, Communications, and Legal. Also, at Dropbox, incidents can be either backend server issues or client issues. (We have desktop software!) So we needed to build a process that works for all.
Becoming mature had a lot to do with the evolution of our on-call and escalation policies. Monitoring is never done and needs to be continually revamped for both quality business and quality of life. Bad alerting very quickly leads to employee dissatisfaction.
A lot of effort gets put into making our alerting and monitoring useful, and not just having a blanket approach to monitoring. Additionally, we put a lot of priority into look-backs and retroactive reviews so we can iterate and improve, versus having to react to the same issues every week.
Every Friday at PagerDuty is Failure Friday, where our engineers intentionally take services offline and try to break our system, to ensure that all of our failsafes are up and running. We take reliability very seriously here, and have three active data centers so we stay online even if one of them is down. We also have a robust incident management policy, and have eliminated non-actionable alerts to the point where our on-call engineers get a few alerts per month at most.