Imagine the frustration you feel when you’re writing something in Google Docs and suddenly lose your Internet connection. Or the panic you experience when you’re searching the Notes app on your phone for one very particular note you typed on your computer about elephants in Djibouti so you can win the trivia game, and you can’t find it.
From meeting notes to random trivia tidbits, Evernote’s job is to help people create, assemble, nurture, and share information. Our unique search capabilities allow people to find information when they need it, no matter the format it was stored in—whether in a note, image, PDF, or voice recording. Our product is a cross-platform software-as-a-service application designed to enable people to organize, personalize, consume, and share thoughts from any device at any time. We currently have over 220 million people using our product globally, and that number increases daily.
As SRE Manager, I lead a team of site reliability engineers who are responsible for customer happiness, which means ensuring that our product works as intended. We aim for minimal downtime, but if downtime does happen, we need to act fast and resolve the issue as soon as possible.
This is where PagerDuty comes in. When I joined Evernote in 2012, we were using PagerDuty primarily for alerts and notifications, as well as on-call rotation scheduling. In 2016, we began a major evolution of our hosting infrastructure, which centered on migrating many workloads to Google Cloud Platform. By moving to the cloud, engineers were able to iterate and build services more quickly than ever before.
But with this increased agility came new challenges—namely, tracking key performance indicators that tie into our service-level objectives (SLOs), which we use internally to identify which incidents have the most negative impact on the customer journey.
For example, our customers care about how long it takes to open, write, and sync a note across their devices, so when any one of those actions experiences an issue, my team needs to be aware immediately and resolve that incident as quickly as possible. On the other hand, if one server goes down while eight others are still running, we’ll still receive an alert, but if it doesn’t affect our customers’ experience (and our SLO), it probably isn’t a big deal and we can plan to address it later. PagerDuty helps with this by funneling all of our alerts together and grouping them so we can figure out what to prioritize, allowing us to look at things from the top of the funnel down rather than from the bottom up. Additionally, the platform’s advanced analytics capabilities give us a single source of truth for visibility into production issues.
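As a rough illustration of the triage idea described above, here is a minimal Python sketch. It is not PagerDuty's API or Evernote's actual tooling: the `Alert` class, the `affects_slo` flag, the `prioritize` function, and the service names are all hypothetical, chosen only to show how grouped alerts might be split into "act now" and "address later" based on SLO impact.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert record; the fields are assumptions for illustration."""
    service: str       # e.g. "note-sync"
    summary: str
    affects_slo: bool  # True if a customer-facing SLO is at risk


def prioritize(alerts):
    """Group alerts by service, then split the groups into (urgent, deferred)."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[alert.service].append(alert)

    urgent, deferred = [], []
    for service, members in groups.items():
        # A group is urgent if any alert in it threatens an SLO.
        target = urgent if any(a.affects_slo for a in members) else deferred
        target.append((service, members))
    return urgent, deferred


alerts = [
    Alert("note-sync", "sync latency above threshold", True),
    Alert("web-frontend", "1 of 9 servers down", False),
    Alert("note-sync", "sync error rate rising", True),
]
urgent, deferred = prioritize(alerts)
```

In this sketch, the two `note-sync` alerts collapse into a single urgent group, while the lone server failure lands in the deferred list because the remaining servers keep the customer experience intact.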
As we continue to grow, we plan to expand our use of PagerDuty within the company, specifically by using the available postmortem templates and incident response plays to further automate our incident response process.
Garrett Plasky is SRE Manager at Evernote. His team is responsible for running Evernote’s production service infrastructure. See the full case study to learn more about Evernote’s story.