PagerDuty Blog


Make Some Noise with Our Custom Alert Sound Contest @pagerduty @AWSreinvent #PickYourPage

We’re excited to announce our first-ever custom alert sound contest! Beginning September 21, 2015, we will accept submissions for a chance to be included as an alert sound in our mobile app. We have a great community, and we want to see them get creative. Or ironic. Or immature. Songs, clever noises, avant-garde recordings of one hand clapping – all are welcome. Send your best creation to pickyourpage@pagerduty.com.


In Announcements, Community, Events


Incidents with a Volume Knob: Introducing Incident Urgencies!

We know that alert fatigue is a big concern for our users. When everything is important, nothing is important. But “non-critical” is not the same thing as “insignificant”; in fact, non-critical issues are often indicative of a larger problem down the road. So now, with Incident Urgencies, users can confidently track all events, and only get woken up for the most important ones.
A big part of what has made PagerDuty useful for our customers is analytics, and being able to see what’s going on with events across all of their systems and monitoring tools. Keeping non-critical events out of PagerDuty means those analytics are only telling part of the story. And the more data you have, the easier it is to prevent incidents from occurring in the future.


In On-Call Life


Why VPs Should Care About Engineer Burnout

Too many companies take the happiness of their engineers for granted. This is a huge mistake, especially since engineers are doing important work for your company: building your product, and then keeping it up-to-date and functioning. Their morale has a direct influence on their performance, and, by extension, your product. Part of the DevOps ethos is getting engineers working together better, smarter, and happier. But why should executives care about that?


In On-Call Life, Operations Performance


The Most Adorable On-Call Tale There Ever Was…

One day, Ethan, whose dad works at Altiscale, heard a sweet song. It was an infectious tune; he couldn’t get it out of his head. Over and over, he heard this song, wafting again and again from his father’s phone. What was this magnificent melody? When would it play again? The song was, technically speaking, a PagerDuty alert: a jingle by the name of “You Made the Server Cry,” recorded Barbershop Quartet-style by some of PagerDuty’s more musical employees. Five-year-old Ethan thought the song was so amazing, he found himself singing it all the time. Pretty soon, he was making up his own PagerDuty alert sounds, and came up with a ditty called, “Something’s Broken,” sung to the tune of “Frère Jacques.” His dad decided to record it and submit it to us as a custom alert sound.


In Alerting


Do You Need Your Ticketing System for Real-Time Incident Management?

Using ticket systems can be fraught with issues: a clunky workflow, mired in process, means that users can’t always move and adapt quickly. While ticketing systems are a great way to manage a ticket queue of ongoing requests, we’ve noticed that many operationally mature companies stay away from ticketing systems for their real-time incident management. Instead, they are using a more lightweight solution, like PagerDuty. A lightweight solution, with a focus on automation, allows them to be more agile, and get things done faster.


In Alerting, On-Call Life


Three Ways to Ramp Up Your Enterprise IT Operations Management

As indicated in a survey conducted by Forrester Research, a well-constructed IT Operations management system provides fast alert notification, keeps business-critical incidents to a minimum, and focuses on automation as a way of addressing issues. What we are actually seeing in the field today, however, doesn’t seem to line up with this approach. According to a recent Forrester thought leadership paper, incident resolution practices today are tactical, reactive, and harmful to commercial success. Listed below are some of our observations of IT Organizations in the Enterprise.


In Community, Events, ITOps & Modern Ops, Operations Performance


It's a Match! Swipe Incidents with PagerDuty Mobile App Update

We’re pleased to announce our fourth major mobile release, which brings some significant improvements to the performance and usability of key parts of the app. With all these changes, it’s faster and easier than ever to see, investigate, and take action on problems in your system — driving down resolution time and helping your team improve your operations performance.


In Alerting, Announcements, Community, Features, On-Call Life


PagerDuty Customer Support & Advocacy Team Wins Stevie® Award

We are delighted to announce that our Customer Support and Advocacy team won the Silver Stevie® Award in the Customer Service Department of the Year category in the 2015 International Business Awards. The award demonstrates PagerDuty’s commitment to its customers, as evidenced by a satisfaction rating that averaged 98.3 percent throughout 2014.


In Announcements, Community, Reliability


On-Call Best Practices: Page Your Manager

Having one person on-call isn’t enough. What happens if your on-call engineer sleeps through their alert? What happens if their phone’s battery dies without them knowing, or if they get an alert at a really inconvenient time, like when stuck on a bus or in traffic? It will happen. We present best practices for backup: one or more people, waiting in the wings, ready to spring into action if your primary on-call is unable to perform their duties at any given time.


In Alerting, Best Practices & Insights, Operations Performance


How Etsy Drives a Culture of Empathy, Autonomy, Learning

Etsy occasionally runs an engineer exchange program, where they trade engineers with another tech company to give both organizations insight into what the other does differently. PagerDuty was their most recent participant, and in May, I had the pleasure of spending a week at Etsy’s office in Brooklyn. I learned from their practices, observed what they were doing well, and gained insight into their team dynamics. Etsy has an amazing culture, and I observed the customs they put into place to maintain their environment of empathy, autonomy, and learning. It was a great example of the traditions a company can foster to maintain a productive and happy work environment.


In Alerting, Community, On-Call Life, Partnerships


Important Security Announcement From PagerDuty

Our customers and community are very important to us, and to maintain the transparency that is essential to keeping your trust, we wanted to tell…


In Announcements


New Updates to Advanced Analytics

We’re pleased to announce improvements to our reporting capabilities that enable teams to gain even greater insight. Now, teams can optimize their monitoring by visualizing metrics such as common incidents, SLA performance, and noisy incidents.


In Alerting, Announcements, Features, ITOps & Modern Ops, Product


#HugOps in Practice: Empathy Skills for DevOps

We think we’re doing the whole DevOps thing right — new hires can deploy on day one, Travis CI is humming along, and we own the code we ship. But then something breaks, something doesn’t go according to plan, tempers flare up, and all that warm, fuzzy collaboration seems to evaporate. What’s going on? What happened to #HugOps?


In DevOps, HumanOps


PagerDuty + Opsmatic = Faster incident resolution

Opsmatic provides real-time visibility of any change to the live state of your infrastructure and intelligently alerts you before trouble begins. The recent addition of Assertions gives you a precise way to check and enforce policy across all your hosts. It’s only natural that Opsmatic has partnered with PagerDuty to ensure flawless alerting and effective incident collaboration. PagerDuty’s operations performance platform ensures that the right people on your team get alerted and can resolve incidents before they become emergencies.


In Announcements, Community, Partnerships


Lessons from Virtuoso: Three Steps You Can Take to Reduce Alert Volume by up to 94% in Three Weeks

We recently sat down with Shawn Motley, Senior DevOps Engineer at Virtuoso, to talk about his experiences with PagerDuty and the Event Enrichment Platform (EEP). Virtuoso is a travel portal for high-end clients, with over 200 employees and 8 web properties. When Virtuoso began focusing on their DevOps initiative 7 months ago, they were receiving thousands of events every 24 hours, the majority of which were noise. Learn how they reduced their alert volume by 94% in 3 weeks with PagerDuty and Event Enrichment by following 3 easy steps.


In Alerting, Partnerships


What is Operational Maturity?

Long-time PagerDuty customers Dropbox, Flipboard, and Splunk spoke about their hard-won experience, shared war stories, and discussed what they’ve learned about operations at scale. They also had advice about how what they’ve learned can be applied to other teams. We were delighted to talk with customers, partners, and the extended community about what it means to be operationally mature. Here is what was said about Operational Maturity.


In Community


Why We Didn’t Build a Native Chat Client

Transparency and collaboration are at the core of DevOps philosophy, and ChatOps is an important aspect of both. ChatOps puts an entire team or organization’s work in one place – everyone’s actions, notifications and diagnoses happen in full view. A native PagerDuty chat client would be designed for use during incidents, and wouldn’t replace the chat client you use every day. Having two different chat records, which a native chat client would encourage, runs counter to the DevOps philosophy.


In Alerting, Features, Operations Performance


The Best Metrics for Driving Cultural Change in DevOps Teams

Everyone wants to optimize their team’s performance, but coming up with a good plan for doing so isn’t always easy. That’s why operationally mature DevOps teams use metrics to gain valuable insight into their work, enhance their capacity, and drive cultural change. Here we outline the key metrics that you should be monitoring and talk about how they can influence your team’s culture and performance.


In Alerting, Best Practices & Insights, DevOps, On-Call Life, Operations Performance