Blog

Alerting, Announcements, Community, Features, On-Call Life

Event Enrichment Platform (EEP) Joins PagerDuty to Cut Alert Fatigue

Operations teams are receiving more telemetry data from monitoring systems than ever before. But they are struggling to sift through this data to find what really matters - resulting in alert fatigue and missed alerts. For this reason, we’re proud to announce that long-time partner Event Enrichment HQ is joining the PagerDuty family to deliver the industry’s first integrated event management and incident resolution platform. Adding Event Enrichment HQ and its keystone product, the Event Enrichment Platform (EEP), to PagerDuty helps you quiet your noisy monitoring systems, reduce alert fatigue, and slash your incident resolution times.

3 min read

Community

Easy Salesforce Deployments Using Slack and GitHub

Deploying in Salesforce requires a lot of steps. Each one must be performed manually and the entire process is prone to error. If you want to deploy to multiple environments, you have to repeat this entire process for each one. I wanted to use the same deployment system that we use for everything else at PagerDuty: an in-house deployment system named Igor controlled through our Lita bot called OfficerURL (URL for short), which allows us to deploy with a single command in Slack.

5 min read

Alerting, DevOps

Four Things We Learned about DevOps in London

Today’s customer expects everything to be fast and always on, so uptime is crucial. This creates an entirely new set of business challenges for organisations with complex IT departments and a need for more agile IT Ops. We reviewed the top trends we are learning from customers on the road and what you need to consider when transitioning from a more traditional IT organisation.

3 min read

Alerting, On-Call Life, Operations Performance

Five Ways to Create a Data-Driven Culture

No one should need to be convinced of the value of good data. It gives you the confidence to make decisions quickly and with less risk, it allows you to measure your success, and it lets you know when you need to adjust your course. But there’s a difference between knowing the value of data and creating a culture around it. A data-driven culture is a culture where everyone quantifies their actions as much as possible, and asks themselves how their teams are having a tangible impact on the business. It turns your entire organization into a squad of analysts. But fostering a data-driven culture isn’t always easy. Here are five steps that will help you get there.

3 min read

Alerting, Announcements, Community, Partnerships

Give Silent Failures a Voice with Dead Man’s Snitch and PagerDuty

Don't let the hardboiled-sounding name of our latest integration scare you off, because this monitoring service is a great way to get notified when one of your mission-critical scheduled tasks suddenly sleeps with the fishes. Dead Man’s Snitch is an uptime-monitor for cron or periodic jobs like backups or batch processing, and it alerts you when your jobs don’t run so you can investigate before it becomes a problem.

3 min read

Alerting

Up Your Coaching Game with User Reporting

Introducing User Reporting, the latest addition to PagerDuty’s Advanced Analytics suite. User Reporting helps managers and teams understand how individual team members are responding to incidents. Now managers can see how many incidents each responder has received, acknowledged, reassigned, or moved up the chain of command due to non-acknowledgement. With this information, managers can work with their teams to make sure every team member is in the right position and that workload is spread properly across the team.

4 min read

Announcements, Community, Events

Make Some Noise with Our Custom Alert Sound Contest @pagerduty @AWSreinvent #PickYourPage

We’re excited to announce our first-ever custom alert sound contest! Beginning September 21, 2015, we will accept submissions for a chance to be included as an alert sound in our mobile app. We have a great community, and we want to see them get creative. Or ironic. Or immature. Songs, clever noises, avant-garde recordings of one hand clapping - all are welcome. Send your best creation to pickyourpage@pagerduty.com.

2 min read

On-Call Life

Incidents with a Volume Knob: Introducing Incident Urgencies!

We know that alert fatigue is a big concern for our users. When everything is important, nothing is important. But “non-critical” is not the same thing as “insignificant”; in fact, non-critical issues are often indicative of a larger problem down the road. So now, with Incident Urgencies, users can confidently track all events, and only get woken up for the most important ones. A big part of what has made PagerDuty useful for our customers is analytics, and being able to see what’s going on with events across all of their systems and monitoring tools. Keeping non-critical events out of PagerDuty means those analytics are only telling part of the story. And the more data you have, the easier it is to prevent incidents from occurring in the future.

3 min read

On-Call Life, Operations Performance

Why VPs Should Care About Engineer Burnout

Too many companies take the happiness of their engineers for granted. This is a huge mistake, especially since engineers are doing important work for your company: building your product, and then keeping it up-to-date and functioning. Their morale has a direct influence on their performance, and, by extension, your product. Part of the DevOps ethos is getting engineers working together better, smarter, and happier. But why should executives care about that?

3 min read

Alerting

The Most Adorable On-Call Tale There Ever Was…

One day, Ethan, whose dad works at Altiscale, heard a sweet song. It was an infectious tune; he couldn’t get it out of his head. Over and over, he heard this song, wafting again and again from his father’s phone. What was this magnificent melody? When would it play again? The song was, technically speaking, a PagerDuty alert: a jingle by the name of “You Made the Server Cry,” recorded Barbershop Quartet-style by some of PagerDuty’s more musical employees. Five-year-old Ethan thought the song was so amazing, he found himself singing it all the time. Pretty soon, he was making up his own PagerDuty alert sounds, and came up with a ditty called, “Something’s Broken,” sung to the tune of “Frère Jacques.” His dad decided to record it and submit it to us as a custom alert sound.

2 min read

Alerting, On-Call Life

Do You Need Your Ticketing System for Real-Time Incident Management?

Using ticket systems can be fraught with issues: a clunky workflow, mired in process, means that users can’t always move and adapt quickly. While ticketing systems are a great way to manage a ticket queue of ongoing requests, we’ve noticed that many operationally mature companies stay away from ticketing systems for their real-time incident management. Instead, they are using a more lightweight solution, like PagerDuty. A lightweight solution, with a focus on automation, allows them to be more agile, and get things done faster.

3 min read

Community, Events, ITOps & Modern Ops, Operations Performance

Three Ways to Ramp Up Your Enterprise IT Operations Management

As indicated in a survey conducted by Forrester Research, a well-constructed IT Operations management system provides fast alert notification, keeps business-critical incidents to a minimum, and focuses on automation as a way of addressing issues. What we are actually seeing in the field today, however, doesn’t seem to line up with this approach. According to a recent Forrester thought leadership paper, incident resolution practices today are tactical, reactive, and harmful to commercial success. Listed below are some observations we are seeing with IT Organizations in the Enterprise.

2 min read

Alerting, Announcements, Community, Features, On-Call Life

It's a Match! Swipe Incidents with PagerDuty Mobile App Update

We're pleased to announce our fourth major mobile release, which brings some significant improvements to the performance and usability of key parts of the app. With all these changes, it’s faster and easier than ever to see, investigate, and take action on problems in your system — driving down resolution time and helping your team improve your operations performance.

2 min read