| In Community, Events, On-Call Life

I am not always a fan of “Women in Tech” events. So imagine my surprise to find myself spearheading a Women’s Leadership Circle here at PagerDuty. How do women-focused career events actually help?

Democracy: the great experiment. The voice of the people leading. The end of rigid and overbearing hierarchies. These principles have been with us for over two centuries in government, but many business models still look like the British Empire. As the pace of development continues to scale and customers come to expect real-time response to their concerns, businesses with complex IT departments are transitioning to a DevOps model that gives them the agility to stay up and responsive to the voice of the people. Here we explore how fostering a DevOps culture can build a more democratic workplace and customer experience.

Operations teams are receiving more telemetry data from monitoring systems than ever before. But they are struggling to sift through this data to find what really matters – resulting in alert fatigue and missed alerts. For this reason, we’re proud to announce that long-time partner Event Enrichment HQ is joining the PagerDuty family to deliver the industry’s first integrated event management and incident resolution platform. Adding Event Enrichment HQ and its keystone product, the Event Enrichment Platform (EEP), to PagerDuty helps you quiet your noisy monitoring systems, reduce alert fatigue, and slash your incident resolution times.

No one should need to be convinced of the value of good data. It gives you the confidence to make decisions quickly and with less risk, it allows you to measure your success, and it lets you know when you need to adjust your course. But there’s a difference between knowing the value of data and creating a culture around it. A data-driven culture is a culture where everyone quantifies their actions as much as possible, and asks themselves how their teams are having a tangible impact on the business. It turns your entire organization into a squad of analysts. But fostering a data-driven culture isn’t always easy. Here are five steps that will help you get there.

| In Alerting, On-Call Life

Something goes wrong in your staging environment, and you start seeing “CRITICAL” or “ERROR” all over the place. Oh… I forgot to mention that it’s 3am where you live. Is it really “critical” in that moment? Well, technically it is. The environment is still busted. But do you want to fix it now? Is it urgent?

| In On-Call Life

We know that alert fatigue is a big concern for our users. When everything is important, nothing is important. But “non-critical” is not the same thing as “insignificant”; in fact, non-critical issues are often indicative of a larger problem down the road. So now, with Incident Urgencies, users can confidently track all events, and only get woken up for the most important ones.
A big part of what has made PagerDuty useful for our customers is analytics, and being able to see what’s going on with events across all of their systems and monitoring tools. Keeping non-critical events out of PagerDuty means those analytics are only telling part of the story. And the more data you have, the easier it is to prevent incidents from occurring in the future.

Too many companies take the happiness of their engineers for granted. This is a huge mistake, especially since engineers are doing important work for your company: building your product, and then keeping it up-to-date and functioning. Their morale has a direct influence on their performance, and, by extension, your product. Part of the DevOps ethos is getting engineers working together better, smarter, and happier. But why should executives care about that?

| In Alerting, On-Call Life

Using ticket systems can be fraught with issues: a clunky workflow, mired in process, means that users can’t always move and adapt quickly. While ticketing systems are a great way to manage a ticket queue of ongoing requests, we’ve noticed that many operationally mature companies stay away from ticketing systems for their real-time incident management. Instead, they are using a more lightweight solution, like PagerDuty. A lightweight solution, with a focus on automation, allows them to be more agile, and get things done faster.

| In Alerting, Announcements, Community, Features, On-Call Life

We’re pleased to announce our fourth major mobile release, which brings some significant improvements to the performance and usability of key parts of the app. With all these changes, it’s faster and easier than ever to see, investigate, and take action on problems in your system — driving down resolution time and helping your team improve your operations performance.

Etsy occasionally runs an engineer exchange program, where they trade engineers with another tech company to give both organizations insight into what the other does differently. PagerDuty was their most recent participant, and in May, I had the pleasure of spending a week at Etsy’s office in Brooklyn. I learned from their practices, observed what they were doing well, and gained insight into their team dynamics. Etsy has an amazing culture, and I observed the customs they put into place to maintain their environment of empathy, autonomy, and learning. It was a great example of the traditions a company can foster to maintain a productive and happy work environment.

Everyone wants to optimize their team’s performance, but coming up with a good plan for doing so isn’t always easy. That’s why operationally mature DevOps teams use metrics to gain valuable insight into their work, enhance their capacity, and drive cultural change. Here we outline the key metrics that you should be monitoring and talk about how they can influence your team’s culture and performance.

Whether your server’s CPU is pegged at 100% or someone is chopping down your rainforest, PagerDuty has no opinions on how you use our platform to trigger a response from your on-call team. But here’s one area where we do have a strong opinion: alerting on business metrics. You should do it.

This is a guest blog post written by Anthony Gibbons, the Operations Manager at Airhead Education. Anthony gives his perspective as a startup setting up PagerDuty as their IT Operations Software: “With the advent of cloud services and companies willing to integrate with each other, it is now entirely possible for a small startup to use the same monitoring tools as industry stars such as Airbnb, Pinterest and Path… It probably took me an hour to integrate all of my services with PagerDuty.”

| In Features, On-Call Life, Reliability

No matter what team you’re on, PagerDuty helps you resolve incidents faster. DevOps involves collaboration across multiple teams for better reliability and quality assurance. Having a central, shared tool like PagerDuty to manage incidents across the company makes that collaboration a heck of a lot simpler. Our new team organization feature makes it even easier for different teams like Operations, Development, and Customer Support to work together. Here’s how.

| In Community, Events, On-Call Life

We hosted our first user group last week at PagerDuty HQ! Not only did we gather our awesome customers and enjoy the taco bar and cervezas, but we also got to learn a lot from them and share our roadmap, and our customers learned from each other, too. We really value user feedback as part of how and why we build our product. We wanted to share some key takeaways from our sessions during the event.