Application Performance Monitoring (APM) systems like AppDynamics can provide incredibly rich information about what’s happening with your IT infrastructure, and can identify performance issues before they create big problems. However, this information is only as good as your ability to respond to it. PagerDuty can extend the capabilities of AppDynamics Alert & Respond policies to ensure incidents are noticed, responded to, and fixed quickly.
PagerDuty released Multi-User Alerting in early 2014, making it possible to notify and assign multiple people when an incident is triggered. In addition to assigning multiple users to an incident, multi-user alerting also makes it possible for an incident to have multiple acknowledgers. This post will demonstrate the changes we made to our data model to implement multi-user alerting and the resulting sophistication added to our SQL queries to maintain their performance.
Last week, PagerDuty had the pleasure of attending AWS Summit on our home turf in San Francisco. It was nothing short of epic. The Amazon Web Services crowd is definitely our niche. AWS is a critical component of the PagerDuty platform, and our three founders actually came up with the idea for PagerDuty while working at Amazon! So, naturally, we feel totally in our element surrounded by other AWS fans at these types of events.
No matter what team you’re on, PagerDuty helps you resolve incidents faster. DevOps involves collaboration across multiple teams for better reliability and quality assurance. Having a central, shared tool like PagerDuty to manage incidents across the company makes that collaboration a heck of a lot simpler. Our new team organization feature makes it even easier for different teams like Operations, Development, and Customer Support to work together. Here’s how.
Today we’re announcing the integration of PagerDuty with Webmon, a website monitoring and escalation service that lets you be the first to know when an online service goes down.
PagerDuty is delighted to announce it’s heading to London for its first-ever international conferences. We’re proud to sponsor AWS Summit in London on Wednesday, April 15 and Puppet Camp London on Monday, April 13. We have customers in over 110 countries and we’re very excited about meeting with some of our 350+ UK customers.
You’ve just realized that something has gone critically wrong, and you can’t fix it yourself. Particularly if you work within a collaborative DevOps environment, it’s better to get by with a little help from your friends. Effectively coordinating the incident response across subject matter experts and front-line responders is a secret to operational success that differentiates top teams. So it’s important that you have an effective and efficient way to sound the alarm, and make sure that your conversations are recorded and actionable.
We hosted our first user group last week at PagerDuty HQ! Not only did we gather our awesome customers and enjoy the taco bar and cervezas, but we also got to learn a lot from them and share our roadmap, and our customers learned from each other, too. We really value user feedback as part of how and why we build our product. We wanted to share some key takeaways from our sessions during the event.
PagerDuty alerts. Feeding a newborn gremlin. FOMO. These are the things that keep us up at night. Here at PagerDuty, we know that nothing settles the nerves like eye cuddling a fluffy, adorable cat. That’s why we’re proud to announce OkCats.
If you have a Network Operations Center (or NOC, as the kids call it), you have a skilled set of eyes monitoring your system and alerting your engineers when things go wrong. (If you have something like a NOC, such as a first tier team that processes tickets, we’re looking at you, too). You also probably have strict SLAs and a need for high availability at all times. You can’t waste a second when things go down. Solutions like PagerDuty that help you identify and resolve incidents faster can help you improve your Network Operations Center performance. These solutions can shave minutes off your time to detect incidents (one of our customers took 8 minutes off theirs) and can make it easier for NOC personnel to escalate to experts when needed. We’ve found five ways that our customers use PagerDuty to enhance their NOCs.
When your service goes down, there’s no time to waste. With sweaty palms and an elevated heart rate, you need to figure out what’s wrong, all while communicating your status to your users. Coordinating with your team is complex enough – there’s no room for unnecessary actions. This is where Flowdock’s new and greatly improved PagerDuty integration comes into play.
Outages are chaotic, and it can be difficult to figure out the best way to let your customers know what is going on. One of the first big decisions you’ll need to make is whether you’re going to respond only to people who inquire about the issue, or if you’re going to be more proactive and post updates publicly. Many of the leading technology companies have begun to transparently discuss outages with their customers, and there are a number of good business reasons for doing so. Regardless of your approach, here are 6 things you can do to ensure successful customer communication during outages.
One of the great things about PagerDuty is our API. With our API, you can integrate with a wide variety of partners, and also extend and customize your PagerDuty experience. Our customers have done a number of cool things, including creating custom reports and dashboards, creating status pages to let customers and internal stakeholders know about incidents, and automating the details of their incident response. The PagerDuty API helps you respond to incidents more efficiently. But where do you get started? We highlight some examples of cool tools.
We, as IT professionals, have ever-expanding access to more accurate Ops telemetry. With this data, we have an incredible amount of visibility into what’s going on. However, more information isn’t always a good thing when it comes to alerting. You can definitely have too many alerts, and alert fatigue is a growing problem among Operations teams. More detailed telemetry isn’t bad; it’s just that much of this information is generally better suited for forensics rather than alerting. Event Enrichment and PagerDuty team up to help you battle alert fatigue.
Streamline AWS Security Management with PagerDuty and Evident.io
This is a guest blog post by John Martinez, Principal Solution Architect at Evident.io. At Evident.io, one…
Proactively Manage Application Performance with PagerDuty & Dynatrace
We’re excited to announce a new integration with Dynatrace, a class-leading Application Performance Management (APM) solution. With Dynatrace’s…
Last year, we launched Single Sign-On (SSO) to make it easier and more secure to manage your PagerDuty users. We’re excited to add Google Apps as an SSO partner alongside Okta, OneLogin, Ping Identity, Active Directory, and more. Get the details in this post.