by Andrew Marshall
July 11, 2019
I love writing software, but I hate dealing with bugs. They take you away from what you want to be doing and often lead you into a rabbit hole. At Sentry—an open-source error tracking platform that provides complete app logic, deep context, and visibility across the entire stack in real time—we have a few tips that we’ve honed over time to make error resolution painless (ok, less painful), including an official integration with PagerDuty.
You’ll find a list of those best practices below. We wish you a quick trip down the where-is-this-bug-and-how-can-I-fix-it rabbit hole the next time you’re notified at 3 a.m.
So you stumble, in your 3 a.m. deep-sleep haze, to your computer and try to think clearly about the PagerDuty alert you just received. The first thing you’ll want to do is identify the impact of the issue.
Is this an issue only affecting Internet Explorer? Only customers on a particular datacenter? You know the best impact questions to ask, and now is the time to ask them.
Sentry implements a system called tags—various key/value pairs that get assigned to an event—that are summarized at the issue level. You can save time and stress by avoiding the back-and-forth with your customers about their browser information by allowing tags to uncover the bug’s hotspots.
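To make the idea of issue-level tag summaries concrete, here is a minimal sketch in plain Python. It is not Sentry's implementation. The event shape, the tag names, and the `summarize_tags` and `hotspots` helpers are all hypothetical, chosen only to show how counting key/value pairs across events can surface a hotspot.

```python
from collections import Counter, defaultdict


def summarize_tags(events):
    """Count how often each tag value appears across a list of events.

    Each event is assumed (for this sketch) to be a dict with an optional
    "tags" dict of key/value pairs.
    """
    summary = defaultdict(Counter)
    for event in events:
        for key, value in event.get("tags", {}).items():
            summary[key][value] += 1
    return summary


def hotspots(summary, total_events):
    """Return the most common value per tag and its share of all events."""
    result = {}
    for key, counts in summary.items():
        value, count = counts.most_common(1)[0]
        result[key] = (value, count / total_events)
    return result


# Hypothetical events attached to a single issue.
events = [
    {"tags": {"browser": "IE 11", "datacenter": "us-east"}},
    {"tags": {"browser": "IE 11", "datacenter": "eu-west"}},
    {"tags": {"browser": "IE 11", "datacenter": "us-east"}},
    {"tags": {"browser": "Chrome 75", "datacenter": "us-east"}},
]

summary = summarize_tags(events)
top = hotspots(summary, len(events))
for tag, (value, share) in top.items():
    print(f"{tag}: {value} ({share:.0%} of events)")
```

A summary like this answers the impact questions above (which browser, which datacenter) without asking the customer for anything.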
If the cause of the issue isn’t obvious (and it’s probably not or you wouldn’t have shipped it in the first place), it can be helpful to understand how your user got there. Sentry automatically tracks the path your user took as breadcrumbs. Of course, you could also manually stitch it together by searching through your logs, but that sure does sound like a pain.
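The breadcrumb idea can be sketched as a bounded trail of recent actions that gets attached to an error when one occurs. The sketch below is an illustration only, not how Sentry records breadcrumbs. The `BreadcrumbTrail` class, its method names, and the example categories are assumptions made for this example.

```python
from collections import deque


class BreadcrumbTrail:
    """Keep only the most recent user actions leading up to an error."""

    def __init__(self, max_crumbs=100):
        # A deque with maxlen silently drops the oldest entry when full.
        self._crumbs = deque(maxlen=max_crumbs)

    def record(self, category, message):
        self._crumbs.append({"category": category, "message": message})

    def snapshot(self):
        """Return the trail, oldest first, to attach to an error report."""
        return list(self._crumbs)


# Hypothetical session: the trail is capped at three entries.
trail = BreadcrumbTrail(max_crumbs=3)
trail.record("navigation", "/products")
trail.record("navigation", "/cart")
trail.record("ui.click", "button#checkout")
trail.record("http", "POST /api/orders returned 500")

path_to_error = trail.snapshot()
for crumb in path_to_error:
    print(f'{crumb["category"]}: {crumb["message"]}')
```

Capping the trail keeps memory use predictable while still preserving the steps closest to the failure, which are usually the ones that matter.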
At this point, you’ve identified the rabbit hole and taken a step or two inside. Now take a few more steps. And then a few more.
Ideally, you’ve already reproduced the bug, and now you can dig into the code to find out exactly what went wrong and why. The key to answering those questions is context—you’re going to need as much as you can get to find your way out of this hole.
One way to gain that context is by taking a look at your stack trace, which presumably gives you an idea of where the exception was thrown. Stack traces should give you insight into the sequence of events that led to the bug as well as the line of code where you can find the bug. So helpful!
Want even more context? Sentry gives you a look at your stack trace, but also enhances it with everything you wish it had, including your un-minified source code and stack locals.
Sure, once you locate the bug and know what’s causing it, you could fiddle around until you fix it. Or not. Or you could go back to doing whatever you were doing before (eating, sleeping, enjoying life) by letting the right person handle the bug. Delegating bugs may not be the most enjoyable task, but it should ultimately simplify and expedite the error resolution process.
As PagerDuty customers, we can add developers to an incident as stakeholders or responders, depending on their role.
Sentry provides deep integrations with source code management platforms to uncover the commits that likely introduced the error—suspect commits.
Sentry also suggests the developer who can best fix the problem.
Let’s be honest—not all errors are worth being woken up for. If you’re getting unactionable notifications, do something about it. Use your tools to your advantage.
Perhaps you only want to get notified via PagerDuty if an error impacts more than 100 customers in one minute. Or maybe you want to be notified whenever very specific types of errors are thrown. You can easily configure all of this via Sentry’s alert rules.
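The logic behind a rule like "more than 100 customers in one minute" can be sketched as a sliding-window count of distinct users. This is a plain-Python illustration of the thresholding idea, not Sentry's alert rule engine or its configuration format. The `ImpactThreshold` class and its parameters are hypothetical, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque


class ImpactThreshold:
    """Fire when more than `max_users` distinct users hit an error
    within the last `window_seconds` seconds."""

    def __init__(self, max_users=100, window_seconds=60):
        self.max_users = max_users
        self.window_seconds = window_seconds
        self._seen = deque()  # (timestamp, user_id), oldest first

    def record(self, user_id, timestamp):
        """Record one error event; return True if the alert should fire.

        Assumes timestamps (in seconds) are non-decreasing.
        """
        self._seen.append((timestamp, user_id))
        # Drop events that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while self._seen and self._seen[0][0] <= cutoff:
            self._seen.popleft()
        distinct_users = {uid for _, uid in self._seen}
        return len(distinct_users) > self.max_users


# Small numbers keep the example readable: alert on more than 2 users.
rule = ImpactThreshold(max_users=2, window_seconds=60)
results = [
    rule.record("alice", 0),
    rule.record("bob", 10),
    rule.record("alice", 20),  # repeat user, still 2 distinct
    rule.record("carol", 30),  # third distinct user within the window
]
print(results)
```

Counting distinct users rather than raw events keeps one customer stuck in a retry loop from paging anyone at 3 a.m.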
While these best practices can be done outside of Sentry and PagerDuty, why bother with anything other than the very best? After all, tens of thousands of customers can’t be wrong.
And, as you’ve perhaps guessed, Sentry and PagerDuty work great together. So great, in fact, that we have an official integration with PagerDuty. Our integration sends alerts via PagerDuty for the incident response and intelligence workflows you define. Your development and operations teams get a full view of errors from everywhere in your app, with alerts routed according to your escalation policy, notification urgency, and response behavior.
Neil Manvar is a Solutions Engineer Manager at Sentry. After years of software development experience, Neil moved into solutions engineering for DevOps. He is always on call and will always stop for Cinnabon.