August 24, 2017
Outages are chaotic, and it can be difficult to figure out the best way to let your customers know what is going on. One of the first big decisions you’ll need to make is whether to respond only to people who inquire about the issue or to be more proactive and post updates publicly. Many of the leading technology companies have begun to transparently discuss outages with their customers, and there are a number of good business reasons for doing so. Regardless of your approach, here are six things you can do to ensure successful customer communication during outages.
It’s important to let your customers know that you are aware of the issue and at work on a solution. The initial notification can take many forms: a maintenance page on your website, a social media post, an update to your status page, or perhaps just an internal outage communication to your customer support team. Not every incident requires widespread disclosure, but if your outage is affecting your clients, you should get your initial communication out as quickly as possible. Proactive communication can stave off a mob of irate users and helps you get ahead of the problem.
Every minute counts, so don’t waste time figuring out that you have an incident in the first place. Make sure you have a reliable system that lets you know as soon as something breaks. We obviously have strong opinions in this area, but getting ahead of the problem is a crucial part of customer communication. You can’t convey what you don’t know.
[status] Investigating: We’re investigating a brief traffic interruption that occurred a few minutes ago, more … http://t.co/Xew9prkyvW
— StatusPage.io (@StatusPageIO) February 27, 2015
There are a few pieces of information your customers will be looking for both during and after the outage, but you don’t have to know everything to start getting the word out. You shouldn’t presume to know the cause of the problem if you are still investigating. However, as more information becomes available, be sure to let customers know the root cause. A sanitized summary is fine here. You can save detailed technical messaging for your internal team. Customers will also want to know the problem’s severity and effects, as well as any short-term workarounds you might be putting in place.
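As a rough illustration of the checklist above, here is a minimal Python sketch of an update template. The `OutageUpdate` class, its field names, and the `render_update` function are hypothetical and exist only for this example; they are not part of any PagerDuty or StatusPage.io API. The one rule it encodes is the one described above: state severity, effects, and workarounds, and only name a cause once you actually know it.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OutageUpdate:
    """Hypothetical container for one customer-facing update."""
    stage: str                    # e.g. "Investigating" or "Resolved"
    severity: str                 # short, plain-language severity
    effects: str                  # what customers are seeing
    cause: Optional[str] = None   # leave unset while still investigating
    workarounds: List[str] = field(default_factory=list)


def render_update(update: OutageUpdate) -> str:
    """Build a short plain-text message from the fields above."""
    lines = [
        f"[{update.stage}] Severity: {update.severity}",
        f"Impact: {update.effects}",
    ]
    # Do not guess at a cause; say so when it is not yet known.
    if update.cause:
        lines.append(f"Cause: {update.cause}")
    else:
        lines.append("Cause: still under investigation")
    for workaround in update.workarounds:
        lines.append(f"Workaround: {workaround}")
    return "\n".join(lines)
```

Calling `render_update(OutageUpdate("Investigating", "Major", "Alerts are delayed"))` produces a three-line message whose last line says the cause is still under investigation. Once a sanitized root cause is available, set `cause` and send the update again.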
Make sure you set the right tone. Be concise, authoritative, and serious. Your customers will not appreciate you being jokey or cute when their business is affected. Instead, show humility and consideration, and don’t be afraid to accept fault. Blaming external forces for your outage sends the message that you are not in control.
You might only hear from a fraction of the people affected by the outage, since data shows that ninety-six percent of unhappy customers don’t reach out directly. However, proactively letting customers know about the incident can go a long way toward building customer trust. Customers value honesty, and, in fact, talking about your outages might generate sales by strengthening your image as a forthright business partner. And don’t forget that your customers might have clients of their own, and the information you share helps them fight their own customer service fires. Keeping them up to date signals that you are in control and that you are handling the issue efficiently. It’s natural to be nervous about admitting to an error, since it could make unaffected and potential customers aware of your outage. However, bad word of mouth will do the same thing, and can ultimately be more damaging to your business. If transparency is the route you choose, there are a number of ways you can get the word out.
For your top customers, it may be worthwhile to be proactive and reach out to them personally. Don’t neglect your customer service tickets, either. Especially in the event of an outage, quick responses are important to reassure customers. It also pays: 3 in 5 Americans (59%) would try a different service for a better customer service experience.
Status pages, like StatusPage.io, are a great way to quickly publish updates about your uptime. They give your customers and internal support teams a single, authoritative place to figure out what is going on. PagerDuty and StatusPage.io customers can use our API to automatically update their internal and customer-facing status pages.
Depending on severity, you may want to go further. Social media offers a widespread platform for communicating with your customers. Don’t worry about people seeing that you’ve failed. Seventy-one percent of tweets are ignored, and it only takes two and a half hours for a Facebook post to get seventy-five percent of its total impressions. But customers who are looking for information will see your updates and know that you are on it. Choose your social media wisely. Twitter is a good platform for B2B businesses, but Facebook might be more relevant for a consumer-facing business. Here at PagerDuty, we have a separate Twitter feed just for outage updates, and we only post about outages on our main Twitter feed if they might cause major concern.
PagerDuty systems are 100% up, but some european customers may be experiencing connectivity issues due to the Telia transatlantic outage — PagerDuty (@pagerduty) May 19, 2014
You could also go to where your customers are. Post on HackerNews, or another forum where you know your users are hanging out. It’s a good opportunity to make a team member personally available to customers in a public forum. It gives you a chance to be empathetic and sincere, as well as instantly provide a channel (like an email address) for further resolution. This can help you build a reputation for developer outreach and support.
Long silences will only frustrate your customers, even if they occur in the middle of the night. How are they to know, after an hour of silence, that there’s a human still awake and working on the issue? They might otherwise assume that everyone has given up and gone to sleep. If you have new information, send it out right away. Make sure that communication is coming out at steady intervals, even if you don’t have anything new to report and are simply conveying that your team is still working on it. Regular updates signal to your customers that you’re on top of the situation. Remember that your customers might have customers of their own, and they might be fighting the same fire.
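The cadence rule above can be sketched in a few lines of Python. This is only an illustration: the 30-minute interval, the `HEARTBEAT` wording, and the `pick_message` function are assumptions made for this example, not a recommendation from any tool. The logic is simply that new information goes out immediately, and a "still working on it" message goes out whenever the interval has elapsed with nothing new to say.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed cadence for this sketch; pick whatever suits your incident.
UPDATE_INTERVAL = timedelta(minutes=30)

# Assumed wording for the "nothing new, still on it" message.
HEARTBEAT = "We are still working on the issue and will post again soon."


def pick_message(
    now: datetime,
    last_update: datetime,
    new_info: Optional[str],
    interval: timedelta = UPDATE_INTERVAL,
) -> Optional[str]:
    """Return the message to send now, or None if nothing is due yet."""
    # New information always goes out right away.
    if new_info:
        return new_info
    # Otherwise break the silence once the interval has elapsed.
    if now - last_update >= interval:
        return HEARTBEAT
    return None
```

A responder (or a small script run on a timer) can call `pick_message` with the time of the last public update. A `None` result means the last update is still fresh enough, and anything else should be posted to whichever channels you chose earlier.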
Update: There has been some slowness and connectivity issues for the past few hours. We should be operating at full speed again. — PagerDuty Support (@PagerDutyHelp) June 4, 2014
We are still investigating the issue with delayed alerts. Our apologies for the inconvenience. We will update you here once resolved. — PagerDuty Support (@PagerDutyHelp) June 4, 2014
Update: Incoming/outgoing alerts are being processed as expected. Webhooks and hand-off notifications are in progress. — PagerDuty Support (@PagerDutyHelp) June 4, 2014
Update 2: All PagerDuty services were processing at full speed yesterday evening. Check our blog for an upcoming post-mortem analysis. — PagerDuty Support (@PagerDutyHelp) June 5, 2014
Even after you’ve fixed your outage, you should keep the lines of communication open. Continue to respond promptly. If you’ve promised some sort of restitution, be sure to deliver quickly. You’re trying to restore customer trust, and keeping them waiting for you to ameliorate the situation is only going to reinforce the idea that you struggle to deliver. And you want to maintain customer trust – it’s 30 times cheaper to keep an existing customer than it is to get a new one. Leading technology companies have begun publicly posting their post-mortems to let their customers know the steps they are putting in place to ensure that the incident won’t happen again. If you do this, we recommend posting within 72 hours to get the information out quickly.
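To make the 72-hour target concrete, here is a small Python sketch that computes the publishing deadline. It assumes the window is measured from the moment the incident was resolved, which the recommendation above does not specify, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# The 72-hour window recommended above; the starting point
# (time of resolution) is an assumption of this sketch.
POSTMORTEM_WINDOW = timedelta(hours=72)


def postmortem_deadline(resolved_at: datetime) -> datetime:
    """Latest time to publish the post-mortem, under the assumption above."""
    return resolved_at + POSTMORTEM_WINDOW


def hours_remaining(resolved_at: datetime, now: datetime) -> float:
    """Hours left before the deadline; negative once it has passed."""
    remaining = postmortem_deadline(resolved_at) - now
    return remaining.total_seconds() / 3600


resolved = datetime(2014, 6, 5, 0, 0, tzinfo=timezone.utc)
print(postmortem_deadline(resolved).isoformat())
```

Using timezone-aware timestamps keeps the arithmetic unambiguous when responders and customers are spread across regions.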
Good customer communication is only part of the story. In our next post in this series, we cover best practices in communication within the incident response team. Still to come: internal stakeholders.