Seven Key Lessons to Keep in Mind When Communicating an IT Failure
Posted March 4th, 2008 by Jonah Paransky
IT Failures Happen.
While business and customers often expect 100% uptime, they don’t always receive it.
End-to-End IT Services Fail. Critical Applications Fail. Data centers that we rely upon lose power due to a traffic accident. A system upgrade takes out an airport baggage handling system. A back-end infrastructure provider that many rely upon for outsourced infrastructure experiences a major outage after years of stellar uptime records.
When failures happen, the quality of the communication process often does not match the technical resources brought to bear to solve the problem.
Here are seven key lessons to keep in mind when communicating an IT failure.
- Have a communication plan in place and ready to go
All IT service delivery teams should maintain an active business continuity or disaster recovery plan. The time to develop your communications plan is not during the outage or service failure, when the entire organization is focused on dealing with the problem at hand. Any credible continuity planning process should include the development of canned messages and a communications playbook that can be used during the failure. Does yours? If not, start the process of updating it as soon as possible.
- Direct Communication with your customers is the number one concern
Customers would much rather hear about a problem directly from you than from the media or by discovering the problem themselves. If you are experiencing a major service failure, let your customer base know ASAP.
Important note: Don’t maintain your only outbound customer communication list in the same infrastructure as the IT service in question. Be prepared with a redundant third-party email provider and other communication options.
- Be prepared to communicate over multiple channels
Don’t assume all your customers will look for updates in the same place. Employ multiple channels, including email, support sites, corporate blogs and even the corporate website to get the word out about what is occurring with the outage.
- Overcommunicating is better than undercommunicating
Nature abhors a vacuum. If you don’t provide information about the nature of the failure and what you are doing about it, your customers, bloggers and the media will fill it in for you. In general, customers appreciate detailed, transparent information. Make sure to include:
- The nature of the failure or outage. Include in communications when the problem began, who is affected by the problem and what the impact of the problem is on users and customers.
- Current path to resolution. Explain what the path to resolution is currently perceived to be. This may change, but let the user community know what is going on.
- Regular status updates. An update saying that work continues on the problem is often received better than radio silence.
- Contact Points. Customers need to know who to contact with questions and concerns.
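The four elements above lend themselves to a canned message template that is prepared before any outage. Here is a minimal sketch in Python of what such a template might look like. The class name `OutageUpdate`, its fields, and the example values are all hypothetical, chosen only for illustration; they do not come from any particular tool or playbook.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class OutageUpdate:
    """Hypothetical canned status update covering the four elements above."""
    service: str               # which service is failing
    began_at: datetime         # when the problem began
    affected: str              # who is affected
    impact: str                # what users and customers will notice
    resolution_path: str       # current path to resolution (may change)
    next_update_minutes: int   # promise of a regular status update
    contact: str               # contact point for questions and concerns

    def render(self) -> str:
        """Return the update as plain text, one element per line."""
        return "\n".join([
            f"Service issue: {self.service}",
            f"Began: {self.began_at:%Y-%m-%d %H:%M} UTC",
            f"Who is affected: {self.affected}",
            f"Impact: {self.impact}",
            f"Current path to resolution (may change): {self.resolution_path}",
            f"Next update: within {self.next_update_minutes} minutes",
            f"Questions: {self.contact}",
        ])


if __name__ == "__main__":
    # Example values are invented for illustration only.
    update = OutageUpdate(
        service="Example Mail",
        began_at=datetime(2008, 3, 4, 9, 30),
        affected="Customers hosted in the example east region",
        impact="Inbound mail is delayed; no mail has been lost",
        resolution_path="Restoring power to the affected racks",
        next_update_minutes=30,
        contact="support@example.com",
    )
    print(update.render())
```

Because the text is plain, the same rendered message can be reused across email, a support site, and a blog post, so every channel carries identical information.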
- Expect the failure to become public
If your service is widely used or is back-end infrastructure that supports other providers, assume your outage will become public. Just ask Twitter, Microsoft, Amazon, RackSpace or Research In Motion. Be prepared to provide status updates and information to a wider group than just your customer base, or else that communication will be done by others who may not have a full picture of what is going on.
- Humor probably isn’t the right call
Your customers rely upon your IT service for an important business or personal function. While many organizations today build a corporate brand based on an informal, humorous style, that style is often perceived poorly by customers when something goes wrong. When Dreamhost implemented an update to their billing system and overbilled a large number of their customers, they communicated the problem through their trademark humorous approach. Customer reaction to that approach was resoundingly negative.
- Don’t underestimate the communication necessary after the failure is resolved
Often, resolution of the outage or failure is only the beginning of the communication process. Trust needs to be rebuilt. Service fees may have to be waived. Payouts due to contractual penalties may be required. Make sure to have the resources necessary for the post-outage customer satisfaction cleanup, since it is often much more expensive to try to get new customers than to keep the ones you already have.
Filed Under: Business Continuity, Downtime, IT Operations
March 5th, 2008 at 4:50 pm
This is a brilliant post, Jonah. I think too many times we’re stuck without a word, and that creates bad relationships, and in the long run, harms the application/network’s viability. Thanks for sharing these ideas.
April 10th, 2008 at 6:45 pm
[...] Paransky, VP of Marketing for software infrastructure testing vendor, StackSafe, blogged 7 tips for handling post-failure [...]
April 15th, 2008 at 5:01 pm
[...] away from the GrandCentral outage. The first, as we have discussed earlier in a previous blog post, Seven Key Lessons to Keep in Mind When Communicating an IT Failure, is that effective communication is critical to minimizing the impact that downtime can have on the [...]