Top 4 Questions to Ask Your Outsourced Infrastructure Provider about Uptime and Availability

Posted February 12th, 2008 by Jonah Paransky

Infrastructure outsourcing is becoming a more popular option for companies of all sizes, from 100-person organizations to 100,000-employee behemoths. The infrastructure that e-mail, websites, CRM systems, and even financial processing systems run on is being outsourced to third-party infrastructure providers at a furious rate. These providers often bring scale, expertise and guarantees that promise a clear ROI to the business and IT organizations that make these outsourcing decisions.

However, when looking to outsource infrastructure management, companies must understand the implications for availability and uptime and weigh those implications as part of the decision process. Don’t forget that your tolerance for downtime will change depending on the end-to-end IT service you will be offering on the infrastructure. Also, don’t think that Web 2.0 startups don’t have to worry about these issues; Web 2.0 services often demand more availability, not less.

At the same time, beware of “over-engineering”. Joel Spolsky makes the important point that you have to find the right balance between availability and cost for your organization’s needs.

Question 1: What kinds of guarantees are provided by the infrastructure outsourcer around availability and uptime?

When uptime guarantees come into play, the devil is in the details. Downtime issues have been known to destroy vendor relationships and cause significant public pain. There are even current lawsuits over the use of “Always On” by service providers who then have unplanned downtime.

Several areas to be concerned about include:

  • What is the provider guaranteeing? Is the guaranteed uptime a measure of the connection of their facility to the internet? Of a specific infrastructure component? Or of the complete end-to-end availability of your IT service? How is the guarantee measured? Don’t expect miracles: if you can’t figure out a good way to measure the end-to-end uptime of your IT service, they probably can’t either.
  • Will the vendor report downtime to you – or are you forced to report downtime to the vendor to receive any deserved credits?
  • You may agree on 99.9% uptime, but is it measured daily? Monthly? On a 3-month sliding scale? Annually? This can be the difference between an outage lasting about 1.4 minutes and an outage lasting nearly 8.8 hours being considered within your contracted downtime allowance. As Pingdom points out, there are many hosting companies out there with poor or odd uptime guarantees.
  • What about planned downtime? If your application needs to be up and running 99.9% of the time, and the vendor allows for an 8-hour maintenance window each week that doesn’t count toward your downtime statistics, you are not going to be happy with the results.
  • Do the penalties associated with the guarantees offer any real value? On the flip side, some argue that there are more effective vendor control models than service level agreement (SLA) penalties.
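To make the measurement-window and planned-downtime points concrete, here is a minimal Python sketch of the arithmetic. The function names and the window lengths (a 30-day month, a 91-day quarter, a 365-day year) are assumptions chosen for illustration, not anything a particular provider uses; your contract will define its own periods.

```python
# Illustrative sketch only: names and window lengths are assumptions.
HOURS_PER_WINDOW = {
    "daily": 24,
    "monthly": 30 * 24,     # assumes a 30-day month
    "quarterly": 91 * 24,   # assumes a 91-day quarter
    "annually": 365 * 24,   # assumes a 365-day year
}


def allowed_downtime_minutes(uptime_pct, window):
    """Minutes of downtime that still satisfy uptime_pct over one window."""
    total_minutes = HOURS_PER_WINDOW[window] * 60
    return total_minutes * (1 - uptime_pct / 100)


def availability_with_maintenance(maintenance_hours_per_week):
    """Best-case availability (%) if the full weekly window is used
    and there is no unplanned downtime at all."""
    hours_per_week = 7 * 24
    return 100 * (1 - maintenance_hours_per_week / hours_per_week)


if __name__ == "__main__":
    for window in HOURS_PER_WINDOW:
        minutes = allowed_downtime_minutes(99.9, window)
        print(f"99.9% measured {window}: {minutes:.1f} minutes allowed")
    best_case = availability_with_maintenance(8)
    print(f"8-hour weekly window, best case: {best_case:.1f}% available")
```

Under these assumptions, the same 99.9% figure permits about a minute and a half of downtime when measured daily and a little under nine hours when measured annually. An excluded 8-hour weekly maintenance window, if fully used, caps real availability at roughly 95%, no matter what the SLA statistics say.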

Question 2: How does the infrastructure provider architect their infrastructure for availability?

Explore in detail the baseline architecture being used to support your end-to-end IT service. Key areas to explore include:

  • Are there single points of failure in the infrastructure design? At the network and hardware level, what levels of redundancy exist?
  • Is the infrastructure distributed across multiple locations?
  • What is the redundancy built into the physical plant? Generators, power, connectivity, and location all factor into the likely actual availability of your infrastructure.
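One rough way to reason about single points of failure and redundancy is the standard availability arithmetic sketched below in Python. It assumes component failures are independent, which real facilities rarely satisfy fully (a shared power feed or a shared location breaks that assumption), and the function names and sample figures are made up for illustration.

```python
from math import prod


def series_availability(availabilities):
    """Every component must be up, so the availabilities multiply.
    Each single point of failure in the chain drags the total down."""
    return prod(availabilities)


def parallel_availability(availability, copies):
    """At least one of several redundant copies must be up.
    Assumes the copies fail independently of each other."""
    return 1 - (1 - availability) ** copies


if __name__ == "__main__":
    # Hypothetical figures: network, power, and server, each 99.9% available.
    chain = series_availability([0.999, 0.999, 0.999])
    print(f"Three 99.9% components in series: {chain:.4%}")

    # Hypothetical figure: a 99% component duplicated once.
    pair = parallel_availability(0.99, 2)
    print(f"Two redundant 99% components: {pair:.4%}")
```

The takeaway is that a chain of individually reliable components is less available than any one of them, while redundancy only pays off to the extent that the redundant pieces really do fail independently. That is why the location and physical plant questions matter as much as the hardware ones.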

Question 3: How does the vendor plan for Disaster Recovery and Business Continuity issues?

It is an IT reality: data centers fail. Even with multiple generators, battery backup and multiple inbound connections, failures do happen. The important point is that the outsourcer has taken this into account, by performing regular disaster recovery and business continuity testing. Questions to ask include:

Question 4: How does the outsourcer test the impact of changes prior to release into production?

Changes are the leading cause of downtime. It is critically important to understand how your proposed outsourcer introduces change into the infrastructure environment. Issues to investigate include:

Filed Under: Business Continuity, Downtime, IT Operations


