No More "Patch and Pray"

Posted January 31st, 2008 by Joe Pendry

In previous posts, we have talked about how changes impact downtime. IT Operations teams are caught up in the pull between maintaining the agility of their systems and ensuring the availability of those same systems. As the folks at DEMO observed while we were at the show:

IT organizations face a Hobson’s choice: be adaptive to changing needs of their user base or ensure that the enterprise computing environment is free from disruption. Change the software environment and put the business critical applications at risk. Opt for stability over flexibility and end user productivity may be lost.

Unfortunately, as we have observed, 27% of the time organizations simply put a change into the live environment without any testing to confirm that it works as intended. This approach is known as “patch and pray,” and it is a scary thought that it is used so often.

In fact, we recently spoke with a number of IT Operations professionals who discussed this problem at great length. Here is a particularly sobering quote we recorded from an unnamed health care CIO who is struggling with it:

“We have a core technology team that does a lot of testing at corporate, but there are so many sites in the hospital…some of them have other applications… they may not get tested.”

Not sure I want to have my electronic medical records sent to those hospitals right before I need knee surgery.

So, a number of changes are reaching production, and a number of them are being implemented using the patch and pray approach. But why?

First of all, many organizations over-rely on their software development teams because these teams usually have a strong history of testing. Unlike IT Operations, software development organizations invest significantly in processes and tools for the software development life cycle. This testing, however, tends to focus on releases of new applications and changes to existing applications. Software development teams do not typically touch software infrastructure changes, a responsibility that usually falls to IT Operations groups.

Secondly, IT Operations often lacks the resources to build and maintain representative testing environments. These environments are usually very expensive, especially for end-to-end, multi-tier environments running distributed software stacks (database tiers, Web tiers and application tiers), large numbers of servers and complex, interdependent software components. And even if they are built, they can quickly become out of date: “under the radar” production changes create a mismatch between the staging environment and the actual production environment.
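
To make the staging-versus-production mismatch concrete, here is a minimal sketch of how a team might spot that kind of drift by comparing component inventories from the two environments. Everything in it is an assumption for illustration: the `find_drift` helper, the component names and the version numbers are hypothetical, and a real inventory would come from whatever configuration or asset tooling an organization already uses.

```python
from typing import Dict, Optional, Tuple

# Hypothetical inventory type: component name -> installed version.
Inventory = Dict[str, str]


def find_drift(
    staging: Inventory, production: Inventory
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return components whose versions differ between the two environments.

    Each value is a (staging_version, production_version) pair; None means
    the component is absent from that environment.
    """
    drift = {}
    # Walk the union of component names so that components present in only
    # one environment are reported too.
    for name in sorted(staging.keys() | production.keys()):
        staging_version = staging.get(name)
        production_version = production.get(name)
        if staging_version != production_version:
            drift[name] = (staging_version, production_version)
    return drift


# Illustrative data only: an "under the radar" production change has bumped
# the web server and added an agent that staging never received.
staging = {"web-server": "2.2.6", "app-server": "5.1", "database": "10.2"}
production = {
    "web-server": "2.2.8",
    "app-server": "5.1",
    "database": "10.2",
    "monitoring-agent": "1.0",
}

for name, (s, p) in find_drift(staging, production).items():
    print(f"{name}: staging={s or 'missing'} production={p or 'missing'}")
```

With the sample data above, the report flags the web server version mismatch and the monitoring agent that exists only in production. A check like this does not replace testing; it only shows how far a test environment has wandered from the thing it is supposed to represent.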

The result? As Stephen Elliot, Research Director for IDC, has said:

“More than 80 percent of business-critical service disruptions can be attributed to poor change control processes, including flawed change impact assessment.”

So we are back to the “patch and pray” problem. The urgency of certain changes, combined with the cost of testing, often forces IT Operations teams to bypass testing altogether and deploy changes directly into the production software infrastructure. Unfortunately, doing so increases the risk that a change will result in production downtime.

The patch and pray approach is well past its time in an environment where businesses demand 100% uptime. In an earlier post, Jonah Paransky wondered why anyone would ask if six days of downtime is too much. My question is, when will we see the end of patch and pray?


Filed Under: Downtime, Testing

