
~ Security Patch Testing in the Enterprise ~


By:  William P. Flinn, Updated Friday, April 06, 2007 06:39 PM
(C) 2007 - Not authorized for reproduction or sale without author's express written permission

 

Do You Have a Method For Testing Patches in the Enterprise?
(Updated March 2007 to include vulnerability scanning strategies)

In light of Microsoft’s release of over thirty patches last summer, and numerous patches already in 2007, I figured it was time to discuss security patch testing methodologies.  There are at least two basic schools of thought about when, how and even why to test new patches when they are released by the vendors.  It all boils down to risk analysis.  You are weighing the risk of being hit with an attack that one of these patches could have prevented against the risk of damage that the patch itself could cause when applied.  After all, the business of business is business.  Either one of those risks could leave your network or individual computers inoperative, keep your customers from doing their work, and adversely affect your business.

 

A Word About General Vendor Patch Testing:

Microsoft claims that it tests its patches before release.  Companies such as PatchLink, which offer automated patching solutions, claim to test patches on 250 different configurations before making them available to their subscription servers.  It is assumed that any vendor, for that matter, does quality assurance testing before releasing its patches.  What you have to keep in mind, however, is that vendors have no way of duplicating your specific environment.  You may be using software created in-house, or other special applications that will behave differently when these patches are applied.  This is where “to test or not to test?” becomes an important question.  Your environment may pose some unique and unanticipated issues when security updates are applied.

  

The Risk of Patches Breaking an Application is Higher:

One school of thought holds that when no current exploit targets the vulnerabilities a new patch fixes, the risk of being attacked is low.  There is always the risk that a new patch will break an application, however, so in-depth testing needs to be performed before releasing the patch to the production environment.  Subscribers to this particular patch testing theory tend to test new patches in stages.  They patch small populations of computers first and monitor them for abnormal behavior.  Any issues that are discovered are then cleared up before the patch is finally released to the production computing environment.

For example, this method of patching may involve patch testing and deployment in three phases: 1) Testing in an isolated lab environment first; 2) Followed by deployment to a pilot group of computers in the production environment; 3) Then deploying the patch to the rest of the production environment.  The entire process can take a few weeks:  one week of lab testing, followed by two weeks of monitoring the pilot group, then final patch deployment.  The final deployment, depending on the size of the organization and patch methods used, can take a few weeks as well.
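To get a feel for how long the organization stays unpatched under the three-phase approach, the timeline can be sketched in a few lines of Python.  This is only an illustration: the one-week lab and two-week pilot figures come from the example above, while the three-week deployment length and the release date are assumptions picked for the arithmetic.

```python
from datetime import date, timedelta

# Lab and pilot lengths come from the example in the text.  The deployment
# length is an assumption; the text only says it "can take a few weeks".
LAB_WEEKS = 1
PILOT_WEEKS = 2
DEPLOY_WEEKS = 3  # hypothetical value


def phase_schedule(release_day: date) -> dict[str, date]:
    """Return the day each phase ends, counting from the patch release."""
    lab_done = release_day + timedelta(weeks=LAB_WEEKS)
    pilot_done = lab_done + timedelta(weeks=PILOT_WEEKS)
    deploy_done = pilot_done + timedelta(weeks=DEPLOY_WEEKS)
    return {"lab": lab_done, "pilot": pilot_done, "production": deploy_done}


release = date(2006, 8, 8)  # hypothetical release date
schedule = phase_schedule(release)

# Days during which at least some production machines remain unpatched.
exposure_days = (schedule["production"] - release).days

for phase, finished in schedule.items():
    print(f"{phase:<10} finishes {finished}")
print(f"exposure window: {exposure_days} days")
```

With these assumed numbers the last production machine is patched six weeks after release, which is the window an attacker has to work with.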

During months such as June, July, and August of 2006, when Microsoft alone released over 30 patches, the testing phases could end up being somewhat prolonged.  The longer it takes to patch all your computers, the higher the risk that a new worm or other exploit will be released and widely spread.

 

The Risk of Being Exploited is Higher:

Another school of thought regarding patch testing holds that there is a much higher risk that the “bad guys” will come up with a widely available exploit very soon.  Even if there isn’t one in the wild right now, there soon will be.  The fear is that two, three, or four weeks of patch testing will put the organization at an unacceptable risk of being attacked.  Statistics also indicate that the instances of a patch actually breaking applications are very low.  Therefore, by this reasoning, the risk of being attacked outweighs the risk of a bad patch.

The method of patch deployment here is to simply deploy the patch to the entire production environment right away, and clean up any issues that are caused after they are discovered.  Depending on the size of the organization, cleaning up just one errant patch may be a very staff-hour intensive endeavor.  If several patches go wrong, it may be a tedious and intensive process to figure out which one actually caused the problem.

In a month when only one or two new patches are released, the risk of breaking applications may be extremely low.  In months like those of the summer of 2006, when Microsoft alone released over 30 patches across the three months, deploying everything untested presents a significantly higher risk.
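The effect of patch volume can be illustrated with a little arithmetic.  The sketch below assumes that each patch has the same small chance of breaking something and that patches fail independently of one another; neither assumption comes from real data, and the 1% failure rate is a made-up figure used only to show the shape of the curve.

```python
def chance_of_any_breakage(patch_count: int, per_patch_failure: float) -> float:
    """Probability that at least one of patch_count patches breaks something.

    Assumes every patch fails independently with the same probability.
    """
    return 1.0 - (1.0 - per_patch_failure) ** patch_count


# 1% per patch is a hypothetical figure for illustration, not a measured rate.
ASSUMED_FAILURE_RATE = 0.01

quiet_month = chance_of_any_breakage(2, ASSUMED_FAILURE_RATE)
busy_summer = chance_of_any_breakage(30, ASSUMED_FAILURE_RATE)

print(f"2 patches:  {quiet_month:.1%}")
print(f"30 patches: {busy_summer:.1%}")
```

Under these assumptions the chance of at least one bad patch goes from roughly 2% for two patches to roughly 26% for thirty, even though no individual patch became any riskier.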

  

Taking a Hybrid Approach:

The two methods above can actually be combined by defining various levels of urgency along with the types of risks involved.  For instance, a patch that is released AFTER a widely available exploit (such as a worm) may indicate a higher level of risk for attack.  In this case, testing periods may be dramatically shortened or even forgone entirely in order to protect the organization as soon as possible.  It might make perfect sense to deploy the specific patch right away and clean up any issues after the fact.  Better to have the possibility of a few machines with issues than to have the entire network brought down by an imminent worm attack.

In instances where patches have been released through normal release cycles from the vendor, and there are no active exploits, regular testing may be performed to reduce the risks of breaking applications.  The risk of attack is relatively low in this case.  This buys the organization time to perform quality testing on the patches, but does not relieve the organization of the need to apply the patch as quickly as possible to reduce potential future risks.

This type of patching strategy still involves due diligence and risk analysis either way, but moves the risk analysis into a third dimension based on how imminent current risks of attack really are.  Patches can then be tested and prioritized based on current exploits and other security factors.
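The decision at the heart of the hybrid approach can be written down as a simple rule.  The sketch below is one possible way to express it, not a prescribed method: the `Patch` class, the track names, and the patch labels are all hypothetical, and the 21-day figure is simply one week of lab testing plus two weeks of pilot monitoring.

```python
from dataclasses import dataclass

# Testing window in days for each track (hypothetical track names).
# 21 days = one week in the lab plus two weeks with the pilot group.
TESTING_DAYS = {"deploy-now": 0, "staged-testing": 21}


@dataclass
class Patch:
    name: str              # hypothetical label for the patch
    exploit_in_wild: bool  # is a worm or other exploit already circulating?


def choose_track(patch: Patch) -> str:
    """Pick a testing track for one patch using the hybrid approach."""
    if patch.exploit_in_wild:
        # Active exploit: forgo testing, deploy, and clean up afterwards.
        return "deploy-now"
    # Normal release cycle with no active exploit: full staged testing.
    return "staged-testing"


for patch in (Patch("patch-A", True), Patch("patch-B", False)):
    track = choose_track(patch)
    print(f"{patch.name}: {track}, {TESTING_DAYS[track]} days of testing")
```

A real organization would likely add more inputs, such as vendor severity ratings or whether the affected software is even installed, but the shape of the rule stays the same.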

  

What Do The Experts Say?

The National Institute of Standards and Technology (NIST), in Special Publication 800-40, recommends that a Patch and Vulnerability Group (PVG) be formed, one of the duties of which is to test patches and other remediations on standard configurations (NIST 800-40, pg. ES-2).  NIST suggests that because such a group would test the patches on standard configurations, local administrators would no longer need to do their own testing.  NIST also mentions incorporating configuration management and change management principles into the patch testing and deployment processes.  This essentially means that the organization would have a central working group tasked with ensuring that patches are tested and deemed safe for the entire environment.

Microsoft, in a number of its documents, advocates the testing of patches (on servers anyway) on a test platform that mimics the production environment.  Microsoft’s stance is that patch management is a continually on-going cycle of determining what patches exist, determining whether each patch applies to your environment, obtaining the patch, testing the patch, and deploying the patch (How To:  Implementing Patch Management, 2003).

According to an article in the Sarbanes-Oxley Compliance Journal, a patch testing program, along with change and configuration management processes, must be in place (Ashley, 2005, Paragraph 20).  Many regulatory bodies seem to be taking the stance that testing of patches is as vital as applying them.  As seen above in the NIST publication, from which the federal government takes its direction, testing strategies are of high importance.

 

Augment Patch Testing with Industry Information: 

One of the best ways you can augment your own patch testing strategy is to find out what others are saying about particular patches.  The Patch Management listserv at http://www.patchmanagement.org is a great way to find out what others are experiencing with a particular patch.  When you join the listserv, you automatically get emails when others post to the forum.  If a patch goes wrong, you will instantly know about it because of the sudden flood of email traffic.  You can also view postings on sites such as PatchLink (http://www.patchlink.com) and Shavlik (http://www.shavlik.com) to find out what is going on with patches and patch testing.

 

Augment Patch Testing with Vulnerability Scanning:

It is quite clear from the events discussed in the two parts of this article that a proactive strategy for patching and scanning is in order.  Such a strategy will ensure that vulnerability scanning is built into the patch testing process so that 1) patches are verified as being applied without adverse effects on the system, and 2) the vulnerabilities that the patch is meant to target are actually being remediated.  Testing the patches as they are received will ensure that they apply properly and do not break applications.  Deploying the patches to a pilot group afterwards will give them more rigorous testing in a real environment, and allow IT staff to clear up any problems quickly before deploying to the full production environment.  Once this is done, a follow-up scan on those same pilot computers will verify whether or not the applied patch mitigated the vulnerability.  If it did, then the desired goal was achieved.  If it did not, then it is time for an investigation to find out whether 1) the patch is not doing its job, or 2) the scanner is alerting on a false positive condition.  This process allows scanner alert anomalies to be discovered as soon as possible, and a fix to be developed before the scanner hits the full production environment.
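The follow-up scan logic described above can be sketched as a small triage function.  This is an illustration under stated assumptions: the vulnerability IDs are made up, and the inputs (the set of vulnerabilities the patch targets, the findings from the re-scan, and whether the patch is confirmed installed) would in practice come from whatever patch management and scanning tools are in use.

```python
def review_rescan(targeted: set[str], rescan_findings: set[str],
                  patch_installed: bool) -> dict[str, str]:
    """Classify each vulnerability the patch targets after the follow-up scan."""
    results = {}
    for vuln_id in sorted(targeted):
        if vuln_id not in rescan_findings:
            # The scanner no longer reports it: desired goal achieved.
            results[vuln_id] = "remediated"
        elif not patch_installed:
            # Still reported because the patch never made it onto the machine.
            results[vuln_id] = "patch not applied"
        else:
            # Patch is installed but the scanner still alerts: either the
            # patch is not doing its job or the scanner has a false positive.
            results[vuln_id] = "investigate"
    return results


# Hypothetical IDs for one pilot machine.
outcome = review_rescan(
    targeted={"VULN-A", "VULN-B"},
    rescan_findings={"VULN-B", "VULN-C"},
    patch_installed=True,
)
print(outcome)
```

Anything that lands in the "investigate" bucket on a pilot machine is exactly what should be resolved before the scanner is turned loose on the full production environment.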

It is important to note that testing patches and developing vulnerability remediations can be tricky in that hidden causes will sometimes not be found right away.  This was evident when scenario 4 as described above brought to light newly discovered problems for a situation that was thought to be previously resolved.  For this reason, it is important to carefully choose those users who will be in the pilot group for the second phase of patch testing.  They should be fairly computer savvy users who know how to properly respond to error messages and how to carefully document any problems that they run into.  This is the group of people who will know that these errors are possibly going to occur, and won’t fly off the handle when they do.  They will know to calmly notify their IT support staff, and won’t panic and click through all the error messages before the IT staff has had a chance to see them and work the issues.  So having said all that, let’s take a look at the chronological steps that would take place in this whole testing and investigative process.


What About Home Computers?
 

Since we are talking about enterprise patching strategies, it is also worthwhile to discuss how home computers affect the enterprise environment.  Employees who telecommute need to be kept up to date with current patches just as do employees in the office.  They often connect to the corporate network through VPNs, bring their laptops into the office once in a while and connect that way, or bring in files on thumb drives, floppy disks, and CDs. In these situations they are possibly presenting an un-patched computer to the corporate network, and it is important that these computers are just as protected as any others. 

Many times, telecommuters are not connected to the enterprise patch management system, or are on poorly performing connections that would make patching through the enterprise system impractical.  For these reasons, simply setting Automatic Updates to Auto/Auto is often an accepted practice.  These users have to be supported as well, so it is important to formulate roll-back and repair strategies for them.  They are not as accessible to IT support structures as regular office employees are, and therefore present a unique challenge when it comes to patching and troubleshooting problems.

If you follow Microsoft’s guidelines, and set your Automatic Updates to Auto/Auto, meaning you will have patches automatically downloaded and installed when they are released, you are pretty much following the “patch now, clean up later” school of thought.  In other words, you are relying strictly on the testing that Microsoft has performed before releasing the patches.  Statistically speaking, all should go well.  But there have been times when a home user went to their computer only to find that it had blue-screened or had other issues caused by a patch.  When this happens, you have several measures available to solve the problem.  Booting to the “Last Known Good Configuration” often restores the computer to a working condition.  System Restore should bring you back to a known good point in time before the patch was applied.  And if all else fails, call Microsoft at 1-866-PCSAFETY, a free hotline Microsoft has set up to address problems caused by patches.

 

A Sample Patch Testing Plan: 

If you have decided that you are going to perform testing before you deploy, then it is a good idea to formulate a checklist of activities.  Even if you don’t test your patches, you need to at least have a roll-back plan in case problems result.  Using a consistent set of steps each time patches are released will help to ensure that you are taking a standard approach, and that all who are affected will develop the same expectations from month to month or patch cycle to patch cycle. 

An example patch testing checklist might be as follows: 

  1. Prepare for upcoming patches
    1. Sign up on patchmanagement.org listserv to keep current on patch management information.
    2. Sign up to receive Microsoft advance notification of security bulletins.
    3. Find out as much as possible about current exploits that affect vulnerabilities for which a patch does not yet exist.
    4. Regularly visit sites such as SANS Internet Storm Center and US-CERT for information about current threats.
  2. Research newly released patches
    1. How many patches are being released?
    2. Are they security, or non-security patches?
    3. What software do they affect?
    4. Do any apply to the current environment?
    5. What is their security criticality?
    6. Are there currently any exploits in the wild addressed by these patches, and how widespread are the exploits?
    7. Is there a rollback strategy: which patches can be uninstalled, and which ones can’t?
  3. Assess the current environment
    1. What systems are at high risk for the vulnerabilities being patched?
    2. What systems will be most adversely affected by patching and reboot schedules?
    3. Create a plan for when the patches can be rolled out; servers and desktops will obviously be done using different scheduling.
    4. What change and configuration management processes need to take place before patches are applied?
  4. Prioritize the patches
    1. Are there any vulnerabilities that need to be patched soonest?
    2. Do any of the patches have dependencies or prerequisites?
  5. Perform an initial vulnerability scan to ensure that lab computers are clean of any vulnerabilities, and remediate as necessary before applying new patches.
  6. Test patches one at a time in the lab
    1. Assess operation of as many applications as possible.
    2. Resolve any issues and/or develop work-around strategies for any system failures caused by patches before moving on.
  7. Perform another vulnerability scan on the lab computers to ensure that the new patches mitigated the vulnerabilities to which they were intended to apply.  Investigate those that did not, and develop a strategy for workarounds or additional mitigations to apply to production machines later.  This is the time to start the investigation to find the fixes and identify false positives.
  8. Test patch deployments to a pilot group in the production environment.
    1. Try to target the more “computer savvy” users who are close at hand, know how to give good feedback, and whose machines can be fixed quickly if they have issues.
    2. Ensure enough time to evaluate the operation of as many applications as possible (3 - 7 days).
    3. Resolve any issues and/or develop work-around strategies for any system failures caused by patches before moving on.
  9. Perform a vulnerability scan against a random sampling of this pilot group.
  10. Determine if any additional remediation procedures are required to clear up vulnerability issues.
  11. Deploy to the production environment.
    1. Make a final assessment of patch operation.
    2. Record any notes related to issues and work-around solutions.
  12. Run a vulnerability scan of the full production environment and forward the results to the IT staff for final remediation.
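Two of the checklist steps lend themselves to a short script: prioritizing the patches and picking a random sampling of the pilot group to scan.  The sketch below is only one way to do it.  The `Patch` fields, the 1-to-4 criticality scale, and the machine names are hypothetical, and dependencies between patches are not modeled.

```python
import random
from dataclasses import dataclass


@dataclass
class Patch:
    name: str              # hypothetical label
    exploit_in_wild: bool  # is an exploit already circulating?
    criticality: int       # hypothetical scale: 1 = low ... 4 = critical


def prioritize(patches: list[Patch]) -> list[Patch]:
    """Order patches so exploited, high-criticality ones come first."""
    return sorted(patches, key=lambda p: (not p.exploit_in_wild, -p.criticality))


def sample_pilot(machines: list[str], count: int, seed: int = 0) -> list[str]:
    """Pick a random sample of pilot machines for the follow-up scan."""
    rng = random.Random(seed)  # seeded so the same sample can be reproduced
    return rng.sample(machines, min(count, len(machines)))


patches = [
    Patch("patch-A", exploit_in_wild=False, criticality=4),
    Patch("patch-B", exploit_in_wild=True, criticality=2),
    Patch("patch-C", exploit_in_wild=False, criticality=1),
]
ordered = [p.name for p in prioritize(patches)]
print(ordered)

pilot_machines = [f"pilot-{n:02d}" for n in range(1, 21)]
to_scan = sample_pilot(pilot_machines, count=5)
print(to_scan)
```

Putting exploit status ahead of criticality reflects the hybrid approach discussed earlier; an organization that weighs those factors differently would simply change the sort key.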

     

 Wrapping It All Up:

Whether you patch or not, whether you test or not, there are risks involved either way.  Whether you are willing to risk being attacked or risk breaking computers, either way you are risking an interruption in business.  Each organization is different and has different needs.  The key is to evaluate those needs, evaluate the level of risk that can be tolerated, and formulate a patching strategy based on that evaluation.  Keep in mind that patching also affects your configuration, and therefore affects configuration and change management.  As the experts strongly suggest, having change and configuration management programs in place will tie into the patch testing and deployment program as well. 

Whatever you do, have some sort of plan in place.  If you are going to test patches before you apply them, then be sure to do a thorough risk assessment, know your environment, and try to anticipate the sense of urgency you will need for getting the final production deployments completed.  If you are not going to test before you deploy, then at least have a roll-back strategy in place in case something goes wrong.

 

