By: William P. Flinn
Updated Friday, April 06, 2007 06:39 PM
(C) 2007 - Not authorized for reproduction or
sale without author's express written permission
Do You Have a Method For Testing Patches in the Enterprise?
(Updated March 2007 to
include vulnerability scanning strategies)
In light of Microsoft’s release of
over thirty patches last summer, and numerous patches
already in 2007, I figured it was time to
discuss security patch testing methodologies. There are at
least two basic schools of thought about when, how and even
why to test new patches when they are released by the
vendors. It all boils down to risk analysis. You are
weighing the risk of being hit with an attack that one of
these patches could have prevented against the risk of
damage that the patch itself could cause when applied.
After all, the business of business is business. Either
risk could leave your network or individual computers
inoperative, keep your customers from doing their work, and
adversely affect your business.
A Word About General Vendor Patch Testing:
Microsoft states that it tests
its patches before release. Companies such as PatchLink,
which offer automated patching solutions, claim to test
patches on 250 different configurations before making them
available to their subscription servers. Presumably every
vendor does quality assurance testing before releasing its
patches. What you have to keep in
mind, however, is that they have no way of duplicating your
specific environment. You may be using some software
created in-house, or other special applications that will
behave differently when these patches are applied. This is
where “to test or not to test?” becomes an important
question. Your environment may pose unique and
unanticipated issues when security updates are applied.
The Risk of Patches Breaking an Application is Higher:
One school of thought regarding
patch testing holds that when no current exploit targets
the vulnerabilities a new patch fixes, the risk of being
attacked is low. The risk that a new patch will break an
application, however, is always present, so in-depth
testing needs to be performed before releasing the patch to
the production environment. Subscribers to this particular
patch testing theory tend to test new patches in stages.
They want to ensure that small populations of computers are
patched and monitored for abnormal behavior. Then any
issues (if discovered) are cleared up, before finally
releasing the patch to the production computing environment.
For example, this method of
patching may involve testing and deployment in three
phases: 1) testing in an isolated lab environment; 2)
deploying to a pilot group of computers in the production
environment; 3) deploying the patch to the rest of the
production environment. The entire process can
take a few weeks: one week of lab testing, followed by two
weeks of monitoring the pilot group, then final patch
deployment. The final deployment, depending on the size of
the organization and patch methods used, can take a few
weeks as well.
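To put rough numbers on that timeline, here is a minimal
Python sketch that lays out the three phases from a patch
release date. The one-week lab and two-week pilot durations
come from the example above; the two-week production
rollout, the function names, and the sample release date are
assumptions for illustration only.

```python
from datetime import date, timedelta


def staged_schedule(release, lab_weeks=1, pilot_weeks=2, rollout_weeks=2):
    """Return a list of (phase, start, end) tuples for a three-phase rollout.

    rollout_weeks=2 is an assumed value; the article only says the final
    deployment "can take a few weeks".
    """
    phases = [
        ("lab testing", lab_weeks),
        ("pilot group", pilot_weeks),
        ("production rollout", rollout_weeks),
    ]
    schedule = []
    start = release
    for name, weeks in phases:
        end = start + timedelta(weeks=weeks)
        schedule.append((name, start, end))
        start = end  # the next phase begins when this one ends
    return schedule


def exposure_days(schedule):
    """Days between the patch release and the end of the last phase."""
    return (schedule[-1][2] - schedule[0][1]).days


if __name__ == "__main__":
    plan = staged_schedule(date(2006, 8, 8))  # illustrative release date
    for phase, start, end in plan:
        print(f"{phase}: {start} to {end}")
    print("days until fully patched:", exposure_days(plan))
```

Under these assumptions the last computer is patched 35 days
after release, and that is the window an attacker has to
work with.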
During months such as June, July,
and August of 2006, when Microsoft alone released over 30
patches, the testing phases could end up being somewhat
prolonged. The longer it takes to patch all your computers,
the higher the risk that a new worm or other exploit will be
released and widely spread.
The Risk of Being Exploited is Higher:
Another school of thought
regarding patch testing holds that there is a much higher
risk that the “bad guys” will come up with a widely
available exploit very soon. Even if there isn’t one in the
wild right now, there soon will be. The fear is that two,
three, or four weeks of patch testing will put the
organization at an unacceptable risk of being attacked.
Statistics also indicate that patches rarely break
applications. By this reasoning, the risk of being attacked
is much higher than the risk of a bad patch.
The method of patch deployment
here is to simply deploy the patch to the entire production
environment right away, and clean up any issues as they are
discovered. Depending on the size of the organization,
cleaning up after just one errant patch may consume many
staff hours. If several patches go wrong, figuring out
which one actually caused the problem may be a tedious and
intensive process.
In a month where one or two new
patches are released, the risk of breaking applications may
be extremely low. In months like those of the summer of
2006, when Microsoft alone released over 30 patches, the
risk is significantly higher.
Taking a Hybrid Approach:
The two methods above can actually
be combined by defining levels of urgency along with the
types of risks involved. For instance, a patch that is
released after a widely available exploit (such as a worm)
has appeared may indicate a higher risk of attack. In this
case, testing periods may be dramatically shortened or even
forgone entirely in order to protect the organization as
soon as possible. It might make perfect sense to deploy the
specific patch right away and clean up any issues after the
fact. Better to have a few machines with issues than to
have the entire network brought down by an imminent worm
attack.
In instances where patches have
been released through normal release cycles from the vendor,
and there are no active exploits, regular testing may be
performed to reduce the risks of breaking applications. The
risk of attack is relatively low in this case. This buys
the organization time to perform quality testing on the
patches, but does not relieve the organization of the need
to apply the patch as quickly as possible to reduce
potential future risks.
This type of patching strategy
still involves due diligence and risk analysis either way,
but moves the risk analysis into a third dimension based on
how imminent current risks of attack really are. Patches
can then be tested and prioritized based on current exploits
and other security factors.
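The hybrid approach amounts to a small decision rule. The
sketch below is one way to express it in Python. The Patch
fields and the track names ("expedite", "standard", "skip")
are hypothetical labels chosen for this example, not part of
any vendor tool.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    name: str                            # hypothetical identifier
    exploit_in_wild: bool                # is a worm or other exploit already circulating?
    applies_to_environment: bool = True  # does the patch affect software we run?


def choose_track(patch):
    """Pick a handling track for one patch (assumed three-way split)."""
    if not patch.applies_to_environment:
        return "skip"       # nothing to patch, nothing to test
    if patch.exploit_in_wild:
        return "expedite"   # shorten or forgo testing, deploy right away
    return "standard"       # normal lab and pilot testing


def deployment_order(patches):
    """Names of applicable patches, expedited ones first."""
    rank = {"expedite": 0, "standard": 1}
    tracked = [(choose_track(p), p) for p in patches]
    kept = [(track, p) for track, p in tracked if track in rank]
    # sorted() is stable, so the original order is kept within each track
    return [p.name for track, p in sorted(kept, key=lambda tp: rank[tp[0]])]
```

A real policy would likely weigh more factors, such as vendor
severity ratings or how exposed the affected systems are, but
the shape of the decision stays the same.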
What Do The Experts Say?
The National Institute of
Standards and Technology (NIST), in Special Publication
800-40, recommends that a Patch and Vulnerability Group (PVG)
be formed, one of the duties of which is to test patches and
other remediations on standard configurations (NIST 800-40,
pg. ES-2). NIST suggests that this centralized testing would
eliminate the need for testing by local administrators.
NIST also mentions incorporating configuration management
and change management principles into the patch testing and
deployment processes. This essentially means that the
organization would have a central working group tasked with
ensuring that patches are tested and deemed safe for the
entire environment.
Microsoft, in a number of its
documents, advocates testing patches (on servers, at least)
on a test platform that mimics the production environment.
Microsoft’s stance is that patch management is a continual
cycle of determining what patches exist, determining
whether each patch applies to your environment, obtaining
the patch, testing the patch, and deploying the patch
(How To: Implement Patch Management, 2003).
According to an article in the
Sarbanes-Oxley Compliance Journal, a patch testing program,
along with change and configuration management processes,
must be in place (Ashley, 2005, Paragraph 20). Many
regulatory bodies seem to be taking the stance that testing
patches is as vital as applying them. As seen above in the
NIST publication, from which the federal government takes
its direction, testing strategies are of high importance.
Augment Patch Testing with Industry Information:
One of the best ways you can
augment your own patch testing strategy is to find out what
others are saying about particular patches. The Patch
Management listserv at
http://www.patchmanagement.org is a great way to find
out what others are experiencing with a particular patch.
When you join the listserv, you automatically get emails
when others post to the forum. If a patch goes wrong, you
will instantly know about it because of the sudden flood of
email traffic. You can also view postings on sites such as
PatchLink (http://www.patchlink.com)
and Shavlik (http://www.shavlik.com)
to find out what is going on with patches and patch
testing.
Augment Patch Testing with Vulnerability Scanning:
It is quite clear from
the events discussed in the two parts of this article that a
proactive strategy for patching and scanning is in order.
Such a strategy builds vulnerability scanning into the
patch testing process so that 1) patches are verified as
applied and as having no adverse effects on the system, and
2) the vulnerabilities that each patch is meant to target
are actually remediated. Testing the patches as they are
received will ensure that they apply properly and do not
break applications. A follow-up deployment to a pilot group
then gives the patches more rigorous testing in a real
environment, and allows IT staff to clear up any problems
quickly before deploying to the full production
environment. Once this is done, a follow-up scan of those
same pilot computers will verify whether the applied patch
mitigated the vulnerability. If it did, the desired goal
was achieved. If it did not, it is time to investigate
whether 1) the patch is not doing its job, or 2) the
scanner is alerting on a false positive. This process
allows scanner alert anomalies to be discovered as early as
possible, and a fix to be developed before the scanner hits
the full production environment.
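The follow-up scan logic described above can be sketched in
a few lines of Python. This is an illustration only: the
outcome labels and the shape of the input data are
assumptions, and a real scanner or patch tool would report
its results in its own format.

```python
from collections import Counter


def triage(patch_installed, still_flagged):
    """Classify one pilot computer after the follow-up scan.

    patch_installed: did the patch tool report the patch as applied?
    still_flagged:   does the scanner still report the vulnerability?
    """
    if not patch_installed:
        return "redeploy"      # the patch never made it onto the machine
    if still_flagged:
        # Either the patch is not doing its job or the scanner is
        # alerting on a false positive; a person has to find out which.
        return "investigate"
    return "remediated"        # desired goal achieved


def summarize(pilot_results):
    """Count outcomes; pilot_results maps host -> (patch_installed, still_flagged)."""
    return Counter(triage(*result) for result in pilot_results.values())


if __name__ == "__main__":
    # Hypothetical pilot data for three computers.
    results = {
        "pc-01": (True, False),
        "pc-02": (True, True),
        "pc-03": (False, True),
    }
    print(summarize(results))
```

Any count under "investigate" is the cue to start working
out the patch-versus-scanner question before the scanner is
pointed at the full production environment.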
It is important to
note that testing patches and developing vulnerability
remediations can be tricky in that hidden causes will
sometimes not be found right away. This was evident when
scenario 4 as described above brought to light newly
discovered problems for a situation that was thought to be
previously resolved. For this reason, it is important to
carefully choose the users who will be in the pilot group
for the second phase of patch testing. They should be
fairly computer-savvy users who know how to respond
properly to error messages and how to document carefully
any problems they run into. These are the people who will
know that errors may occur, and won’t fly off the handle
when they do. They will know to calmly notify their IT
support staff, and won’t panic and click through all the
error messages before the IT staff has had a chance to see
them and work the
issues. So having said all that, let’s take a look at the
chronological steps that would take place in this whole
testing and investigative process.
What About Home Computers?
Since we are talking about
enterprise patching strategies, it is also worthwhile to
discuss how home computers affect the enterprise
environment. Telecommuters’ computers need to be kept
up to date with current patches just as office computers
do. Telecommuters often connect to the corporate network
through VPNs, bring their laptops into the office once in a
while and connect that way, or bring in files on thumb
drives, floppy disks, and CDs. In these situations they are
possibly presenting an un-patched computer to the corporate
network, and it is important that these computers are just
as protected as any others.
Many times, telecommuters are not
connected to the enterprise patch management system, or are
on poorly performing connections that would make patching
through the enterprise system impractical. For these
reasons, simply setting Automatic Updates to Auto/Auto is
often an accepted practice. These users have to be
supported as well, so it is important to formulate
roll-back and repair strategies for them. They are not as
accessible to IT support structures as regular office
employees are, and therefore present a unique challenge when
it comes to patching and troubleshooting problems.
If you follow Microsoft’s
guidelines, and set your Automatic Updates to Auto/Auto,
meaning you will have patches automatically downloaded and
installed when they are released, you are pretty much
following the “patch now, clean up later” school of
thought. In other words, you are relying strictly on the
testing that Microsoft has performed before they released
the patches. Statistically speaking, all should go well.
But there have been times when a home user went to their
computer only to find that it had blue-screened or had other
issues caused by a patch. When this happens, you have two
measures available to solve the problem. Booting to the
“Last Known Good Configuration” often restores the computer
to a working condition. System Restore should bring you
back to a known good point in time before the patch was
applied. And if all else fails, call Microsoft at
1-866-PCSAFETY, a free hotline Microsoft has set up to
address problems caused by patches.
A Sample Patch Testing Plan:
If you have decided that you are
going to perform testing before you deploy, then it is a
good idea to formulate a checklist of activities. Even if
you don’t test your patches, you need to at least have a
roll-back plan in case problems result. Using a consistent
set of steps each time patches are released will help to
ensure that you are taking a standard approach, and that all
who are affected will develop the same expectations from
month to month or patch cycle to patch cycle.
An example patch testing checklist might be as follows:
- Prepare for upcoming patches
  - Sign up on patchmanagement.org listserv to keep
    current on patch management information.
  - Sign up to receive Microsoft advance notification of
    security bulletins.
  - Find out as much as possible about current exploits
    that affect vulnerabilities for which a patch does not
    yet exist.
  - Regularly visit sites such as SANS Internet Storm
    Center and US-CERT for information about current
    threats.
- Research newly released patches
  - How many patches are being released?
  - Are they security or non-security patches?
  - What software do they affect?
  - Do any apply to the current environment?
  - What is their security criticality?
  - Are there currently any exploits in the wild addressed
    by these patches, and how widespread are the exploits?
  - Is there a rollback strategy: which patches can be
    uninstalled, and which ones can’t?
- Assess the current environment
  - What systems are at high risk for the vulnerabilities
    being patched?
  - What systems will be most adversely affected by
    patching and reboot schedules?
  - Create a plan for when the patches can be rolled out;
    servers and desktops will obviously follow different
    schedules.
  - What change and configuration management processes
    need to take place before patches are applied?
- Prioritize the patches
  - Are there any vulnerabilities that need to be patched
    first?
  - Do any of the patches have dependencies or
    prerequisites?
  - Perform an initial vulnerability scan to ensure that
    lab computers are clean of any vulnerabilities;
    remediate as necessary before applying new patches.
- Test patches one at a time in the lab
  - Assess operation of as many applications as possible.
  - Resolve any issues and/or develop work-around
    strategies for any system failures caused by patches
    before moving on.
  - Perform another vulnerability scan on the lab
    computers to ensure that the new patches mitigated the
    vulnerabilities they were intended to address.
    Investigate those that did not, and develop a strategy
    for workarounds or additional mitigations to apply to
    production machines later. This is the time to start
    the investigation to find the fixes and identify false
    positives.
- Test patch deployments to a pilot group in the
  production environment.
  - Try to target the more “computer savvy” users who are
    close at hand, know how to give good feedback, and
    whose computers can be fixed quickly if they have
    issues.
  - Ensure enough time to evaluate the operation of as
    many applications as possible (3 - 7 days).
  - Resolve any issues and/or develop work-around
    strategies for any system failures caused by patches
    before moving on.
  - Perform a vulnerability scan against a random sampling
    of this pilot group.
  - Determine if any additional remediation procedures are
    required to clear up vulnerability issues.
- Deploy to the production environment.
  - Make a final assessment of patch operation.
  - Record any notes related to issues and work-around
    solutions.
  - Perform a vulnerability scan of the full production
    environment and forward the results to the IT staff
    for final remediation.
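The checklist step that scans a random sampling of the pilot
group could be sketched as follows in Python. The function
name, the 25 percent sampling fraction, and the host names
are all assumptions made for the example; pick a fraction
that suits the size of your own pilot group.

```python
import random


def pick_scan_sample(pilot_hosts, fraction=0.25, seed=None):
    """Choose a random subset of pilot computers to scan.

    fraction is an assumed default; at least one host is always
    returned when the pilot group is not empty. Passing a seed
    makes the choice repeatable, which helps when documenting
    which machines were scanned.
    """
    if not pilot_hosts:
        return []
    rng = random.Random(seed)
    count = max(1, round(len(pilot_hosts) * fraction))
    # random.sample picks without replacement, so no host appears twice
    return sorted(rng.sample(list(pilot_hosts), count))


if __name__ == "__main__":
    hosts = [f"pilot-pc-{n:02d}" for n in range(1, 21)]  # hypothetical names
    for host in pick_scan_sample(hosts, seed=2007):
        print("scan", host)
```

Sampling keeps the pilot scan quick, while the later scan of
the full production environment covers every machine.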
Wrapping It All Up:
Whether you patch or not, whether
you test or not, there are risks involved either way.
Whether you are willing to risk being attacked or risk
breaking computers, either way you are risking an
interruption in business. Each organization is different
and has different needs. The key is to evaluate those
needs, evaluate the level of risk that can be tolerated, and
formulate a patching strategy based on that evaluation.
Keep in mind that patching also affects your configuration,
and therefore affects configuration and change management.
As the experts strongly suggest, change and
configuration management programs should tie into the
patch testing and deployment program as well.
Whatever you do, have some sort of
plan in place. If you are going to test patches before you
apply them, then be sure to do a thorough risk assessment,
know your environment, and try to anticipate the sense of
urgency you will need for getting the final production
deployments completed. If you are not going to test before
you deploy, then at least have a roll-back strategy in place
in case something goes wrong.
Related Information:
- Ashley, M., 2005, “VISA PCI Best Practices For All
  Organizations,” Sarbanes-Oxley Compliance Journal
- Microsoft: Streamlining Patch Management: The Checklist
- Microsoft, 2003, “How To: Implement Patch Management”
- NIST Special Publication 800-40 Version 2 (2005),
  “Creating a Patch and Vulnerability Management Program”
- PatchLink: Discussion Forum
- PatchManagement.Org: Patch Management Discussion Listserv
- Posey, B.M., 2005, “Step-by-Step Guide: Patch management
  must-do list”
- SearchWindowsSecurity.Com: Checklist: Measuring patch
  management metrics
- SearchWindowsSecurity.Com: Patch Management Best
  Practices White Paper
- Shavlik: Patch Management