Introduction

Many networks were severely affected by the recent computer
worm/virus outbreaks. If you were afflicted, you have my
sympathy. Now that most staff have their lives back, it may be
appropriate to do "lessons learned" before the next wave hits. I'm
planning on doing a couple of articles on this topic. The topic rather
naturally falls into Before and After categories.
Before is Best Practices and things to do before you have a
problem. After is the things you can do to facilitate repairs and
mitigate nasty side-effects after an attack. I plan to cover this topic
backwards, with the After part first, since it may be helpful to some,
and there seem to be more new ideas floating around there. As for the
Before part, I'm a bit hesitant to claim sufficient expert (ego?) status
to be putting out my own list of Best Practices. However, I do have some
thoughts I haven't seen elsewhere, I've got lists of things from various
sites, and I can certainly provide a set of links to other compendiums
of Best Practices that you may find useful.

This article is going to be biased a little bit towards the
network side. That's because the network side is where I feel most of
my expertise lies. I'm pretty darn good on UNIX/Linux, and can hold my
own with Windows -- but I cannot claim to be current with what
system administrators are doing on the security front. The other
justification for any network bias: too often viruses are seen as a
systems problem, but there is a valuable role for the network team in
helping fight the virus attacks as well.
New Factors

Sometimes events cause changes in how we think about things. (I'm very carefully not saying "paradigm shift".) The recent viruses were qualitatively a bit different from their predecessors in several ways:
Rate

Concerning Rate, viruses are spreading faster and faster. This was discussed in the article How to 0wn the Internet in Your Spare Time, which can be found at the URL http://www.icir.org/vern/papers/cdc-usenix-sec02/. (Love that catchy title. Wish the predictions had taken longer to arrive.)

What we're seeing now is that the viruses hit so fast and hard that many networks were effectively down, sites couldn't download patches or fixes, etc. The other impact here is that perhaps the virus scanner signature file mechanism is getting overtaken by the bad guys. If all your computers are infected before the vendors have signature updates, or before your hourly/daily refresh of host signature files, you've caught the virus. The partial good news is that the vulnerabilities were known, so those who had kept up on patches weren't as badly hit. There are always home users and others who get missed in patching, which is why automation and tracking of who's got what patches is so crucial. (Note to self: stop preaching to the choir!)

Cisco seems to be in the right place at the right time with the Okena acquisition. Their Host Intrusion Prevention System (HIPS), Cisco Security Agent (CSA), doesn't use signatures and supposedly would have blocked these attacks. See also Cisco Security Agent, http://www.cisco.com/en/US/products/sw/secursw/ps5057/index.html. I like the idea of attempting to protect against the unknown, but there are obvious limits to how far that can effectively be done.

Rate leads to another thought, one that I've now heard from
several sources: Is there a way to slow or contain the spread of the
virus? The analogy is perhaps SARS and face masks.

Cisco PVLANs (Private VLANs) in
switches can help keep one host from infecting another. They have the
virtue of being unlikely to disrupt services, if implemented with
reasonable care.

Another idea is what I'm calling network lockdown. In network lockdown, switch VACLs (VLAN Access Lists) allow end-user hosts to talk only to the server subnets (not too hard if you have server farms). You do need to be careful with this approach: what about networked printers or print servers? What about other services some staff may need, such as local file servers? If printers and services are all local, then do ACLs (Access Lists) controlling inter-VLAN traffic help? I've noticed recently (see the previous DSNIFF article) that smaller VLANs or subnets help mitigate the impact of Layer 2 attacks, such as MAC flooding or ARP spoofing. I'm not sure any of the above are complete answers, but they may work in your environment.

What I've just said can also be thought of as at least having
some internal firewalling, be it PIX or ACL or whatever. You may wish
to put some extra protection in front of the server farm even. For
quite a while we've been doing the hard exterior / soft interior
firewalling model. The security experts have been saying that's not
enough. One way to think about this: do you have to wear a badge
in the building, or is showing it to the guard at the front door
enough? Do you have security on just airport passengers, or do you also
have some internal controls on airport employees?

Scale

Scale comes in because so many hosts were infected. Sites needed
Universities were being hit with the virus outbreak just as
students were returning to campus. Some have been building toolsets
that lend themselves to helping with this situation. Eric Gauthier was
kind enough to type up a draft document and tell the NANOG list about
it. It can be found at http://www.roxanne.org/~eric/blaster.html.
He references a University of Connecticut writeup, which can be found
at http://www.security.uconn.edu/uconn_response.html.
Solving a slightly different problem, University of Florida
apparently has a traffic level scanning program called Icarus that does
scans for peer file sharing programs like Kazaa, and turns down student
links if usage is detected. See also http://www.mae.ufl.edu/sysinfo/uf_takes_action.htm. I'm mentioning this since enterprises may not be aware of this
approach. The enterprise equivalent is software that
verifies that the user's virus scanner and/or personal firewall is running
before allowing connection to the network. Zone Integrity and McAfee
e-Policy Orchestrator go part-way here. Right now, vendors (including
Cisco) are more focused on good hygiene before connecting via IPSec
VPN. Reading about Zone Integrity, I see it has features
resembling the network lockdown I mentioned above. Is that
something network devices should be doing, or is it something you want
your personal firewall software doing on each and every computer?

As for auto-detection of viruses, see below. There are some nice network-centric techniques that you may not have thought of. If you can figure out who has the virus (without slow scanning), then the systems folks can be more effective at virus removal.

Concerning automated cleanup, that's pretty firmly on the systems side of things. People seem to use some combination of Perl scripts, Web CGI scripts, Windows SMS, virus vendor tools, etc.

Impact

The recent worms spread so fast and to so many systems that they became, in effect, a Distributed Denial of Service (DDoS) attack. Since external DDoS is also a concern these days, it's good to know how to mitigate a DDoS attack. This is somewhat related to the detection of infected hosts, so we'll go into this in more detail below as well.

Other

Staffing Levels

I've been noticing that staffing levels are a problem. If you're barely keeping your head above water, you don't have time for network or performance management, let alone reading IDS logs, etc. In which case, you're to some extent driving blind. An analogy comes to mind: how long will your car work without the engine oil gauge? So why do you have one? If all your time is going into scrambling to keep up with new sites, swapping out gear, and resolving implementation issues, then how are you supposed to find time to use all that network management and security software? Yet that's what can alert you to problems as they begin to hit.

Lean times and budget crunch may explain some of this. I am
beginning to see that as network devices proliferate, and as
responsibilities and complexity grow, perhaps staffing has not kept
pace. I've seen this phenomenon elsewhere: sometimes somebody notices
and fixes this, sometimes people get stressed or the network gets more
and more fragile until something breaks (or somebody quits).

I'm also hearing some folks say "we know we should have done X, but we
haven't had time". Restoring your business operations after getting hit
by a hacker, virus, or DDoS attack can certainly also consume a good bit
of time. And you don't get the chance to schedule those
activities!

Total System Approach

Another thing I've noticed is that it may be time for systems folks and network folks to work more closely together, especially in shops where that's not normally the case. That's perhaps one conclusion to be drawn from the technical part of this article: the network devices (and design) can detect and mitigate ill effects and help control the spread of viruses. Systems administrators can use anti-viral scanners and so on, and are needed to fix the infected computers. It definitely works better if you tackle the problem from both sides. Yes, you can have personal firewall policies to control spread -- but it might be easier, less of a performance impact, and more effective to do policy enforcement on network devices. Doing so does not require the presence of cooperating or controlled agent software on each and every PC.

Technical Ideas

There were two forward references above, concerning how to auto-detect viruses, and how to control their spread. I propose to provide a higher level of detail on these techniques, which means this discussion will have to extend into next month's article, for space reasons.

As mentioned above, it is tempting to do some form of network
lockdown. You have to be careful about cutting
users off from resources they need. The attraction of this approach,
where applicable, is that the access lists may be fairly simple
and they can be put in place in advance or at the first sign of virus
infection. So to mitigate Blaster, Cisco is recommending an inbound ACL
such as the following. We've added some entries recommended by other
sites, such as Microsoft.

access-list 101 deny udp any any eq 135
access-list 101 deny tcp any any eq 135
access-list 101 deny udp any any eq 137
access-list 101 deny tcp any any eq 137
access-list 101 deny udp any any eq 138
access-list 101 deny tcp any any eq 138
access-list 101 deny udp any any eq 139
access-list 101 deny tcp any any eq 139
access-list 101 deny udp any any eq 445
access-list 101 deny tcp any any eq 445
access-list 101 deny tcp any any eq 593
access-list 101 permit ip any any

This controls access to Microsoft ports 135 and 137 through 139, and to
another port used by Microsoft SMB, port 445. Microsoft is
recommending blocking port 593 as well.

You can then use Class Based policing to drop all such
packets. Create a QoS class:

class-map match-all dcom-rpc
 match access-group 101

Note that a class map matches the packets its access list permits, so access list 101 would need permit rather than deny entries for these ports (and no final permit ip any any) when used this way.

Then create a policy:

policy-map drop-dcom-rpc

Apply this inbound to the desired interface:

interface fast 0/0

You will also want some outbound filtering on your firewall or edge router. The worm downloads malware via TFTP. Do your users really need to be able to use TFTP to servers on the Internet? I don't think that's a good idea! Hence Cisco recommends adding to your rules something like:

access-list 102 deny udp any any eq 69

Now if you're being diligent, you accumulate lists of
troublesome ports like this from various worm or virus attacks and add
to your ACL. Since
many new attacks recycle old code, this reduces any impact of such new
variants. Weeks after Blaster, our customers report they are still
seeing scans on Microsoft ports.

Links

The following are good links about the systems side of various recent worm/virus attacks. They might be useful if you're looking to write rulesets such as the above. I realize you've probably found your way to some of these by now, but it's helpful to know where you can find information and what's likely to be out there for the next time around.

I have another good set of links for Cisco TAC
recommendations, etc., but I'm saving those for the next article.
Next month we'll continue the technical discussion of how to detect and
control worm/virus attacks. I'll include Cisco links in that article.
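
As a postscript to the Technical Ideas section: if you do accumulate lists of troublesome ports from successive worms, it may help to keep them in one table and generate the access list from it, rather than hand-editing the ACL each time. The following is only a minimal sketch in Python, not anything Cisco or Microsoft supplies. The table name BLOCKED_PORTS and the function build_acl are hypothetical names invented for this illustration; the ports themselves are the ones in the Blaster ACL above, and the short descriptions are the usual names for those services.

```python
# Hypothetical port table: (protocol, port, description).
# The entries mirror the inbound Blaster ACL shown in this article;
# the idea is to append to this list as new worms appear.
BLOCKED_PORTS = [
    ("udp", 135, "RPC endpoint mapper"),
    ("tcp", 135, "RPC endpoint mapper"),
    ("udp", 137, "NetBIOS name service"),
    ("tcp", 137, "NetBIOS name service"),
    ("udp", 138, "NetBIOS datagram service"),
    ("tcp", 138, "NetBIOS datagram service"),
    ("udp", 139, "NetBIOS session service"),
    ("tcp", 139, "NetBIOS session service"),
    ("udp", 445, "SMB"),
    ("tcp", 445, "SMB"),
    ("tcp", 593, "RPC over HTTP"),
]


def build_acl(acl_number, blocked):
    """Return extended ACL lines that deny each (protocol, port) pair once,
    in table order, followed by a final permit for everything else."""
    lines = []
    seen = set()
    for proto, port, _description in blocked:
        key = (proto, port)
        if key in seen:
            # The same port may be reported for several worms; emit it once.
            continue
        seen.add(key)
        lines.append(f"access-list {acl_number} deny {proto} any any eq {port}")
    lines.append(f"access-list {acl_number} permit ip any any")
    return lines


if __name__ == "__main__":
    # Print the ACL so it can be reviewed before being pasted into a router.
    print("\n".join(build_acl(101, BLOCKED_PORTS)))
```

Run as a script, this prints the same twelve lines as the inbound ACL above. Keeping the description next to each port is a design choice: it records why an entry is there, which is handy when someone later asks whether a rule can be removed.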