RMON2 and NetScout
Sunday, 01 February 1998 21:00
It has been a while since we’ve had a network management article. I’ve worked with the NetScout software and probes at a couple of sites, and have been reading about them. (I’ve been using this column as an excuse to read up on new technology -- or is it the other way around?) Anyway, that's partly why this month’s article is about the NetScout products.
The Cisco connection here is that Cisco resells the NetScout products. The "mini-RMON" or "RMON version 1" embedded in Cisco routers and switches is a port of the NetScout probe code. So you already have NetScout agents in place!
This also seems like a timely topic. Switched networks are certainly proliferating. There is increasing need to manage them and see just where the traffic is going, what it is, and how much of it there is. RMON2 is the natural way to do this. HP NetMetrix and NetScout probes apparently have large market share (with Bay and 3Com also embedding RMON agents in their product lines). And by using RMON2, any vendor’s probes can (allegedly) be managed by another’s software.
There's another reason it is timely:
Mentor Technologies has just become NetScout’s Training Partner, and is planning on offering the NetScout training classes, starting sometime around when you read this article (see our Web page for times and places).
We plan to teach the course initially at NetScout’s Chelmsford, Mass. facility, and then to add more training locations. I personally won’t be doing the training. I wish I could, but I can't spend 100% of my time doing network management. Another Mentor Technologies CCIE, David Yarashus, may well be our initial instructor for this. (And thanks for the revisions in this article, David!) David has extensive network management experience, including products such as NetScout, Netsys, and the Concord reporting tool.
Design Aspects of RMON2
I wrote an article about basic RMON about a year ago. See my RMON article, where I talk about basic RMON a bit and then enable it the hard way. (The easy way: use the NetScout software).
Let's briefly review. RMON is an SNMP MIB typically implemented in devices called agents. The purpose is to collect direct information about all frames on a LAN segment. Although there is no WAN RMON standard, NetScout and other vendors have proprietary RMON-like approaches to monitoring WAN traffic.
RMON is necessary because routers do not normally see all packets on LAN segments. The counters in the MIB-II ifTable only count traffic received by the router, either addressed to the router or transit traffic. Many vendors and tools indiscriminately calculate utilization percentage for WAN and LAN. Not only should different formulas be used (since WAN is full duplex), but there should be a warning on the displayed LAN utilization. The warning should say something like "Caution: this represents transit and router traffic, but there may also be unseen local traffic on the wire." There are darn few network management tools that actually document how they calculate utilization, and fewer that mention this caution for new network managers. (Mentor Technologies has seen some network managers telling users they don't have a LAN traffic problem, based on displays on their central management console. Then they finally lug the Network General Sniffer over, and find a large discrepancy!)
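To make the "different formulas" point concrete, here is a minimal Python sketch of how utilization might be computed from two polls of the MIB-II ifInOctets and ifOutOctets counters. The function names and sample numbers are my own illustrative assumptions, not any vendor's documented method. The idea is that on shared (half-duplex) LAN media both directions compete for the same bandwidth, so the octets are summed, while on a full-duplex WAN link each direction has the full line rate, so the busier direction is reported.

```python
COUNTER32_MODULUS = 2 ** 32  # ifInOctets / ifOutOctets are 32-bit counters


def counter_delta(earlier, later):
    """Difference between two Counter32 samples, allowing for one wrap."""
    return (later - earlier) % COUNTER32_MODULUS


def half_duplex_utilization(in_octets, out_octets, seconds, speed_bps):
    """Percent utilization on shared media: both directions share the wire."""
    bits = (in_octets + out_octets) * 8
    return 100.0 * bits / (seconds * speed_bps)


def full_duplex_utilization(in_octets, out_octets, seconds, speed_bps):
    """Percent utilization on a full-duplex link: use the busier direction."""
    bits = max(in_octets, out_octets) * 8
    return 100.0 * bits / (seconds * speed_bps)


# Hypothetical counter samples taken 300 seconds apart.
d_in = counter_delta(4_294_000_000, 10_032_704)  # counter wrapped between polls
d_out = counter_delta(1_000_000, 20_000_000)

lan_pct = half_duplex_utilization(d_in, d_out, 300, 10_000_000)  # 10 Mbps Ethernet
wan_pct = full_duplex_utilization(d_in, d_out, 300, 1_544_000)   # T1 line rate
print(f"LAN: {lan_pct:.1f}%  WAN: {wan_pct:.1f}%")
```

Either way, the LAN figure only covers what the router itself sent or received. Local traffic between other stations on the segment never shows up in these counters, which is exactly the caution above.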
RMON version 1 was MAC layer only. RMON version 2 takes us up to the application layer, so we can track hosts and conversations by application. Unfortunately, RMON2 can also require a good bit of CPU. Cisco routers have included partial RMON version 1 ("mini-RMON") since IOS 11.1(6) or so. This includes the statistics, short- and long-term history, and alarm and event capabilities. Cisco switches have also included these groups, although the implementation of the alarm and event groups apparently had some bugs in Catalyst releases 2.2 and 2.3.
The reason Cisco has done this is to allow us to collect crude statistics on Ethernet segments, especially for low-level diagnosis situations such as traffic bursts, excessive CRC errors, or broadcast storms. Probes do cost money, and there are a very large number of LAN segments in a typical company!
More costly RMON2 probes should be concentrated on backbone links. Of course, these links are typically high speed, calling for a more powerful and more costly RMON2 probe. But such probes are then in a position to see almost all traffic of interest. The same applies to WAN or Frame Relay probes in star topology networks. They see it all! With some backbones now being switch VLAN trunk links, it helps to have a probe that understands the VLAN tags. The NetScout probes understand the Cisco ISL encapsulation.
Having a small number of probes in key positions not only lowers purchase cost, but makes it easier to manage the network. You don't have to talk to a large number of devices to see what's going on.
If you have the money and time, you can also put more probes on switch SPAN ports, to save visiting the switch to install a probe when trouble arises. Obviously, key segments such as the server farm segments (VLAN's) should come first.
By the way, Netsys is capable of collecting data from NetScout probes (also Bay hubs). It can load HP NetMetrix probe data you've collected into a directory. The RMON2 data is particularly valuable in that Netsys can then analyze traffic flows seen by that backbone probe. It can report which WAN and other links are most heavily utilized by the observed traffic.
NetScout Overview
NetScout Systems used to be called Frontier. They changed their name because there is a Frontier Software and a Frontier telephony company – too much name confusion. NetScout manufactures RMON2 hardware probes. They also sell RMON2 software to extract and report data from the probes.
(Homework for the reader: hit the Web and find out how many companies named Frontier there are).
NetScout Probes
I’ve been trying to keep an eye on the RMON / RMON2 market for a while, on and off. NetScout seems to have a small edge as far as time-to-market on high-speed probes.
NetScout sells (as OEM) to both DEC and Cabletron as well as Cisco. They seem to have kept their advertising budget relatively small. The reason I’m mentioning this is that I’ve twice now thought they didn’t have a product in a certain niche (e.g. Frame Relay / WAN probe), only to find out that they in fact do have such a product.
Quick overview of the NetScout probe line:
According to the spec sheets, the NetScout probe products use a high performance real time operating system. The hardware is typically an Intel 80x86 with AT/PCI bus. The software is referred to as an RMON agent. (For more information and some hardcore RMON, see my earlier article).
Resource Monitor is agent software and licensing that allows the NetScout RMON probes to act as SNMP proxies. That means they can perform ping and SNMP polling on segments. This is leveraged by using the RMON thresholding capability to generate traps when errors arise. It saves bandwidth compared to centralized polling: one may poll to ensure connectivity to the probe but then assume the probe will poll and report other problems.
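For readers who haven't worked with RMON thresholds, here is a simplified Python sketch of the rising/falling threshold behavior of the RMON alarm group. The class name and sample values are hypothetical, and this is not NetScout's code. It leaves out the startup-alarm options and delta sampling. The point is the hysteresis: once a rising event fires, another one is not generated until the value has dropped to the falling threshold, so a probe sends a trap per excursion rather than per sample.

```python
class ThresholdAlarm:
    """Simplified model of an RMON-style alarm with rising/falling thresholds."""

    def __init__(self, rising, falling):
        if falling > rising:
            raise ValueError("falling threshold must not exceed rising threshold")
        self.rising = rising
        self.falling = falling
        self.last_event = None  # which event fired most recently, if any

    def sample(self, value):
        """Return 'rising', 'falling', or None for one sampled value."""
        if value >= self.rising and self.last_event != "rising":
            self.last_event = "rising"
            return "rising"
        if value <= self.falling and self.last_event != "falling":
            self.last_event = "falling"
            return "falling"
        return None  # inside the hysteresis band, or event already reported


# Hypothetical samples, e.g. CRC errors per polling interval on a segment.
alarm = ThresholdAlarm(rising=100, falling=20)
samples = [50, 120, 150, 90, 130, 15, 5, 110]
events = [alarm.sample(v) for v in samples]
print(events)
```

In this run only three of the eight samples produce an event, which is where the bandwidth saving over centralized polling comes from.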
All the NetScout probes include out-of-band SLIP support. (Lack of PPP and TFTP capabilities was a bit of a surprise at first. But these are not Cisco IOS-based products!)
One bit of advice based on real-world experience. When using the console port to set up a NetScout probe, make sure it is not connected to the network. We saw some unexpected behavior when we ignored this warning (like failure to retain settings when net configuring a probe that booted off the net via BOOTP). Other than that, the menus are very simple and easy to drive.
NetScout Software
The primary software product is named NetScout Manager (NSM). It runs on UNIX (Sun Solaris, HP HPUX, IBM AIX), and on Microsoft Windows 95 / Windows NT. It includes an integrated SQL database. It also includes a large number of graphical and reporting tools.
The Traffic Monitor program within NSM displays aggregate traffic information about LAN or WAN segments. Hot-links then allow drill-down to network hosts, top users, conversations. You can also view short- and long-term historical reports, level of broadcast activity, and link utilization percentage. You can also group information from several agents or probes.
The NSM Protocol Monitor program shows similar information with finer detail, down to the level of application traffic (with RMON2 probes). It lets you view protocol or application mix.
NSM also allows simple setup of traffic capture with filters on the probes. The Protocol Decode program then displays the data. There are certainly limits to RMON packet capture, unless you have bandwidth to burn in transporting copies of packets back to your management station. One also might wish for less garish colors in the packet decode window!
I can't give a comparison to output from a Network General Sniffer, not having spent considerable time with the two side-by-side. (NSM does have an RMON to Network General file converter, so you can use probes to collect packets and Sniffer software to analyze them.)
Another plus for NSM is that you can use the same software package to troubleshoot LAN and WAN links, whereas with most of the other WAN probe vendors, proprietary software is needed. That's pretty handy: the common interface covers all the media types, and you don't have to upgrade the manager software when adding a probe for a new media type.
Manager includes the Trend Reporter tool. This produces reports automatically.
NetScout documentation refers to "domains", which is a NetScout-specific term. Domains are just protocol levels, such as IP, or TCP and UDP, or applications like WWW, SMTP, FTP, etc. You have to activate domains to collect data on them in the RMON agent. NetScout allows you to create custom domains, to isolate or fine-tune what the probe looks at. A custom domain might be something like a specific TCP port. Once you've set this up, you can monitor it and generate reports on it. Many domains can be active at the same time.
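As a rough illustration of the domain idea (not of how NetScout actually implements it), the Python sketch below treats each domain as a named filter and counts octets only for the domains that have been activated. The domain table, the packet summary fields, and the custom port number 1521 are all assumptions made up for this example.

```python
from collections import Counter


def uses_tcp_port(packet, port):
    """True if the packet is TCP and either endpoint uses the given port."""
    return (packet.get("transport") == "tcp"
            and port in (packet.get("src_port"), packet.get("dst_port")))


# Hypothetical domain definitions: name -> filter over a packet summary.
DOMAINS = {
    "IP": lambda p: p["network"] == "ip",
    "TCP": lambda p: p.get("transport") == "tcp",
    "UDP": lambda p: p.get("transport") == "udp",
    "WWW": lambda p: uses_tcp_port(p, 80),
    "CUSTOM-DB": lambda p: uses_tcp_port(p, 1521),  # custom domain: one TCP port
}


def count_octets(packets, active_domains):
    """Total octets per active domain; a packet may count in several domains."""
    totals = Counter()
    for packet in packets:
        for name in active_domains:
            if DOMAINS[name](packet):
                totals[name] += packet["octets"]
    return totals


packets = [
    {"network": "ip", "transport": "tcp", "src_port": 40000, "dst_port": 80, "octets": 1500},
    {"network": "ip", "transport": "tcp", "src_port": 40001, "dst_port": 1521, "octets": 400},
    {"network": "ip", "transport": "udp", "src_port": 53, "dst_port": 40002, "octets": 120},
    {"network": "ipx", "octets": 300},
]
totals = count_octets(packets, ["IP", "WWW", "CUSTOM-DB"])
print(dict(totals))
```

Note that TCP is defined but not activated here, so nothing is collected for it, and that the first packet is counted under both IP and WWW.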
Cisco routers and switches do "RMON" domain only, meaning RMON version 1 MAC layer information. Activating RMON ("rmon native" or "rmon promiscuous") on the +RMON IOS builds for routers allows you to fully activate RMON domain collection. The former command counts RMON information for packets received by or transiting the router. The latter puts the router interface into promiscuous mode and counts everything on the wire. Be aware this will use some CPU. But without it you're only looking at utilization and protocol mix for traffic to / through the router, which is available via other tools.
Each switched port in a Catalyst switch is a mini-RMON agent. Be aware there is a 48 port module which has switched groups of 12 ports -- logically it looks like 4 switched ports connected to hublets. You used to have to set up a NetScout agent for each switched port. This required repeatedly entering the ifIndex of the various ports (going up by 12 for the switched blocks of 12 ports). Now, if you give the NetScout software the address of the supervisor card, it discovers the agent ports (by sweeping through ifIndex values). The NetScout Manager software then treats these ports as a group of agents.
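The "going up by 12" bookkeeping for the 48 port module looks something like the short Python sketch below. The starting ifIndex of 49 is purely hypothetical, since the real values depend on the switch and on which slot the module is in, and the function name is mine.

```python
def group_if_indexes(first_if_index, ports=48, group_size=12):
    """ifIndex of the first port in each switched group on one module."""
    return list(range(first_if_index, first_if_index + ports, group_size))


# Hypothetical module whose first port has ifIndex 49.
agent_indexes = group_if_indexes(49)
print(agent_indexes)
```

That gives four agents for the module, one per switched group of 12 ports, which is the list you used to have to type in by hand.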
NSM does similarly for Frame Relay, in that you can set up agents for up to 256 PVC's, automatically for all or manually for specified DLCI's. And if you have more than 256 PVC's on one link, you have a real design problem anyway!
This is reasonably slick, although that can be a large collection of agents to have to collect data from. That's a roundabout way of saying, it can get slow, so you'll want a fast machine running NSM if you have several Catalysts with lots of ports. Luckily, Sun's new Ultras are cheap ($4000) and very fast.
Traffic Director / NetScout version 4.1.3 had some limitations. At one site, it did just about everything we wanted it to on a single switch or probe. But we wanted to automate data collection from all agents: 96 ports on each of 30+ Catalyst 5000's, plus one probe per switch. We had an Ultra 2 with 1 GB RAM, 16 GB of disk and an ATM backbone, so performance and bandwidth were not likely to be issues. It seemed a lot simpler to squirrel away data than to try to trigger data collection when an RMON or other event occurred. The version of software we were using limited us to 400 agents/domains of data collection. (I.e. each domain in each agent is one collection, and there's an apparent maximum of 400 collections). It looks like the latest version (5.0) may not have this limitation, but we have not been in a position to test it.
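A quick back-of-the-envelope calculation shows how far over the limit we were. The Python sketch below uses the figures from the paragraph above, taking 30 switches as a round number and assuming only a single domain per agent, which is the most favorable case. The function name is just for illustration.

```python
MAX_COLLECTIONS = 400  # apparent limit in the version we were running


def collections_needed(switches, ports_per_switch, probes_per_switch, domains_per_agent):
    """One collection per active domain per agent."""
    agents = switches * (ports_per_switch + probes_per_switch)
    return agents * domains_per_agent


needed = collections_needed(switches=30, ports_per_switch=96,
                            probes_per_switch=1, domains_per_agent=1)
print(needed, "collections needed,",
      round(needed / MAX_COLLECTIONS, 1), "times the limit")
```

Even at one domain per agent, that is 2910 collections against a ceiling of 400, and activating more domains on the RMON2 probes only makes it worse.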
NetScout Manager Plus adds Frame Relay and LAN Switch support. The LAN Switch support now automatically discovers agents on Cisco Catalyst switches. It also supports "steering" to copy selected ports to the SPAN port with attached NetScout probe. Steering is neat! You tell NSM which port is the SPAN port. Then, if the data you're asking for isn't available from the embedded mini-RMON, NSM will automatically steer traffic to the SPAN port (using an SNMP set) to get it for you.
The CiscoWorks for Switched Networks product includes Traffic Director, which is a Cisco version of the NSM Plus product (with a little version lag, but Cisco support).
NetScout also sells Expert Visualizer and WebCast. Visualizer is a 3-D diagnostics tool.
WebCast uses forms to define Web-accessible reports (views into the database). WebCast pre-calculates reports to speed up viewing, with automatic daily, weekly, or monthly updates. It can also create reports on demand.
The marketing claim I like best is that WebCast requires no SQL experience. All too often, I’ve seen SQL as a real barrier to network staff getting reports out of SNMP polling tools that store data into a SQL database.
WebCast has security features, so you can control who sees what data. Too many recent network management products, especially Web-based ones, overlook security.
But Web-based access is also inexpensive. You need only buy the console software to conduct polling and store data in the database. Instead of one copy of the software per network management staffer, perhaps with overlapping polling and network traffic, you only need the one central copy performing polling. One copy of WebCast then prepares HTML reports for all to view, and provides the Web server giving access. That is:
Links
NetScout appears to be re-organizing their Web page as I write this article. So I'll just give the obvious main URL: