|QoS in the Campus|
|Friday, 29 June 2001 21:00|
Introduction

In the previous months' articles, we talked about changes in Cisco certifications. I'd be interested in hearing what you think about all this. Are you interested in CCNP? CISS? CCIP? Which ones look most interesting and useful?
I've had a request to write about QoS in the Campus, something I gave short shrift to in the earlier QoS articles. That's the topic for this month's article. Prior QoS articles can be found at http://www.netcraftsmen.net/welcher/papers . The most recent QoS article is at http://www.netcraftsmen.net/welcher/papers/newqos121.html .
Why Do QoS in the Campus?

I'd always mentally tuned out Campus QoS. After all, can't you just "throw bandwidth at the problem"? Bandwidth in the campus is relatively inexpensive. Yes, you can have queues fill up in switches. But even with full queues and queuing delays, how much latency are you going to pick up? Finally, the words "Head of Line blocking" started seeming a lot more important to me.
The biggest reason for QoS in campus switches, as I see it, is that bursts can temporarily fill up hardware queues. Yes, this delay is extremely short-lived. But while the queue is full, packets are getting dropped. If those are voice packets, that's ungood! Thus, campus QoS is about protecting some traffic from drops by getting it into different queues. The number of queues depends on the hardware.
There is another plus to doing QoS in campus switches. Right now, you can do packet marking (setting IP Precedence or DSCP bits) at the WAN edge router. As the world moves to optically based higher-speed networks, where you have a FastEthernet or Gigabit Ethernet link to a MAN IP Service Provider, doing the marking there might become a choke point. It is arguably better to do it at the LAN edge, to try to ensure consistent treatment of voice, video, or mission-critical traffic. This leads to what I think of as the "insurance" argument for QoS. Namely:
Campus switches and high-speed LAN segments enable unicast or multicast of video and other multimedia services. People also start doing interesting and surprising things when you give them the bandwidth. Things like using Windows Explorer to drag and drop a disk drive to a server disk, as a means of backup. (Hey, how do you back up your home computer? Office desktops?) Do you want mission critical database traffic, or VoIP traffic, adversely affected because someone did something like this, something you didn't expect? Do you know every use that's being made of your network? Every file transfer people are doing? If you're not convinced, think about the impact of things like Doom and then Napster on networks.
Using QoS allows you to ensure bandwidth goes to the important things first, and that the rest of the traffic coexists on the link. It means you can go ahead and use all that bandwidth, since your important traffic will be protected by the QoS.
Principles of Campus QoS

Traffic in a campus switch is either going from a port (workstation or phone) to the network, or in the other direction. Some Cisco documents refer to this as user-to-fabric (Rx queue) and fabric-to-user (Tx queue) traffic. The design expectation is that a non-blocking switch architecture, together with good design, means you won't need much queuing or buffering going from users and servers to the network. Generally this is the case, but if you're not monitoring traffic regularly, can you be sure it is the case? My concern here is oversubscription. Say you have 48 100 Mbps ports going into a switch, and one Gigabit Ethernet (GbE) trunk out to the higher-level switches (assuming a hierarchical design). If those 100 Mbps ports are all running at 30% utilization, then you've got about 1.4 Gbps of traffic (48 x 100 Mbps x 0.30 = 1.44 Gbps) and only a 1 Gbps trunk. Admittedly, getting 30 Mbps of sustained traffic on 48 ports might take some doing in most shops to date.
The bigger concern is traffic going the other way. One situation of concern is many-to-one congestion, say where a number of users burst in demands to a server. The other is speed mismatch, where a burst for a user arrives on a GbE trunk and has to be sent out a FastEthernet port. This is where you definitely expect to need queuing and buffering.
The next question is, what might we expect the switch to do about it? In previous QoS articles, we've seen the basic QoS operations: classification, marking, queuing and scheduling, policing and shaping, and congestion avoidance.
Generally, inbound ports have little intelligence: they either recognize or ignore CoS bits. The CoS bits are generally copied to an internal header so the information is at least available within the switch. If the outbound port is a trunk, the CoS bits are generally put into the appropriate frame header field.
Layer 3 switches can deal with various protocols. For IP traffic, the IP Precedence (0-7) or DiffServ (DSCP, 0-63) interpretation of the ToS field in the IP header is another way of marking, discussed in previous articles. Layer 3 switches can take CoS bits and use those to set the IP Precedence or DSCP bits, or vice versa, generally by means of a mapping table. For non-IP traffic, all we've got is the CoS bits to work with. Layer 3 switches will generally set CoS or other bits on the outbound port or interface. They may also set the IP Precedence or DSCP bits.
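As an illustrative sketch of such a mapping table (CatOS syntax as on a Catalyst 6000; exact commands vary by platform and software version, so check the documentation for your hardware):

```
# Enable QoS processing on the switch
set qos enable
# Map CoS values 0-7 to DSCP values, one DSCP per CoS value;
# here CoS 5 (voice) maps to DSCP 40 (Precedence 5)
set qos cos-dscp-map 0 8 16 24 32 40 48 56
```

The reverse (DSCP-to-CoS) table can be adjusted similarly, so a marking applied at Layer 3 survives onto 802.1Q trunks as CoS.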
Switches with higher level QoS just use upper protocol stack layers to classify in a more precise fashion.
Another key concept is that of trust. Does the switch trust or believe any CoS or IP Precedence or DSCP bits sent to it? Note that if the PC NIC driver is 802.1Q capable, both the CoS and IP Precedence bits can be set by an end user's PC. Do you want your savvy users upping their IP Precedence to 5, the same as Cisco VoIP traffic? There are rumors that under Win2000, a well-known browser that experiences congestion sets IP Precedence to 5. This is another policy issue you're going to have to decide.
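Trust is typically a per-port setting. As a sketch (again CatOS-style syntax; module/port numbers are made up, and the keywords vary with switch model and code version):

```
# Don't believe markings arriving on ordinary user ports;
# the switch rewrites CoS to the port's configured default
set port qos 3/1-48 trust untrusted
# Trust CoS arriving on the uplink trunk from another switch
set port qos 1/1 trust trust-cos
```

The general policy: trust markings only at points you control (uplinks, phones), and re-mark everything else.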
The Cisco IP telephones also play a role in all this. The newer phones are actually 3-port switches. Enhanced CDP and switch configuration allow us to clue the phone in on the auxiliary VLAN (voice VLAN). The phones can then send VoIP traffic with 802.1Q headers indicating that VLAN, along with CoS bits set to 5, and IP Precedence 5 within the IP header. When the phones pass through traffic from a connected PC, the PC traffic can be converted to 802.1Q, labelled with the native VLAN for the port, and can have a specified CoS written to the frame. If you so choose.
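For illustration, on an IOS-based access switch the phone port configuration might look something like the following (VLAN 110 is a made-up voice VLAN number, and command availability depends on switch model and software):

```
interface FastEthernet0/1
 ! Tell the phone (via CDP) to tag its VoIP traffic with the
 ! auxiliary (voice) VLAN
 switchport voice vlan 110
 ! Have the phone rewrite the attached PC's CoS to 0, so user
 ! traffic can't masquerade as voice
 switchport priority extend cos 0
```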
If you follow this logic, switches become a very natural place to be doing QoS. Where else are you going to be able to readily distinguish your PC traffic from your VoIP traffic? Yes, routers can use Access Lists to distinguish the two, particularly if your IP phones get addresses out of a different address block. But by classifying and marking the VoIP with CoS bits, you can guarantee the VoIP traffic expedited handling through the campus LAN.
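On a router, that Access List approach might be sketched with the modular QoS CLI, assuming (hypothetically) the phones draw addresses from 10.1.110.0/24:

```
! Match VoIP traffic by source subnet (hypothetical phone address block)
access-list 101 permit ip 10.1.110.0 0.0.0.255 any
!
class-map match-all VOIP
 match access-group 101
!
policy-map MARK-VOIP
 class VOIP
  set ip precedence 5
!
interface FastEthernet0/0
 service-policy input MARK-VOIP
```

This works, but it happens one router hop in; the switch can do the equivalent right at the port where the phone plugs in.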
Queues are the other thing to think about with switches. What we'd like to do is use CoS bits to put traffic into different queues. Many of the newer Cisco switching blades now support at least two transmit queues. When there are two queues, one might have priority, or a form of weighted round robin (WRR) might be used. (Custom Queuing is a form of WRR, if you're familiar with it.) So if you're shopping for Cisco switch hardware QoS support, look at receive but especially transmit queues and how they behave.
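As an example of the WRR idea (CatOS syntax for 2q2t-type ports, i.e. two queues with two thresholds each; the weights are invented for illustration):

```
# Give queue 2 roughly 70% of the transmit opportunities and
# queue 1 the remaining 30%
set qos wrr 2q2t 30 70
# Send CoS 5 (voice) frames to queue 2, threshold 1
set qos map 2q2t tx 2 1 cos 5
```

The CoS-to-queue mapping is the piece that ties the marking discussion above to the hardware queues.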
The other function Cisco has been building into switch queues is queue management. This might be full Weighted Random Early Detection (WRED), described in previous articles. In WRED, the CoS or Precedence bits get used to differentially and randomly start dropping packets or frames as the queues begin to fill up. The other approach being used in Cisco switches is multiple thresholds for tail drop. As I understand it, under this scheme, when each threshold is reached, all frames below a certain priority level will be dropped. Details do vary with switch model and blade.
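A sketch of the multiple-threshold tail drop idea (CatOS 2q2t syntax again; the percentages are invented for illustration):

```
# When queue 1 reaches 40% full, start tail-dropping frames
# mapped to threshold 1 (lower-priority CoS values); at 100%
# full, drop frames mapped to threshold 2 as well
set qos drop-threshold 2q2t tx queue 1 40 100
```

The effect: as the queue fills, low-priority traffic gets dropped first, leaving room for frames mapped to the higher threshold.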
I've built the following table from various Cisco information sources. Any errors are my fault. The information may be out of date (and will certainly become out of date over time), so treat it with caution. Everything I'm seeing suggests there may be more queues coming, and with more features.
Other L3 switches: look under http://www.cisco.com/univercd/cc/td/doc/product/l3sw/index.htm
Of these, only the 5000 and 6000 series QoS documentation will take much time to read.
Configuring QoS in Switches

Out of space! This looks like it'll have to wait for another article. It is also very switch- and even blade-specific.
QoS Policy Manager is of course one way of applying QoS policy consistently across a group of Cisco devices.
Copyright (C) 2001, Peter J. Welcher