QoS in the Campus
In the previous months' articles, we talked about changes in Cisco certifications. I'd be interested in hearing what you think about all this. Are you interested in CCNP? CISS? CCIP? Which ones look most interesting and useful?
[The following out-of-date content was struck through:]
This info would admittedly be useful to Mentor Technologies in figuring out what courses to offer, but also would help me know what topics to write about in coming articles. My list of personally interesting topics right now looks like: IPv6, IS-IS routing protocol, All Things Multicast (series), PIX (I've been needing to look at them more closely for a while now). I've also thought about doing a series of articles walking through what's needed for CCNA. I've personally been immersing myself in MPLS lately, but am so far successfully resisting the urge to write about nothing but MPLS. What are you interested in reading about? Do let me know!
If you are interested in the CCIP certification, Mentor is strongly considering offering the base courses for CCIP:
- Routing, BSCI (IS-IS added to BSCN)
- IP multicast, MCAST
- Deploying QoS, DQOS
- Implementing MPLS, MPLS (not to be confused with the course Implementing MPLS for BGX and MGX Networks)

Watch our web pages for more info and class schedules!
I've had a request to write about QoS in the Campus, something I gave short shrift to in the earlier QoS articles. That's the topic for this month's article. Prior QoS articles can be found at http://www.netcraftsmen.net/welcher/papers . The most recent QoS article is at http://www.netcraftsmen.net/welcher/papers/newqos121.html .
Why Do QoS in the Campus?
I'd always mentally tuned out Campus QoS. After all, can't you just "throw bandwidth at the problem"? Bandwidth in the campus is relatively inexpensive. Yes, you can have queues fill up in switches. But even with full queues and queuing delays, how much latency are you going to pick up? Finally, the words "head-of-line blocking" started seeming a lot more important to me.
The biggest reason for QoS in campus switches, as I see it, is that bursts can temporarily fill up hardware queues. Yes, this delay is extremely short-lived. But while a queue is full, packets are getting dropped. If those are voice packets, that's ungood! Thus, campus QoS is about protecting some traffic from drops by getting it into different queues. The number of queues depends on the hardware.
There is another plus to doing QoS in campus switches. Right now, you can do packet marking, whereby you set the IP Precedence or DSCP bits, at the WAN edge router. As the world moves to optically based, higher-speed networks, where you have a FastEthernet or Gigabit Ethernet link to a MAN IP service provider, doing this might become a choke point. It is arguably better to do it at the LAN edge, to try to ensure consistent treatment of voice, video, or mission-critical traffic. This leads to what I think of as the "insurance" argument for QoS. Namely:
Campus switches and high-speed LAN segments enable unicast or multicast of video and other multimedia services. People also start doing interesting and surprising things when you give them the bandwidth. Things like using Windows Explorer to drag and drop a disk drive to a server disk, as a means of backup. (Hey, how do you back up your home computer? Office desktops?) Do you want mission critical database traffic, or VoIP traffic, adversely affected because someone did something like this, something you didn't expect? Do you know every use that's being made of your network? Every file transfer people are doing? If you're not convinced, think about the impact of things like Doom and then Napster on networks.
Using QoS allows you to ensure bandwidth goes to the important things first, and that the rest of the traffic coexists on the link. It means you can go ahead and use all that bandwidth, since your important traffic will be protected by the QoS.
Principles of Campus QoS
Traffic in a campus switch is either going from a port (workstation or phone) to the network, or in the other direction. Some Cisco documents refer to this as user-to-fabric (Rx queue) and fabric-to-user (Tx queue) traffic. The design expectation is that a non-blocking switch architecture, together with good design, means that you won't need much queuing or buffering going from users and servers to the network. Generally this is the case, but if you're not monitoring traffic regularly, can you be sure it is? My concern here is oversubscription. Say you have 48 100 Mbps ports going into a switch, and one Gigabit Ethernet (GbE) trunk out to the higher-level switches (assuming a hierarchical design). If those 100 Mbps ports are all running at 30% utilization, you've got 1.44 Gbps of traffic and only a 1 Gbps trunk. Admittedly, sustaining 30 Mbps on each of 48 ports might take some doing in most shops to date.
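The oversubscription arithmetic above is worth sketching out (a toy calculation, using the hypothetical 48-port example from the text):

```python
# Hypothetical example from the text: 48 access ports at 100 Mbps
# feeding a single 1 Gbps (GbE) uplink trunk.
PORTS = 48
PORT_SPEED_MBPS = 100
UPLINK_MBPS = 1000  # one GbE trunk

def offered_load_mbps(utilization: float) -> float:
    """Aggregate traffic offered by all access ports at a given
    average utilization (0.0 to 1.0)."""
    return PORTS * PORT_SPEED_MBPS * utilization

load = offered_load_mbps(0.30)   # 1440 Mbps = 1.44 Gbps
oversub = load / UPLINK_MBPS     # 1.44:1 against the 1 Gbps uplink
```

Even modest average utilization per port adds up to more than the trunk can carry, which is exactly when the Tx queues on the trunk start to matter.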
The bigger concern is traffic going the other way. One situation of concern is many-to-one congestion, say where a number of users burst in demands to a server. The other is speed mismatch, where a burst for a user arrives on a GbE trunk and has to be sent out a FastEthernet port. This is where you definitely expect to need queuing and buffering.
The next question is, what might we expect the switch to do about it? In previous QoS articles, we've seen the basic QoS operations as:
- Classify and Mark
- Police and Shape
- Queuing and Scheduling
- Congestion Avoidance and Queue Management
Now if you've got a Layer 2 switch, all it knows about is Layer 2 frames and MAC addresses. So classifying has to be based on port or MAC address. And marking: well, where in a frame can we put some QoS bits? The answer is, we can't, unless we're doing ISL or 802.1Q trunking. Both ISL and 802.1Q/p allow for 3 bits to be stored in the Layer 2 header. The Cisco documentation refers to these as the CoS (Class of Service) bits. I personally found this a bit confusing until I caught on to this convention: when reading Cisco switch documentation, CoS means the Layer 2 frame header bits. Unless the port is trunking, there are no CoS bits. I'll spare you frame diagrams (see the Cisco white papers below). There are three (3) CoS bits, so the CoS value is between 0 and 7.
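To make the bit layout concrete, here's a small sketch of pulling the CoS bits out of the 16-bit 802.1Q Tag Control Information (TCI) field. The field layout (3 priority bits, 1 CFI bit, 12 VLAN ID bits) comes from the 802.1Q/p standards, not from this article:

```python
def cos_from_tci(tci: int) -> int:
    """The top 3 bits of the 802.1Q TCI carry the CoS (802.1p priority), 0-7."""
    return (tci >> 13) & 0x7

def vlan_from_tci(tci: int) -> int:
    """The low 12 bits of the TCI carry the VLAN ID."""
    return tci & 0x0FFF

# Build the TCI an IP phone might use: CoS 5 on a voice VLAN (say, VLAN 200).
tci = (5 << 13) | 200
assert cos_from_tci(tci) == 5
assert vlan_from_tci(tci) == 200
```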
Generally, inbound ports have no intelligence and either recognize or ignore CoS bits. The CoS bits are generally copied to an internal header so the information is at least available within a switch. If the outbound port is a trunk, the CoS bits are generally put into the appropriate frame header field.
Layer 3 switches can deal with various protocols. For IP traffic, the IP Precedence (0-7) or DiffServ (DSCP, 0-63) interpretation of the ToS field in the IP header is another way of marking, discussed in previous articles. Layer 3 switches can take CoS bits and use those to set the IP Precedence or DSCP bits, or vice versa, generally by means of a mapping table. For non-IP traffic, all we've got is the CoS bits to work with. Layer 3 switches will generally set CoS or other bits on the outbound port or interface. They may also set the IP Precedence or DSCP bits.
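As an illustration of such a mapping table, here's the common default-style mapping (CoS times 8 gives the DSCP; the top three DSCP bits, i.e. the IP Precedence, map back to CoS). Actual defaults vary by platform and software version, so treat this as a sketch:

```python
def cos_to_dscp(cos: int) -> int:
    """Default-style mapping: CoS n maps to DSCP 8n (0, 8, 16, ..., 56)."""
    return cos * 8

def dscp_to_cos(dscp: int) -> int:
    """The high-order 3 bits of the DSCP are the IP Precedence, reused as CoS."""
    return dscp >> 3

assert cos_to_dscp(5) == 40   # voice CoS 5 maps to DSCP 40 (CS5)
assert dscp_to_cos(46) == 5   # EF (DSCP 46) maps back to CoS 5
```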
Switches with higher level QoS just use upper protocol stack layers to classify in a more precise fashion.
Another key concept is that of trust. Does the switch trust or believe any CoS or IP Precedence or DSCP bits sent to it? Note that if the PC's NIC driver is 802.1Q capable, both the CoS and IP Precedence bits can be set by an end user's PC. Do you want your savvy users upping their IP Precedence to 5, the same as Cisco VoIP traffic? There are rumors that under Win2000, a well-known browser that experiences congestion sets IP Precedence to 5. This is another policy issue you're going to have to decide.
The Cisco IP telephones also play a role in all this. The newer phones are actually 3 port switches. Enhanced CDP and switch configuration allow us to clue the phone in on the auxiliary VLAN (voice vlan). The phones can then send VoIP traffic with 802.1Q headers indicating that VLAN, along with CoS bits set to 5, and IP Precedence 5 within the IP header. When the phones pass through traffic from a connected PC, the PC traffic can be converted to 802.1Q, labelled with the native VLAN for the port, and can have a specified CoS written to the frame. If you so choose.
If you follow this logic, switches become a very natural place to be doing QoS. Where else are you going to be able to readily distinguish your PC traffic from your VoIP traffic? Yes, routers can use access lists to distinguish the two, particularly if your IP phones get addresses out of a different address block. But by classifying and marking the VoIP traffic with CoS bits, you can guarantee it expedited handling through the campus LAN.
Queues are the other thing to think about with switches. What we'd like to do is use the CoS bits to put traffic into different queues. Many of the newer Cisco switching blades now support at least two transmit queues. When there are two queues, one might have priority, or a form of weighted round robin (WRR) might be used. (Custom Queuing is a form of WRR, if you're familiar with it.) So if you're shopping for Cisco switch hardware QoS support, look at the receive and especially the transmit queues and how they behave.
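As a rough sketch of how WRR shares a transmit port: each queue gets served up to its weight's worth of frames per cycle. This is a simplification; real hardware typically weights by bytes or time slices rather than frame counts:

```python
from collections import deque

def wrr_schedule(queues, weights):
    """Weighted round robin: in each cycle, send up to `weight` frames
    from each queue. Returns the order frames leave the port."""
    out = []
    while any(queues):
        for q, w in zip(queues, weights):
            for _ in range(w):
                if q:
                    out.append(q.popleft())
    return out

high = deque(["H1", "H2", "H3"])
low = deque(["L1", "L2", "L3", "L4"])
order = wrr_schedule([high, low], weights=[2, 1])
# The high queue drains roughly twice as fast as the low queue,
# but the low queue is never completely starved.
```

Contrast this with a strict priority queue, which would send all of H1-H3 before any L frame; the priority queue on newer blades is reserved for traffic like VoIP (CoS 5).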
The other function Cisco has been building into switch queues is queue management. This might be full Weighted Random Early Detection, WRED, described in previous articles. In WRED, the CoS or Precedence bits get used to differentially randomly start dropping packets or frames as the queues begin to fill up. The other approach that is being used in Cisco switches is multiple thresholds for tail drop. As I understand it, under this scheme, when each threshold is reached, all frames below a certain priority level will be dropped. Details do vary with switch model and blade.
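A toy model of the multiple-threshold tail drop idea, loosely based on the 80%/100% Tx thresholds that appear on 2q2t-style queues (the thresholds and CoS split here are illustrative, and real behavior varies by blade):

```python
def admit(frame_cos: int, queue_depth: int, queue_size: int) -> bool:
    """Multi-threshold tail drop: low-CoS frames (0-3) are dropped once
    the queue is 80% full; high-CoS frames (4-7) are only dropped when
    the queue is completely full."""
    fill = queue_depth / queue_size
    if frame_cos <= 3:
        return fill < 0.80
    return fill < 1.00

assert admit(0, 79, 100)        # low CoS, below the 80% threshold
assert not admit(0, 80, 100)    # low CoS, at 80%: tail dropped
assert admit(5, 80, 100)        # high CoS still admitted past 80%
assert not admit(5, 100, 100)   # queue full: everything drops
```

The effect is that as congestion builds, best-effort traffic gets dropped first, leaving the remaining buffer headroom for the higher-priority classes. WRED works toward the same goal, but drops probabilistically as the queue fills rather than at hard cutoffs.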
I've built the following table from various Cisco information sources. Any errors are my fault. The information may be out of date (and will certainly become out of date over time), so treat it with caution. Everything I'm seeing suggests there may be more queues coming, and with more features.
Ports and Queues, by switch family:

2900 XL, 3500
- 10/100 ports: high and low priority queues. CoS 0-3 go to low, 4-7 go to high.
- GbE ports: 8 queues, with only high and low in use right now, same as the 10/100 ports.
- Frames without 802.1Q or ISL headers can be classified and assigned an internal CoS value. 802.1Q or ISL frames cannot be reclassified; there is no notion of untrusted ports. The internal CoS value is written to the frame on outbound trunk ports.

4000, 2948G, 2980 (CatOS)
- Queues and classification: similar to the above.
- If QoS is enabled, all traffic goes into queue 1 by default. Recommended: modify the default CoS-to-transmit-queue mappings.
- Check with: show port capabilities

4000 Layer 3 switch WS-X4232; 2948 L3, 4908 L3, 85xx
- The high-order IP Precedence bits determine which of 4 queues is used, with WRR between the queues.
- No reclassification or L3 remarking.
- Line cards have 1 queue with 4 thresholds, at 30%, 50%, 80%, and 100%.
- Check with: show port capabilities

5000 (SAINT, NFFC II)
- SAINT5-based cards can mark an internal CoS, which is then set on an outgoing trunk port.
- NFFC II can associate internal CoS with a MAC address, and can examine/set IP Precedence. With certain egress line cards (5225R, 5201R, and others), it can set CoS to match the IP Precedence bits.

6000
- "Non-1A" 10/100 cards: 1 Rx queue with 4 tail drop thresholds (1q4t), 2 Tx queues with 2 tail drop thresholds each (2q2t). Rx thresholds are at 50%, 60%, 80%, 100% (for CoS 0-1, 2-3, 4-5, 6-7); Tx thresholds at 80% and 100%. "Non-1A" GbE: 1 Rx, 2 Tx, more buffering.
- Trust: GbE ports can use "set port qos trust". For 10/100 ports, define a QoS ACL, then "set port qos ... trust trust-cos".
- 1A cards: 2 Rx, 3 Tx queues. One Rx and one Tx queue is strict priority for VoIP, etc. (CoS 5); in summary, 1p1q4t and 1p2q2t. WRR on the non-priority queues. Tail drop, except that the 1A version of GbE does WRED. There are also cards with Rx 1p1q0t and Tx 1p3q1t.
- Without a PFC: CoS can be copied to the internal header, and CoS gets set upon trunk port egress. With PFC/MSFC: CoS and ToS bits can be read and changed.
- The 6000 also supports policing, both microflows and aggregates, and distributed CAR for the WAN.
- Check with: show port capabilities
The following are good overview public links to CCO about QoS in switches. They are all marketing whitepapers, but do contain some useful technical details, or ideas on how to use QoS in campus switches. The first of them has a nice chart about which switches support which QoS features.
The other good reference for QoS in switches is of course the specific CCO documentation for each switch model. Once you've gotten the big picture (above!), this becomes a lot more accessible. So you might take a look at the following:
2900 XL, 3500: http://www.cisco.com/univercd/cc/td/doc/product/lan/c2900xl/29_35wc/sc/swgports.htm
4000, 2948G, 2980: http://www.cisco.com/univercd/cc/td/doc/product/lan/cat4000/rel_6_1/conf/qos.htm
2948 G L3: http://www.cisco.com/univercd/cc/td/doc/product/l3sw/2948g-l3/rel_12_0/7wx515a/config_g/qos_sum.htm
Other L3 switches: look under http://www.cisco.com/univercd/cc/td/doc/product/l3sw/index.htm
Of these, only the 5000 and 6000 series QoS documentation will take much time to read.
Configuring QoS in Switches
Out of space! This looks like it'll have to wait for another article. It is also very switch- and even blade-specific.
QoS Policy Manager is of course one way of applying QoS policy consistently across a group of Cisco devices.
Updated 11/2/2005. Copyright (C) 2001, Peter J. Welcher.