|
In the last week I talked to two different customers about issues with their port-channels, vPCs, and Spanning Tree Protocol (STP). The first customer thought he had an STP issue between his N7Ks and his N5Ks, but he actually had a vPC design issue. The second customer had the same design issue, but did not know about it until I pointed it out to him. Since both had the same issue, I thought I would write up a summary.
Broken Design
The broken design looks like the following diagram:

The same set of VLANs is carried on both port-channels. Do you see the issue?
Both customers wanted a lot of bandwidth between the N7K and the N5K pairs. However, their networks were not forwarding traffic on all the links. The output of the show spanning-tree vlan 10 command on each switch should help illustrate the problem:
N7K1# sh span vlan 10
VLAN0010
  Spanning tree enabled protocol rstp
  Root ID    Priority    32778
             Address     0005.73ba.8abc
             Cost        1
             Port        4109 (port-channel14)
             Hello Time  2 sec  Max Age 20 sec  Forward Delay 15 sec
  Bridge ID  Priority    32778 (priority 32768 sys-id-ext 10)
             Address     0026.980a.8d42
             Hello Time  2 sec  Max Age 20 sec  Forward Delay 15 sec
Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Po1              Desg FWD 1         128.4096 (vPC peer-link) Network P2p
Po11             Root FWD 1         128.4109 (vPC) P2p
Po12             Altn BLK 1         128.4110 (vPC) P2p
. . .
N7K1#
N7K2# sh span vlan 10
VLAN0010
  Spanning tree enabled protocol rstp
  Root ID    Priority    32778
             Address     0005.73ba.8abc
             Cost        2
             Port        4096 (port-channel1)
             Hello Time  2 sec  Max Age 20 sec  Forward Delay 15 sec
  Bridge ID  Priority    32778 (priority 32768 sys-id-ext 10)
             Address     f866.f210.8ea2
             Hello Time  2 sec  Max Age 20 sec  Forward Delay 15 sec
Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Po1              Root FWD 1         128.4096 (vPC peer-link) Network P2p
Po11             Root FWD 1         128.4109 (vPC) P2p
Po12             Altn BLK 1         128.4110 (vPC) P2p
. . .
N7K2#
N5K1# sh span vlan 10
VLAN0010
  Spanning tree enabled protocol rstp
  Root ID    Priority    32778
             Address     0005.73ba.8abc
             This bridge is the root
             Hello Time  2 sec  Max Age 20 sec  Forward Delay 15 sec
  Bridge ID  Priority    32778 (priority 32768 sys-id-ext 10)
             Address     0005.73ba.8abc
             Hello Time  2 sec  Max Age 20 sec  Forward Delay 15 sec
Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Po10             Desg FWD 1         128.4096 (vPC peer-link) Network P2p
Po11             Desg FWD 1         128.4109 P2p
. . .
N5K1#
N5K2# show span vlan 10
VLAN0010
  Spanning tree enabled protocol rstp
  Root ID    Priority    32778
             Address     0005.73ba.8abc
             Cost        1
             Port        4096 (port-channel1)
             Hello Time  2 sec  Max Age 20 sec  Forward Delay 15 sec
  Bridge ID  Priority    32778 (priority 32768 sys-id-ext 10)
             Address     0005.73bc.de3a
             Hello Time  2 sec  Max Age 20 sec  Forward Delay 15 sec
Interface        Role Sts Cost      Prio.Nbr Type
---------------- ---- --- --------- -------- --------------------------------
Po10             Root FWD 1         128.4096 (vPC peer-link) Network P2p
Po12             Desg FWD 1         128.4110 P2p
. . .
N5K2#
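Logical View
The design issue is that the two port-channels between the N7K and the N5K layers create a Layer 2 loop: traffic can leave the N7K pair on Po11 to N5K1, cross the N5K peer-link (Po10) to N5K2, and return on Po12. STP is appropriately blocking one of the two port-channels (Po12 in the output above) to break that loop.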

Please note that replacing the port-channels on the N5Ks with two separate vPCs will NOT resolve the issue. If the same VLANs are carried on both vPCs, STP will still block those VLANs on one of them.
Recommended Design
What both customers need to do is put all the links between the two Nexus pairs into one large vPC. With this small design change, all links between the devices will be forwarding:
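As a rough sketch of what the single back-to-back vPC could look like in configuration (this is not taken from either customer's switches): the member interface numbers are hypothetical, the use of vPC 11 on the N5K side and of LACP (mode active) are my assumptions, and the vPC domain and peer-links (Po1 on the N7Ks, Po10 on the N5Ks) are assumed to be in place already.

```
! N7K1 and N7K2: every link toward the N5K pair goes into one vPC
interface port-channel11
  switchport
  switchport mode trunk
  vpc 11

! Hypothetical member ports: one link to N5K1, one link to N5K2
interface Ethernet1/1-2
  switchport
  switchport mode trunk
  channel-group 11 mode active

! N5K1 and N5K2: every link toward the N7K pair goes into one vPC
interface port-channel11
  switchport mode trunk
  vpc 11

! Hypothetical member ports: one link to N7K1, one link to N7K2
interface Ethernet1/1-2
  switchport mode trunk
  channel-group 11 mode active
```

The point of the sketch is that each tier has only one vPC number facing the other tier, so STP sees a single logical link and has nothing to block.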

vPC and Port-Channel Operations
I offer some observations from my migration tests of vPCs and port-channels on Nexus devices, in case you ever need to migrate port-channels to vPCs on a production network:
- When a single interface is configured as a port-channel, the interface resets. (I saw a loss of 4 packets on a long ping test across the link, and the log file shows the interface down for 5 seconds.)
- When an interface is added to an operational port-channel, the interface resets but the port-channel and existing interfaces are not reset.
- When a port-channel is reconfigured as a vPC, the port-channel and its interfaces reset. This is disruptive: even when trying to minimize the impact, I saw a loss of 7 packets in my tests. (I had configured the secondary switch in the vPC pair with the member port shut down, put it in the vPC, then configured the primary switch. The primary switch's link went down. I hopped back to the secondary switch, did a no shut on the interface, and saw that the secondary switch's link was up in 7 seconds. In my test, the primary switch's link came up 15 seconds after it initially went down.)
- When a vPC is not configured on both peers, the VLANs on the port-channel are suspended. When I removed the port-channel from the secondary switch and then reapplied it, the physical port on the primary vPC switch never went down, but while the VLANs were suspended, devices lost connectivity.
- When a peer member link is added to an existing vPC (such as enabling the right link when the left link is up), the left link stays up.
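To watch for the behaviors listed above during a migration, the following standard NX-OS show commands can be run on both vPC peers before and after each step. The port-channel number 11 and VLAN 10 come from the outputs earlier in this post; substitute your own.

```
! vPC role, peer-link state, and per-vPC status and consistency
show vpc brief
! Parameters that must match on both peers for this vPC
show vpc consistency-parameters interface port-channel 11
! Which member ports are bundled or down in each port-channel
show port-channel summary
! Whether any port-channel is still blocking for the VLAN
show spanning-tree vlan 10
```

Checking the vPC status on both peers is the quickest way to catch the suspended-VLAN case, since the physical port can stay up while the VLANs are not forwarding.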
Minimizing Downtime
My customers have not yet had a chance to schedule a maintenance window to correct their designs. Depending on which port-channel is blocked, they may take a slightly different order in the steps to migrate to the back-to-back vPC. In either case, they will need to create the new vPC on both N5Ks and move the ports into the new vPC on N5K2. They will also have to move the ports out of vPC 12 into vPC 11 on the N7Ks.
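A minimal sketch of the port moves, assuming Po12 is the blocked port-channel (as in the VLAN 10 output above), so its members carry no traffic while they are moved. The interface numbers are hypothetical, LACP (mode active) is an assumption, and port-channel11 with vpc 11 is assumed to exist already on both N5Ks; treat the ordering as one possibility to validate in a lab, not a tested procedure.

```
! N7K1 and N7K2: move the former Po12 members into Po11 (vPC 11)
! Ethernet1/3-4 is a hypothetical stand-in for the real member ports
interface Ethernet1/3-4
  shutdown
  no channel-group
  channel-group 11 mode active
  no shutdown

! N5K2: move the uplinks out of Po12 into the new vPC
! Ethernet1/1-2 is a hypothetical stand-in for the real uplinks
interface Ethernet1/1-2
  shutdown
  no channel-group
  channel-group 11 mode active
  no shutdown
```

Shutting the member ports before changing the channel-group keeps them from coming up in a half-configured state on one side only.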
-- cwr
_____________________________________________________________________________________________
If you would like some additional details on vPCs, the following references may be helpful:
 |