
Next Hop Resolution Protocol

Peter J. Welcher


Introduction

This month we'll take a look at Next Hop Resolution Protocol (NHRP). I started looking at this protocol a few months back, and then started thinking, "well, nobody has a switched cloud big enough for this to be interesting." But a couple of things have changed: switched WAN clouds are becoming available, and MPOA is on the way. Let's take those in order, to see why NHRP might be something we need to know about.

WAN Clouds

Most Frame Relay networks currently seem to be star topology, using Permanent Virtual Circuits (PVC's). Perhaps there are several stars, connected with a LAN backbone. This reduces the number of peers the hub router has to talk to. And it works well as long as most traffic is going to Headquarters (HQ), at the hub of the star.

But when traffic starts going between random pairs of locations, there's a price to pay in star topologies. Traffic between remote sites must go to the hub router and back. If there are multiple hub routers, traffic must go to one router, across the LAN, then back out from the other router. This incurs store and forward delay.

That's fine as long as the remote site traffic is low volume, or goes to and from Headquarters. But if the remote sites need to transfer more substantial amounts of information, doesn't it make more sense to send traffic directly between them? That's where Frame Relay Switched Virtual Circuits (SVC's) come in. They provide on-demand connectivity between sites. This can potentially be used to supplement or replace the star of PVC's to HQ. And it can potentially do "short-cut switching", where the Frame Relay switches carry the traffic directly between sites, instead of carrying it to HQ and then back out. That's where NHRP comes in.
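
To put rough numbers on that trade-off, here is a minimal Python sketch. The function names are mine and purely illustrative; it simply counts the circuits needed for a star versus a permanent full mesh, and the circuits a remote-to-remote packet crosses.

```python
def star_pvcs(sites: int) -> int:
    """PVC's needed when every remote site connects only to one hub."""
    return sites - 1  # 'sites' counts the hub too


def full_mesh_pvcs(sites: int) -> int:
    """PVC's needed if every pair of sites had its own permanent circuit."""
    return sites * (sites - 1) // 2


def remote_to_remote_circuits(shortcut: bool) -> int:
    """Circuits crossed between two remote sites: 2 via the hub, 1 direct."""
    return 1 if shortcut else 2


if __name__ == "__main__":
    for n in (26, 101):  # a hub plus 25 or 100 remote sites
        print(n, star_pvcs(n), full_mesh_pvcs(n))
```

A permanent full mesh grows quadratically with the number of sites, which is one way to see the appeal of on-demand SVC's: the star can stay in place, and direct circuits exist only while traffic needs them.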

This need not be limited to Frame Relay. It's also possible to do SVC's in ATM. And (depending on your carrier), it may soon be possible to mix and match Frame Relay over ISDN, Frame Relay, and ATM for low, medium, and high speed sites -- but that's another article! (Cisco / Stratacom calls this internetworking, if you want to look at the documentation CD).

Finally, researchers are looking at how to efficiently do multicast over NBMA clouds. NHRP may play a role in this.

To add a dose of reality here: AT&T and other companies are beginning to offer WAN ATM services. MCI is offering Frame Relay SVC's.

Cisco IOS Release 11.2 supports Frame Relay SVC's. It supports NHRP for IP and IPX over ATM, SMDS, and multipoint tunnels (but not Frame Relay SVC's). The NHRP draft does discuss encapsulating NHRP for Frame Relay.

So the technical and service pieces are falling into place. What isn't so clear yet (to me, at least) is the financial piece: are there sufficient cost savings to warrant the additional configuration hassle of SVC's? Nor is the design piece of this clear: when are WAN PVC's appropriate, and when are WAN SVC's more appropriate?

MPOA

There's another situation where NHRP is in our future, however. The Multi-Protocol Over ATM (MPOA) standard extends LAN Emulation (LANE) in networks with ATM LAN backbones interconnecting switches. MPOA provides for short-cut switching between Emulated LAN's (ELAN's), using a route server. In other words, MPOA allows the switches to take traffic off one VLAN or subnet and, in effect, route it to another without going through a router (well, more than once). This bypasses the approach typically used now: router on a stick (or RSM cards in Cat5000's, routing at the edges). Some vendors really think this approach is the way to go (possibly because they don't have a competitive router product?). Cisco apparently plans to offer MPOA, although a quick search of their Web page led to amazingly few relevant hits, and none that said anything specific.

At the heart of MPOA, we also discover NHRP, slightly modified.

So What Does NHRP Do?

If you're doing large scale Frame Relay or ATM, you might have anywhere from 25 to 100 routers in a PVC star off each hub router interface. And there may be a number of such hub routers or interfaces, perhaps on a LAN backbone, although that goes beyond the scope of this article.

Each hub interface leads to one NBMA subnet in the WAN cloud. Or, if you've been listening to my advice, you've used subinterfaces and so each physical hub interface contains many different subnets, probably VLSM microsubnets. The PVC star gives the routers some basic ability to talk to each other, pass routing information, and so on. But note that remote routers c2500-A1 and c2500-A2 or c2500-A1 and c2500-B1 in the picture below do not by default have mapping or routing information that allows them to pass packets directly between them.

One reason for this is that dynamic routing tells c2500-A1 the next hop for destination D is the hub router BigRouterA, and so there's no dynamic way to get it to want to directly pass packets destined for D to c2500-A2. Since c2500-A1 and A2 don't talk directly via PVC, there's no way for A1 to receive routing updates from A2.

That can be fixed by putting in a static route. But that doesn't scale very well; it's too labor-intensive. In addition to this, there's the minor problem of entering a "phone book", putting in the appropriate SVC address ("phone number") for each possible remote site.

So we build up a table of static routes to all possible destinations behind possible SVC peers, and we build up a phone book of the data layer addresses ("phone numbers") of all the SVC peers. At least we can cut and paste it into all the routers. Still a rather tedious job. There's got to be a better way!
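
To make the two hand-built tables concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the prefixes, the NBMA "phone numbers", and the idea that the lookup is a longest-prefix match over the static routes. Only the router names come from the article.

```python
import ipaddress

# Hypothetical data: the prefixes and "phone numbers" below are made up.
STATIC_ROUTES = {
    ipaddress.ip_network("10.2.0.0/16"): "c2500-A2",
    ipaddress.ip_network("10.3.0.0/16"): "c2500-B1",
}
PHONE_BOOK = {
    "c2500-A2": "nbma-addr-A2",
    "c2500-B1": "nbma-addr-B1",
}


def resolve(destination: str):
    """Return (peer, NBMA address) for a destination, or None to use the hub."""
    addr = ipaddress.ip_address(destination)
    # Longest-prefix match over the hand-entered static routes.
    matches = [net for net in STATIC_ROUTES if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    peer = STATIC_ROUTES[best]
    # Second lookup: the "phone number" of that peer.
    return peer, PHONE_BOOK[peer]
```

Both dictionaries would have to be kept in step on every router that might open an SVC, which is exactly the tedium described above.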

NHRP solves the first of these problems. Some of you may recognize this as an ARP problem. Classical IP has ATM ARP. "So why is NHRP needed?" you ask. Well, ATM ARP has one server per subnet. It doesn't cross subnet boundaries. NHRP allows you to cross LIS (logically independent subnet) boundaries.

NHRP finds the exit point from the cloud by following the routing path. This entails sending a NHRP request directly on the data layer, specifying the desired destination. (It does not encapsulate the NHRP in IP, which is what I read the Cisco documentation as saying). So to find a short cut to D, c2500-A1 sends a NHRP request for D. (There is a special NLPID and SNAP code, matching the usual NBMA encapsulation). Router c2500-A2 reports back to c2500-A1 with its NBMA link address.
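
Here is a rough Python sketch of that hop-by-hop lookup. It is a toy model, not how any router implements it: I'm assuming each router has a single next hop (the "default" field) rather than a real routing table, and the subnet and NBMA addresses are invented. The router names follow the article.

```python
import ipaddress

# Hypothetical topology: names follow the article, addresses are invented.
ROUTERS = {
    "c2500-A1": {"nbma": "nbma-A1", "local": [], "default": "BigRouterA"},
    "BigRouterA": {"nbma": "nbma-hubA", "local": [], "default": "c2500-A2"},
    "c2500-A2": {"nbma": "nbma-A2", "local": ["10.2.0.0/16"], "default": None},
}


def serves(router: str, dest: str) -> bool:
    """True if dest lies on a subnet directly behind this router."""
    addr = ipaddress.ip_address(dest)
    return any(addr in ipaddress.ip_network(n) for n in ROUTERS[router]["local"])


def nhrp_resolve(requester: str, dest: str, max_hops: int = 16):
    """Follow the routed path until the egress router answers.

    Returns (egress router, its NBMA address, routers the request visited),
    or None if nobody on the path serves the destination.
    """
    path = []
    current = ROUTERS[requester]["default"]
    while current is not None and len(path) < max_hops:
        path.append(current)
        if serves(current, dest):
            return current, ROUTERS[current]["nbma"], path
        current = ROUTERS[current]["default"]
    return None
```

With destination D taken to be 10.2.1.1, nhrp_resolve("c2500-A1", "10.2.1.1") passes through BigRouterA and comes back with c2500-A2's NBMA address, which c2500-A1 can then use to open a direct circuit.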

This doesn't work unless most or all routers along that path are NHRP-capable and have NHRP enabled. Such routers are called Next Hop Servers (NHS's). The Next Hop Server keeps a cache of information (essentially, taking notes as requests or replies go by). The cache consists of a mapping table of protocol and NBMA link addresses.

In addition to the dynamically learned entries, the router's NHRP cache can also be statically configured. This seems somewhat similar to DNS, in that a router is authoritative for a certain range of IP addresses, and can act much like an ARP server. While this is static, it is certainly simpler than having to put the static information on a large number of devices.
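
As a sketch of what such a cache might look like, here is a toy Python version. The class and method names are hypothetical, and letting static entries win over dynamic ones is my assumption, not something the NHRP draft or Cisco documentation is being quoted on.

```python
import ipaddress


class NhrpCache:
    """Toy Next Hop Server cache: protocol address -> NBMA link address."""

    def __init__(self):
        self.static = {}   # ip_network -> NBMA address, entered by hand
        self.dynamic = {}  # ip_address -> NBMA address, learned from traffic

    def add_static(self, prefix: str, nbma: str) -> None:
        """Configure a range of addresses this server answers for."""
        self.static[ipaddress.ip_network(prefix)] = nbma

    def learn(self, address: str, nbma: str) -> None:
        """Record a mapping seen in a passing request or reply."""
        self.dynamic[ipaddress.ip_address(address)] = nbma

    def lookup(self, address: str):
        """Return (NBMA address, 'static' or 'dynamic'), or None if unknown."""
        addr = ipaddress.ip_address(address)
        for net, nbma in self.static.items():
            if addr in net:  # assumption: static entries take precedence
                return nbma, "static"
        if addr in self.dynamic:
            return self.dynamic[addr], "dynamic"
        return None
```

The static entries live on one server only, which is the point made above: one table to maintain instead of one per remote router.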

Limitations

Currently NHRP cannot cross the Ethernet between BigRouterA and BigRouterB. There's a basic problem here: how do the routers know that the ATM clouds are the "same" cloud or different service providers? NHRP requests and replies cannot leave the NBMA cloud. This is research in progress.

Routers receiving replies from other routers are at risk of creating routing loops if there are "back doors" -- non-NBMA routes between the end subnets. This is a known problem and is supposedly being worked on.

The NHRP draft document talks about concern about a "domino effect", whereby NHRP requests trigger a cascade of such requests. Good point, interesting "gotcha" for a sloppy implementor. The practical point: the implementor has a choice of what to do. One idea is to only trigger NHRP requests off packets received on non-NBMA interfaces. Another is to control the rate at which NHRP requests are generated. In particular, every packet for a particular destination should not trigger a NHRP request!
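
A minimal Python sketch of two of those ideas follows: only packets from non-NBMA interfaces may trigger a request, and a destination with a request already outstanding does not trigger another. The class and its names are hypothetical; a real implementation would also need timeouts and a rate limit.

```python
class RequestPolicy:
    """Decide whether a data packet may trigger an NHRP request."""

    def __init__(self):
        self.pending = set()  # destinations with a request already outstanding

    def should_request(self, dest: str, arrived_on_nbma: bool) -> bool:
        # Packets that arrived over the NBMA cloud never trigger, so a
        # transit router does not add its own request to the cascade.
        if arrived_on_nbma:
            return False
        # One request per destination, not one per packet.
        if dest in self.pending:
            return False
        self.pending.add(dest)
        return True

    def reply_received(self, dest: str) -> None:
        """Clear the outstanding marker when a reply (or timeout) arrives."""
        self.pending.discard(dest)
```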

There's a question of what a router is to do with a triggering packet when it sends out an NHRP request. It could drop it, it could save it in the hope a NHRP reply will provide a better path, or it can forward the packet using normal routing. The third of these is the recommended default. Since the packet can be forwarded, there's no point in dropping it. And since NHRP may or may not provide a reply, especially if there's a NHRP request rate limitation (see below), it's probably better not to wait. That way NHRP accelerates long-lived flows, and may or may not be useful for the short-lived ones.
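
The recommended default can be sketched in a few lines of Python. This is an illustration under my own assumptions: the function name, the shortcuts dictionary, and the send_request callback are all hypothetical.

```python
def forward(dest: str, shortcuts: dict, routed_next_hop: str, send_request):
    """Pick the next hop for one packet.

    shortcuts maps destination -> NBMA address learned from NHRP replies.
    send_request is a callback that launches an NHRP request (hypothetical).
    """
    if dest in shortcuts:
        return ("shortcut", shortcuts[dest])
    # No answer yet: ask, but neither hold nor drop the packet.
    send_request(dest)
    return ("routed", routed_next_hop)
```

The first packets of a flow go by way of the hub; once a reply has populated the shortcuts table, later packets take the direct path. A short flow may finish before that happens, which is the "may or may not be useful" case.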

Configuring NHRP

Details are in the IP Routing section of the Cisco Configuration Command Guide manual (and the corresponding Reference manual).

Start by configuring interfaces with which logical NBMA cloud they belong to:

ip nhrp network-id number

You also have to tell the router when to send off a NHRP query. Use the interface command:

ip nhrp interest acl-number

with the number of a standard or extended access list.

You may instead trigger NHRP (on an interface) when there are a certain number of packets for a destination:

ip nhrp use number
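
As a way of picturing the packet-count trigger, here is a toy Python counter. It is my own simplification: it counts packets per destination forever, and ignores any time window or reset behavior the real implementation may apply.

```python
from collections import Counter


class UsageTrigger:
    """Attempt NHRP only after `threshold` packets toward a destination."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.seen = Counter()  # packets counted so far, per destination

    def packet(self, dest: str) -> bool:
        """Count one packet; return True exactly when the threshold is reached."""
        self.seen[dest] += 1
        return self.seen[dest] == self.threshold
```

With a threshold of 3, the first two packets to a destination are just routed normally and the third one triggers the NHRP request.
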
Normally, NHRP follows the information in the routing table in tracking down the NBMA cloud egress router. If you wish to statically configure the Next Hop Server (NHS) for certain IP addresses, use:

ip nhrp nhs nhs-address [network [netmask]]

If you have PVC's, you may not need the next step. You can statically configure a router (interface) with NHRP data layer address:

ip nhrp map ipaddress data-layer-address

To configure authentication (to prevent accidental NHRP peering), use the following interface command:

ip nhrp authentication string

The string is up to eight (8) characters.

We can control the maximum rate at which NHRP packets will be sent. The default is 5 packets every 10 seconds (per interface). If you wish to change this on a certain interface, use the interface command:

ip nhrp max-send packets every time-interval
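
To see what a limit of 5 packets every 10 seconds means in practice, here is a small Python model. It assumes a simple fixed window that restarts with the first packet after the previous window ends; I don't know how IOS actually measures the interval, so treat this as one plausible reading. Time is passed in explicitly to keep the sketch deterministic.

```python
class MaxSend:
    """Allow at most `packets` NHRP packets per `interval` seconds."""

    def __init__(self, packets: int = 5, interval: float = 10.0):
        self.packets = packets
        self.interval = interval
        self.window_start = None  # time the current window began
        self.sent = 0             # packets sent in the current window

    def allow(self, now: float) -> bool:
        """Return True if a packet may be sent at time `now` (in seconds)."""
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now  # start a fresh window
            self.sent = 0
        if self.sent < self.packets:
            self.sent += 1
            return True
        return False
```

With the defaults, a sixth request inside the same 10 seconds is refused, which is why a triggering packet should be forwarded normally rather than held for a reply that may never be requested.
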
To help in troubleshooting, NHRP includes IP Route Record option in queries and replies. To disable this, configure an interface with:

no ip nhrp record

You can also configure the address an NHRP server supplies as the responder address, and the length of time the NHRP information is advertised as valid for (default is 7200 seconds, or 2 hours):

ip nhrp responder interface-type interface-number
ip nhrp holdtime seconds-for-positive-response [seconds-for-negative-response]

These are interface commands.
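
The effect of the holdtime on a router that receives the answer can be sketched as an expiring cache. This Python model is illustrative only: the class is hypothetical, and giving negative responses the same 7200-second default as positive ones is my assumption.

```python
class HoldtimeCache:
    """Cache of NHRP answers that expire after a holdtime."""

    def __init__(self, positive: float = 7200.0, negative: float = 7200.0):
        self.positive = positive  # lifetime of a successful resolution
        self.negative = negative  # lifetime of a "no such mapping" answer
        self.entries = {}         # dest -> (NBMA address or None, expiry time)

    def store(self, dest: str, nbma, now: float) -> None:
        """Record an answer; nbma=None marks a negative response."""
        ttl = self.negative if nbma is None else self.positive
        self.entries[dest] = (nbma, now + ttl)

    def get(self, dest: str, now: float):
        """Return ('hit', nbma), ('negative', None) or ('miss', None)."""
        entry = self.entries.get(dest)
        if entry is None or now >= entry[1]:
            self.entries.pop(dest, None)  # drop anything that has expired
            return ("miss", None)
        nbma, _ = entry
        return ("hit", nbma) if nbma is not None else ("negative", None)
```

A shorter negative holdtime lets a router retry sooner for a destination that had no shortcut the first time, while the longer positive holdtime avoids re-asking for mappings that are still good.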

Oh, you were wondering about NHRP for IPX? Substitute "ipx" for "ip" above. That's it!

NHRP Show and Debug Commands

The command

show ip nhrp [dynamic | static] [interface-type interface-number]

shows the NHRP cache. The options dynamic and static limit the display to dynamic or static cache entries. An interface may optionally be specified.

To look at NHRP traffic information:

show ip nhrp traffic

To clear a static cache entry:

no ip nhrp map ...

To clear dynamic cache entries:

clear ip nhrp

If there are problems with NHRP, you might try:

debug nhrp

or

debug nhrp options

or

debug nhrp rate

Question for the reader: are any of these likely to be verbose debug commands?

Summary

I'm still a big fan of Tag Switching for the WAN (see the article and links from 4 months ago, below). Concerning Tag Switching, there seems to be a bit of a question as to whether WAN service providers will want you and your routing protocol sharing routing with their other customers, or dynamically setting up VC's in their switches (and we're not talking SVC's here, we're talking switch by switch VC addition, where one bug might disrupt their entire business). If you're a big ISP, that's the business you're already in within your own network, and so Tag Switching is great. But for a Frame Relay or ATM service provider, this is the moral counterpart to an ISP blindly accepting any and all routes from their customers: a recipe for chaos!

So NHRP might be what we end up doing in the WAN to use SVC's. And even if not, MPOA may be what we end up doing on the campus, so it too seems worth a look. I'll try to discuss MPOA in a future article -- but no promises!

Links

Various documents relating to NHRP can be found at http://www.ietf.cnri.reston.va.us/ids.by.wg/ion.html

My words and links on Tag Switching are at http://www.netcraftsmen.net/welcher/papers/tagswitc.htm

For MPOA, see the list of specifications at http://atmforum.com/atmforum/specs/approved.html


Dr. Peter J. Welcher (CCIE #1773, CCSI #94014) is a Senior Consultant with Chesapeake NetCraftsmen. NetCraftsmen is a high-end consulting firm and Cisco Premier Partner dedicated to quality consulting and knowledge transfer. NetCraftsmen has nine CCIE's, with expertise including large network high-availability routing/switching and design, VoIP, QoS, MPLS, network management, security, IP multicast, and other areas. See http://www.netcraftsmen.net for more information about NetCraftsmen. Pete's links start at http://www.netcraftsmen.net/welcher . New articles will be posted under the Articles link. Questions, suggestions for articles, etc. can be sent to pjw@netcraftsmen.net . 



12/97
Copyright 1997, Peter J. Welcher