MPLS VPN's From the Customer Side

   
  Peter J. Welcher
 
   
 

Introduction

Marty and I hope you enjoyed the previous two articles on Wireless LAN technology and security. Past articles can be found under the links at http://www.netcraftsmen.net/welcher .

This month I'd like to discuss what MPLS VPN's look like from the customer side. We've got consulting customers doing this now. And I've recently been noticing pundits in the trade magazines saying that ATM is the old approach now (still popular but aging) and that MPLS VPN and Managed IP services are the New Thing. Providers include AT&T (who's been doing MPLS VPN for quite a while), Equant (international!), Schlumberger (international!), Cable & Wireless (betting the company on IP services and dumping their Frame Relay and ATM business), and perhaps Sprint now (recent announcements).

Why? Well, note that in the ATM world, there all of a sudden aren't that many choices with good balance sheets. There's your Local Exchange Carrier, AT&T. Hmm, Sprint perhaps. Then MCI and any others that are financially challenged. Did prices just go up? Oh, and ATM interfaces aren't cheap. IMA is a nice alternative to DS-3, but carrier support for IMA appears to be mixed.

If MPLS VPN and other IP VPN services are more competitive than ATM, then maybe they'll be more financially attractive. And if you can outsource at least some of your routing with Managed IP, then labor costs may go down. Or you don't have as big a recruiting and retaining problem, perhaps. Or you have time to do some of that network management or security work you've been trying to get around to.

We've also seen a fairly new real-world routing problem for some MPLS VPN customers. My question for you is, how would you solve this? We'll go into the issue at the end of the article. It's not anything to be too concerned about, but knowing about it may affect your design decisions if you're implementing MPLS VPN. Thanks to our Kevin Hynes in New York for asking for my (and others') opinion on this interesting problem.

These articles usually start with links to prior articles. I'll note here that I've done several prior articles on MPLS, and I've also posted a PDF of MPLS seminar slides. I've put the links to them at the bottom of this article, since you may not need to read them! The point is, we don't all need MPLS in our networks. Many enterprise networks may well be consumers of MPLS-based services, which is very much simpler than running MPLS and acting as an MPLS Service Provider. As a consumer of MPLS services, we perhaps might want to know some BGP (see below), but we might not even need that. Details below.

One other topic we'll touch upon along the way is backup links. They've become a real design challenge as speeds increase. What do you do for redundancy?

When Would I Want MPLS in My Network?

If you're a big enterprise, governmental entity, or perhaps university, MPLS VPN's might be in your present or your future. MPLS VPN's allow simple control of routing information in a shared network. One of the things to look for: do you have Business Units or Departments that need some degree of isolation for security reasons? Do they or do your offices have differing patterns of external connectivity requirements? Then MPLS VPN's are one tool for tackling this. MPLS VPN's get you out of the business of doing distribute-lists and route-maps with standard BGP communities and confederations and AS Path filtering. Of course, some of that complexity just moves elsewhere. But a lot of the changes are something you do once in your core, then don't have to tinker with again. And much of the VPN setup is so routine there are tools like Cisco VPNSC to do it for you.

MPLS VPN's also mean the core network doesn't need to see "customer" (department) routes, and that the "customer" networks can even have overlapping private addressing (as long as the duplicated subnets don't have to communicate with each other). So if you're in a big company coping with post-merger network-10-itis (the two merged companies both used network 10), perhaps MPLS VPN could help in a rapid consolidation of network services. Perhaps. I can't say it's The Solution, just that it might be part of A Solution.
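
To make the overlapping-address point concrete, here is a minimal Python sketch of the idea. It is not a model of any router's actual implementation. The department names, next-hop labels, and the `vrf_tables` and `lookup` names are all hypothetical; the only thing taken from the text is that each "customer" gets its own routing table, so two of them can both use the same subnet of network 10.

```python
import ipaddress

# Hypothetical per-customer (per-VRF) routing tables. The same prefix
# appears in both, which is fine because lookups never cross tables.
vrf_tables = {
    "dept_red": {ipaddress.ip_network("10.1.1.0/24"): "PE1 -> CE-red"},
    "dept_blue": {ipaddress.ip_network("10.1.1.0/24"): "PE2 -> CE-blue"},
}


def lookup(vrf, destination):
    """Return the next hop for destination using only that VRF's table."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in vrf_tables[vrf] if addr in net]
    if not matches:
        return None
    # Longest matching prefix wins within the chosen table.
    best = max(matches, key=lambda net: net.prefixlen)
    return vrf_tables[vrf][best]


red_hop = lookup("dept_red", "10.1.1.5")
blue_hop = lookup("dept_blue", "10.1.1.5")
print(red_hop)
print(blue_hop)
```

The same destination address resolves to a different next hop depending on which table is consulted, which is also why the duplicated subnets cannot talk to each other directly.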

Caution: if you have a switched network using 6500's and MSFC's, well, MPLS for all interfaces (not just OSM's) is coming Real Soon Now (don't hold your breath).

What About the Rest of Us?

All the rest of you, you probably don't need to run MPLS in your network. At least, not for this. If you were looking for an excuse to go learn MPLS, I hate to disappoint you, but that's the way it is.

Where MPLS might enter your life is if you buy MPLS-based VPN services from a provider. What you need to know about MPLS is what it does for you, to make you an educated consumer of MPLS services. And so I thought in this article we'd pursue that topic in depth.

Terminology

First a small amount of terminology. When we say "customer", we mean you, the customer of the MPLS VPN or other IP Service Provider. The cloud A in the diagram is the Service Provider's network. A Provider Edge Router is one of the PE routers in the diagram, the router at the provider that connects to your network (and quite possibly someone else's as well). Routers R1 through R4 are Customer Edge or CE routers.

One Cloud Image

In this article we're going to focus on MPLS VPN, because with RFC 2547 MPLS VPN's, there is one real novelty: the Service Provider is providing the customer with routing. (If they offer instead Layer 2 MPLS VPN's, your Ethernet or Frame Relay or ATM is being tunneled through the MPLS network, and your routing packets are just one more packet tunneled as Layer 2 inside Layer 3. L2TP and GRE tunneling may soon be in use providing the same sort of services for you, per recent Cisco announcements, Unified VPN's.)

PE-CE Routing

In the Cisco implementation of MPLS VPN, the Service Provider has 4 choices for PE-CE routing. Which of these they offer the customer is of course up to them and how they view their service offering.

The four PE-CE protocols are:
  • static routing
  • RIPv2
  • OSPF
  • eBGP
There may be a fifth: Cisco has announced that it will support EIGRP as a PE-CE routing protocol. See Networkers 2002 talk RST-253 (your SE can get this for you, it's also posted on CCO but I don't feel free to put the URL in print). The feature is mentioned in the new feature document for 12.0(22) S but not mentioned in the list of features at the top, and the details link is broken. So the feature may have been pulled or may not yet be officially supported. See also http://www.cisco.com/univercd/cc/td/doc/product/software/ios120/relnote/7000fam/rn120s.htm#64834 .

(Added 12/30/2002) The detailed link re EIGRP for PE-CE is now available. See http://www.cisco.com/univercd/cc/td/doc/product/software/ios120/120newft/120limit/120s/120s22/seigpece.htm .

Inside the Service Provider (SP) network, the SP is running MBGP, Multiprotocol BGP. That's part of how they isolate your routes from the routes for their other customers. The only reason you might care is that the overall end-to-end convergence is that of BGP. Timers can be tweaked but generally "speedy" and "BGP" are not words you commonly see in the same sentence. BGP scales well, but part of the reason is that it does things at a measured controlled pace.

The static approach works by SP creating customer-specific ("VRF") static routes and redistributing them into MBGP. Your routers are set up with a static default pointing at the PE as next hop. Note how simple this makes the CE router configurations: no dynamic routing, just addresses on interfaces, manageability features, hostname, default route, things like that. Nice and easy for a SP managed service!

This approach works well if you have small sites (say, one subnet each), and not too much change. If your network is large or changing rapidly, then you might want a more dynamic approach to routing.

The next choice is RIPv2. I don't like recommending RIP for anything, but it might be considered here. Yes it is slow to converge, but the SP's MBGP might also be slow. And if there are no alternative routes, it's not like you're getting there by another path.

External BGP, eBGP, is also nice. If you're currently doing it with your ISP, you'll feel right at home. Not for long though, because you might end up running eBGP at every site connecting to the MPLS SP. Hmm, that doesn't sound simple all of a sudden ...

OSPF is nice and fast to converge. There are special features that mean that the SP MPLS network can appear to the customer as part of "SuperBackbone area 0". Although routes are carried across the SP cloud in MBGP, they are re-injected into OSPF by the PE router's OSPF as internal inter-area routes. The first drawback here is that the current Cisco implementation separates VPN customers by using a different OSPF process for each, max of 28 per router. So your SP may not volunteer that this option is available. (Buying 3700's or 7200's to connect OSPF MPLS VPN customers would seem to be a cost-effective answer.) The second drawback is that yes the SP cloud looks like part of your OSPF backbone, but it has all the convergence speed of BGP unless tweaked.

By the way, the new EIGRP for PE-CE support will allow basically the same thing: carrying EIGRP routes and metrics through the MBGP Service Provider routing in the MPLS VPN network, then back out to the CE router as internal EIGRP routes.

We'll come back to the point about re-injecting the routes into OSPF or EIGRP as internal routes, when we're in a position to see why that's a Good Thing.

Backup Paths

Let's detour slightly into backup links. You'll see the connection of topics shortly.

Have you been noticing that speeds are going up again? What are you using to back up your primary links? If your remote sites have been pushing into the T1 or solid fraction of T1 range, you've probably noticed that ISDN dial backup is looking mighty expensive. PRI cards aren't cheap, and neither are PRI connections. One alternative is single carrier dual star, i.e. two sets of FR or ATM links to different HQ routers. All of a sudden people are moving one of the HQ routers for geographic diversity to two hub sites that are "reasonably far apart". Except that when you do that, you generally end up wanting a big pipe between the two hub sites, which also adds cost.

The other approach is dual carrier. This is attractive due to all the financial surprises lately. It covers you in case your primary carrier comes out with a nasty surprise and little time for you to replace them at your 50, or 200, or 1000 sites. (The definition of "little time" varies with size.) Some have noticed that two major carriers had FR outages of 2-3 and 10 days, and that most businesses cannot afford to be without a WAN network for 2-10 days. The problem here is cost.

We've been reading a good bit about how cheap it is to connect by VPN. Usually this is the IPsec die-hards thinking they've got the only possible kind of VPN. And yes, VPN across the Internet is cheap. But ... TANSTAAFL (There Ain't No Such Thing As A Free Lunch), as Robert Heinlein the science fiction writer used to remind us. IPsec across the Internet can be rather slow, with no business-grade SLA's. So not only is it slow but whining to the ISP isn't going to change things. If you use multiple ISP's forget any SLA or control. Even with one, you might get good service by staying in their network -- but good luck getting an SLA out of them. From what I am hearing, Cogent has a nice 100 Mbps to the Internet for $1000/month service in certain cities. But word has it they're fairly focussed on keeping it simple, no frills service.

In short, various customers have tried IPsec VPN and found it lacking. It can be ok for email. File services and voice or video across IPsec VPN -- good luck, you'll probably get very mixed results.

Now a crucial design question to ask yourself is: if it doesn't work well enough to be my primary network, do I really want it for my backup? The answer may vary depending on your applications, needs, and budget. One theory is that something is better than nothing. Another is that if your primary applications can tolerate delay, IPsec VPN may make real sense for a backup path between continents. With a good ISP having a solid backbone and SLA's, IPsec VPN across that one ISP may make a lot of sense.

Suppose you've got ATM, or GRE tunnels, or IPsec VPN, or some private network. Suppose you're migrating to MPLS VPN services. What's your first thought? Probably something along the lines of "how do I keep my old network with this, I want to use it as backup path in case this new MPLS stuff (or the carrier) turns out to have problems".

I hope I've convinced you that it's quite plausible that your network might look like the following diagram. In this diagram, cloud A is the MPLS VPN provider network, and cloud B is whatever you had before (private FR or ATM, leased lines, GRE tunnels, IPsec VPN, etc.).

Two Cloud Image

If you're willing to convert to OSPF and the MPLS carrier is providing OSPF PE-CE service, there is no problem (other than checking the carrier's convergence). If you're willing to go with the static/default approach or RIPv2, no problem.

Suppose however the carrier is only doing static/default or eBGP. Default routing isn't going to work if cloud B has dynamic routing going in it. That's because you'll have all sorts of specific subnets in the routing table from the B side, and the default to a next hop PE in A will never get used. If you insist on static/default, I guess long lists of static routes on the R1-R4 routers can be made to work. When the PE next hop in A becomes unreachable, the static route will no longer over-ride the dynamic route from B. This is a bit ugly and high-maintenance, but if your network is small or unchanging, it's good enough.
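
The reasoning in the last paragraph can be sketched in a few lines of Python. This is a simplified model, assuming only the usual rule that the longest matching prefix wins and that administrative distance breaks ties between sources for the same prefix. The prefixes, labels, and the `route` and `best_route` helpers are hypothetical, and the AD values are the Cisco defaults for static (1) and internal EIGRP (90).

```python
import ipaddress


def route(prefix, source, ad, via):
    """Build one hypothetical routing-table entry."""
    return {"prefix": ipaddress.ip_network(prefix), "source": source,
            "ad": ad, "via": via}


def best_route(table, destination):
    """Longest prefix wins; for the same prefix, the lowest AD wins."""
    addr = ipaddress.ip_address(destination)
    candidates = [r for r in table if addr in r["prefix"]]
    if not candidates:
        return None
    return max(candidates, key=lambda r: (r["prefix"].prefixlen, -r["ad"]))


default_a = route("0.0.0.0/0", "static", 1, "PE in cloud A")
specific_a = route("10.2.0.0/16", "static", 1, "PE in cloud A")
dynamic_b = route("10.2.0.0/16", "EIGRP internal", 90, "cloud B")

# Default via A plus a specific dynamic route via B: B always wins.
default_only = best_route([default_a, dynamic_b], "10.2.3.4")

# Add a matching specific static via A: the static overrides B.
with_static = best_route([default_a, specific_a, dynamic_b], "10.2.3.4")

# PE next hop unreachable, so the statics drop out: B takes over again.
pe_down = best_route([dynamic_b], "10.2.3.4")

print(default_only["via"], "|", with_static["via"], "|", pe_down["via"])
```

The second case is the "long lists of static routes" workaround: it only works because every prefix learned from B needs a matching static toward A.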

That leaves us with the carrier doing eBGP. It looks like standard eBGP to you. Suppose your routers are running OSPF or EIGRP on the links in the B network. Then eBGP has the better (lower) administrative distance (20 by default on Cisco routers, versus 90 for internal EIGRP and 110 for OSPF), and things still work out ok. You may have a bit more BGP in your network than you really wanted, but it all works reasonably cleanly and well. At least, as long as there is a single router connected to both clouds -- add more, and things get more challenging (see also below).

There is one eBGP refinement you might need here. If your routers need to be a single (registered or unregistered) BGP AS for some reason, then think about AS Path for a moment. Your routes come out of R1 with AS 1 say, then come back into R2 with the SP AS prepended, so say the AS Path is 2 1. R2 does the usual eBGP thing of checking AS Path, sees its own AS number in the path, concludes there is looping of routing information, and discards the prefixes in question. There's a Cisco feature where the SP can do AS Path rewriting before sending routes to you. So the AS Path gets turned into 2 2 instead of 2 1 and your router R2 becomes willing to learn the prefix. This even works if you're doing AS prepending, since the leading copies of your AS get converted: 1 1 1 sent from R1 to PE1 turns into 2 2 2 2 as it comes back to R2.
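
Here is a small Python sketch of the AS Path behavior just described, using the same AS numbers (customer AS 1, SP AS 2). It is a toy model of the logic, not of any router's code; the function names `ebgp_accepts` and `pe_advertise` are made up for illustration.

```python
def ebgp_accepts(local_as, as_path):
    """Usual eBGP loop check: reject a path that already contains our AS."""
    return local_as not in as_path


def pe_advertise(provider_as, as_path, customer_as=None, rewrite=False):
    """Model the PE sending a route to the CE.

    With rewrite=True, every copy of the customer AS in the path is
    replaced by the provider AS before the provider AS is prepended.
    """
    if rewrite:
        as_path = [provider_as if asn == customer_as else asn
                   for asn in as_path]
    return [provider_as] + as_path


plain = pe_advertise(2, [1])
rewritten = pe_advertise(2, [1], customer_as=1, rewrite=True)
prepended = pe_advertise(2, [1, 1, 1], customer_as=1, rewrite=True)

print(plain, ebgp_accepts(1, plain))          # [2, 1] False
print(rewritten, ebgp_accepts(1, rewritten))  # [2, 2] True
print(prepended, ebgp_accepts(1, prepended))  # [2, 2, 2, 2] True
```

Note the side effect: once the path is rewritten, the AS Path loop check no longer protects you between your own sites, which is one more reason to think about route filtering when there is a second path between them.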

So What's the Issue?

We've seen that PE-CE routing is relatively clean if you keep things simple.

Suppose now your MPLS VPN vendor places routers they manage at your premises. Maybe your company is buying off a managed services contract and you wish to just use it for connectivity, with your routers in parallel for control. Maybe that's the only way the vendor will sell you MPLS VPN connectivity. Maybe the services were bought by another (central?) part of the organization. Whatever the reason, your network then looks like the following diagram. The gray routers are controlled by the SP, you control the blue routers.

Two routers at each site

If ROTN has EIGRP in it, then you'll probably ask your provider to send you EIGRP from the gray routers. The problem right now is that the EIGRP routes learned through the A side (the MPLS VPN) are external EIGRP routes (default administrative distance 170), so the blue routers prefer the B side ROTN routes, which are internal EIGRP (default administrative distance 90). If you run BGP from CE to blue router(s) and make it iBGP, you end up with iBGP routes (default administrative distance 200) competing with internal EIGRP routes, and again the blue routers prefer the B side routes. And since this is administrative distance, tweaking the metrics can't help. (Recall that Administrative Distance is used for each prefix to decide which routing protocol to use for routing of that prefix, and metric only plays a role within that one protocol in determining its best route.)
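
A short Python sketch may help show why metrics can't rescue you here. It assumes the Cisco default administrative distances and a simplified selection rule for a single prefix; the `preferred` function, the metric values, and the "via" labels are invented for illustration.

```python
# Cisco default administrative distances (lower is preferred).
DEFAULT_AD = {
    "static": 1,
    "eBGP": 20,
    "EIGRP internal": 90,
    "OSPF": 110,
    "RIP": 120,
    "EIGRP external": 170,
    "iBGP": 200,
}


def preferred(candidates):
    """Pick the route for one prefix: lowest AD first.

    The metric is only consulted when the AD ties, that is, when the
    candidates come from the same kind of source.
    """
    return min(candidates,
               key=lambda r: (DEFAULT_AD[r["source"]], r["metric"]))


b_side = {"source": "EIGRP internal", "metric": 9_000_000, "via": "ROTN (B)"}
a_external = {"source": "EIGRP external", "metric": 1_000, "via": "MPLS VPN (A)"}
a_ibgp = {"source": "iBGP", "metric": 0, "via": "MPLS VPN (A)"}
a_ebgp = {"source": "eBGP", "metric": 0, "via": "MPLS VPN (A)"}

print(preferred([a_external, b_side])["via"])  # ROTN (B), despite the worse metric
print(preferred([a_ibgp, b_side])["via"])      # ROTN (B)
print(preferred([a_ebgp, b_side])["via"])      # MPLS VPN (A)
```

Only the eBGP case puts the MPLS VPN side ahead by default, which is why eBGP from CE to the blue routers shows up in the list of workarounds.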

Here's a list of some ways around this problem:
  • Switch the network to OSPF
  • Run eBGP from CE to R routers
  • Change the EIGRP administrative distance so external routes are preferred over internal (router distance command)
  • Use the router distance command neighbor variant to make the routes learned from ROTN neighbors less attractive
  • Turn off EIGRP on the ROTN side and use floating static routes
  • (Added 12/30/2002) Get your MPLS VPN Provider to upgrade to 12.0(22) S or newer code and support EIGRP as MPLS PE-CE protocol.
You might think about which is best. OSPF is probably the cleanest. Running eBGP is a bit baroque on the CE routers (two eBGP sessions, one on each side) but should work as long as all the routers connected to the B side also speak eBGP to a CE router. EIGRP administrative distance (AD) changes aren't terrible, but when messing around with external route AD, one needs to think about any other redistributed routes, and about surprises that come from making them rather attractive. The neighbor distance variant is a bit more controlled in this regard. One might cautiously improve the external route AD to something like 100 or so (a little worse than internal EIGRP routes, which default to 90), then push the routes learned from the ROTN neighbors out to something like AD 120 or so.

By the way, the distance command, neighbor variant, only works with internal EIGRP routes, not with external ones (not documented, found through testing).
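
The combined distance adjustment can be checked with a small Python model. This is only a sketch of the selection logic, using the example values from the text (external AD 100, ROTN neighbors at AD 120); the function names, neighbor names, and route dictionaries are hypothetical, and the per-neighbor override is applied to internal routes only, matching the behavior noted above.

```python
def effective_ad(route, internal_ad=90, external_ad=170, neighbor_ad=None):
    """Return the AD a route would get under the given distance settings.

    neighbor_ad maps a neighbor name to an AD and is applied to internal
    routes only.
    """
    neighbor_ad = neighbor_ad or {}
    if route["kind"] == "internal":
        return neighbor_ad.get(route["neighbor"], internal_ad)
    return external_ad


def pick(routes, **settings):
    """Choose the route with the lowest effective AD."""
    return min(routes, key=lambda r: effective_ad(r, **settings))


a_side = {"kind": "external", "neighbor": "gray-ce", "via": "MPLS VPN (A)"}
b_side = {"kind": "internal", "neighbor": "rotn-peer", "via": "ROTN (B)"}

defaults = pick([a_side, b_side])
external_only = pick([a_side, b_side], external_ad=100)
tuned = pick([a_side, b_side], external_ad=100,
             neighbor_ad={"rotn-peer": 120})

print(defaults["via"], "|", external_only["via"], "|", tuned["via"])
```

Changing the external AD alone is not enough at these values; it is the pair of changes together that flips the preference toward the MPLS VPN side.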
 
I trust you've spotted the other issue here. You really want to impose manual split horizoning, so that A side routes don't get passed through the B side and vice versa. If you don't control this, then you might see some odd behavior when a link fails, such as site C losing the MPLS VPN link, and packets re-routing to C via MPLS to another site then ROTN from there to C. EIGRP allows tagging of routes with a simple route map, so it's easy to tag routes as you import them into EIGRP, and then filter based on tag to prevent them from getting advertised back out on the MPLS side (if there's redistribution going on there into BGP). On the BGP side, you might use a BGP standard community as a tag for all the redistributed prefixes. You do need to configure BGP to advertise standard communities, and you'd have to work with the MPLS VPN provider to make sure they were doing the same. (If they can't or won't, then your task just got a bit harder.)
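
The tag-and-filter idea works the same way whatever mechanism carries the tag (an EIGRP route tag or a BGP standard community). A minimal Python sketch, assuming a made-up tag value and made-up prefixes and function names:

```python
MPLS_TAG = 100  # hypothetical tag value marking "learned from the A side"


def import_from_mpls(prefixes):
    """Tag every prefix as it is imported from the MPLS VPN side."""
    return [{"prefix": p, "tag": MPLS_TAG} for p in prefixes]


def advertise_to_mpls(routes):
    """Manual split horizon: never send A side routes back to the A side."""
    return [r["prefix"] for r in routes if r.get("tag") != MPLS_TAG]


# Routes this site originates itself carry no tag.
local_routes = [{"prefix": "10.3.0.0/16", "tag": None}]
learned = import_from_mpls(["10.1.0.0/16", "10.2.0.0/16"])

routing_table = local_routes + learned
sent_back = advertise_to_mpls(routing_table)
print(sent_back)
```

Only the locally originated prefix goes back out on the MPLS side. The mirror-image filter would be needed in the other direction to keep B side routes from leaking through A.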

Conclusion

Welcome to Derek, a new (and old) co-worker in the DC area. Derek joins us from Juniper, where he taught a number of providers about the J approach to MPLS. Derek is CCIE R&S #6146, with strengths in routing and switching, also JNCIE #42, JNAT (Juniper trainer), CCNP, CCDP, JNCIS. Inactive certs include CCSI, CNE, CNX, and CCSE (CheckPoint).

For those who think it would be helpful to know more about MPLS, here are the links to my previous articles on the topic. They in turn have other helpful links in them.

Seminar slide PDF's are at

Dr. Peter J. Welcher (CCIE #1773, CCSI #94014) is a Senior Consultant with Chesapeake NetCraftsmen. NetCraftsmen is a high-end consulting firm and Cisco Premier Partner dedicated to quality consulting and knowledge transfer. NetCraftsmen has nine CCIE's, with expertise including large network high-availability routing/switching and design, VoIP, QoS, MPLS, network management, security, IP multicast, and other areas. See http://www.netcraftsmen.net for more information about NetCraftsmen. Pete's links start at http://www.netcraftsmen.net/welcher . New articles will be posted under the Articles link. Questions, suggestions for articles, etc. can be sent to pjw@netcraftsmen.net .


10/2/2002
Copyright (C)  2002,  Peter J. Welcher