
BGP4 Case Studies/Tutorial

Sam Halabi - cisco Systems

The purpose of this paper is to introduce the reader to the latest in BGP4 terminology and design issues. It is targeted to the novice as well as the experienced user. For any clarification or comments please send e-mail to [email protected]

1995 Cisco Systems Inc.
1/26/96 - Rev: A1.2
Contents

1.0 Introduction
    1.1 How does BGP work
    1.2 What are peers (neighbors)
    1.3 Information exchange between peers
2.0 EBGP and IBGP
3.0 Enabling BGP routing
    3.1 BGP Neighbors/Peers
4.0 BGP and Loopback interfaces
5.0 EBGP Multihop
    5.1 EBGP Multihop (Load Balancing)
6.0 Route Maps
7.0 Network command
    7.1 Redistribution
    7.2 Static routes and redistribution
8.0 Internal BGP
9.0 The BGP decision algorithm
10.0 As path Attribute
11.0 Origin Attribute
12.0 BGP Nexthop Attribute
    12.1 BGP Nexthop (Multiaccess Networks)
    12.2 BGP Nexthop (NBMA)
    12.3 Next-hop-self
13.0 BGP Backdoor
14.0 Synchronization
    14.1 Disabling synchronization
15.0 Weight Attribute
16.0 Local Preference Attribute
17.0 Metric Attribute
18.0 Community Attribute
19.0 BGP Filtering
    19.1 Route Filtering
    19.2 Path Filtering
        19.2.1 AS-Regular Expression
    19.3 BGP Community Filtering
20.0 BGP Neighbors and Route maps
    20.1 Use of set as-path prepend
    20.2 BGP Peer Groups
21.0 CIDR and Aggregate Addresses
    21.1 Aggregate Commands
    21.2 CIDR example 1
    21.3 CIDR example 2 (as-set)
22.0 BGP Confederation
23.0 Route Reflectors
    23.1 Multiple RRs within a cluster
    23.2 RR and conventional BGP speakers
    23.3 Avoiding looping of routing information
24.0 Route Flap Dampening
25.0 How BGP selects a Path
26.0 Practical design example
1.0 Introduction

The Border Gateway Protocol (BGP), defined in RFC 1771, allows you to create loop free interdomain routing between autonomous systems. An autonomous system is a set of routers under a single technical administration. Routers in an AS can use multiple interior gateway protocols to exchange routing information inside the AS and an exterior gateway protocol to route packets outside the AS.

1.1 How does BGP work

BGP uses TCP as its transport protocol (port 179). Two BGP speaking routers form a TCP connection between one another (peer routers) and exchange messages to open and confirm the connection parameters.

BGP routers exchange network reachability information. This information is mainly an indication of the full paths (BGP AS numbers) that a route should take in order to reach the destination network. It helps in constructing a graph of ASs that is loop free and where routing policies can be applied in order to enforce some restrictions on the routing behavior.

1.2 What are peers (neighbors)

Any two routers that have formed a TCP connection in order to exchange BGP routing information are called peers; they are also called neighbors.

1.3 Information exchange between peers

BGP peers will initially exchange their full BGP routing tables. From then on, incremental updates are sent as the routing table changes. BGP keeps a version number of the BGP table and it should be the same for all of its BGP peers. The version number will change whenever BGP updates the table due to some routing information changes. Keepalive packets are sent to ensure that the connection is alive between the BGP peers, and notification packets are sent in response to errors or special conditions.
2.0 EBGP and IBGP

If an Autonomous System has multiple BGP speakers, it could be used as a transit service for other ASs. As you see below, AS200 is a transit autonomous system for AS100 and AS300.

It is necessary to ensure reachability for networks within an AS before sending the information to other external ASs. This is done by a combination of Internal BGP peering between routers inside an AS and by redistributing BGP information to interior gateway protocols running in the AS.

As far as this paper is concerned, when BGP is running between routers belonging to two different ASs we will call it EBGP (Exterior BGP), and when BGP is running between routers in the same AS we will call it IBGP (Interior BGP).

[Figure: AS100, AS200 and AS300, with links labeled EBGP and IBGP]
3.0 Enabling BGP routing

Here are the steps needed to enable and configure BGP.

Let us assume you want to have two routers RTA and RTB talk BGP. In the first example RTA and RTB are in different autonomous systems and in the second example both routers belong to the same AS.

We start by defining the router process and the AS number that the routers belong to. The command used to enable BGP on a router is:

    router bgp autonomous-system

RTA#
    router bgp 100

RTB#
    router bgp 200

The above statements indicate that RTA is running BGP and belongs to AS100, and that RTB is running BGP and belongs to AS200.

The next step in the configuration process is to define BGP neighbors. The neighbor definition indicates which routers we are trying to talk to with BGP.

The next section will introduce you to what is involved in forming a valid peer connection.
3.1 BGP Neighbors/Peers

Two BGP routers become neighbors or peers once they establish a TCP connection between one another. The TCP connection is essential in order for the two peer routers to start exchanging routing updates.

Two BGP speaking routers trying to become neighbors will first bring up the TCP connection between one another and then send open messages in order to exchange values such as the AS number, the BGP version they are running (version 3 or 4), the BGP router ID and the keepalive hold time. After these values are confirmed and accepted the neighbor connection will be established. Any state other than established is an indication that the two routers did not become neighbors and hence the BGP updates will not be exchanged.

The neighbor command used to establish a TCP connection is:

    neighbor ip-address remote-as number

The remote-as number is the AS number of the router we are trying to connect to via BGP.

The ip-address is the next hop directly connected address for EBGP [1] and any IP address [2] on the other router for IBGP.

It is essential that the two IP addresses used in the neighbor command of the peer routers be able to reach one another. One sure way to verify reachability is an extended ping between the two IP addresses. The extended ping forces the pinging router to use as source the IP address specified in the neighbor command rather than the IP address of the interface the packet is going out from.

1. A special case (EBGP multihop) will be discussed later when the external BGP peers are not directly connected.
2. A special case for loopback interfaces is discussed later.
It is important to reset the neighbor connection whenever any BGP configuration changes are made, in order for the new parameters to take effect.

    clear ip bgp address    (where address is the neighbor address)
    clear ip bgp *          (clear all neighbor connections)

By default, BGP sessions begin using BGP Version 4 and negotiate downward to earlier versions if necessary. To prevent negotiations and force the BGP version used to communicate with a neighbor, perform the following task in router configuration mode:

    neighbor {ip-address | peer-group-name} version value

An example of the neighbor command configuration follows.

[Figure: RTB and RTC in AS200 running IBGP; addresses 175.220.1.2 and 175.220.212.1]

RTA#
    router bgp 100
    neighbor 129.213.1.1 remote-as 200

RTB#
    router bgp 200
    neighbor 129.213.1.2 remote-as 100
    neighbor 175.220.1.2 remote-as 200

RTC#
    router bgp 200
    neighbor 175.220.212.1 remote-as 200
In the above example RTA and RTB are running EBGP. RTB and RTC are running IBGP. The difference between EBGP and IBGP is manifested by having the remote-as number pointing to either an external or an internal AS.

Also, the EBGP peers are directly connected and the IBGP peers are not. IBGP routers do not have to be directly connected, as long as there is some IGP running that allows the two neighbors to reach one another.

The following is an example of the information that the command "sh ip bgp neighbors" will show you. Pay special attention to the BGP state: anything other than state Established indicates that the peers are not up. You should also note the BGP version (4), the remote router ID (the highest IP address on that box, or the highest loopback interface address if one exists) and the table version. The table version is the state of the table: any time new information comes in, the table version increases, and a version that keeps incrementing indicates that some route is flapping, causing routes to keep getting updated.

#SH IP BGP N
BGP neighbor is 129.213.1.1, remote AS 200, external link
BGP version 4, remote router ID 175.220.212.1
BGP state Established, table version 3, up for 0:10:59
Last read 0:00:29, hold time is 180, keepalive interval is 60 seconds
Minimum time between advertisement runs is 30 seconds
Received 2828 messages, 0 notifications, 0 in queue
Sent 2826 messages, 0 notifications, 0 in queue
Connections established 11; dropped 10

In the next section we will discuss special situations such as EBGP multihop and loopback addresses.
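The check on the BGP state described above can also be automated. The following is a minimal sketch in Python (not router configuration) that pulls the neighbor address, remote AS and state out of text shaped like the sample output shown earlier. The function name neighbor_summary and the regular expressions are assumptions made for illustration; real output may vary between software releases.

```python
import re

# First three lines of the sample "sh ip bgp neighbors" output from the text.
SAMPLE = """\
BGP neighbor is 129.213.1.1, remote AS 200, external link
BGP version 4, remote router ID 175.220.212.1
BGP state Established, table version 3, up for 0:10:59
"""

def neighbor_summary(text):
    """Extract neighbor address, remote AS and session state (hypothetical helper)."""
    neighbor = re.search(r"BGP neighbor is (\S+), remote AS (\d+)", text)
    state = re.search(r"BGP state (\w+)", text)
    if not neighbor or not state:
        raise ValueError("unrecognized output format")
    return {
        "neighbor": neighbor.group(1),
        "remote_as": int(neighbor.group(2)),
        "state": state.group(1),
        # Only the Established state means the peers are up.
        "up": state.group(1) == "Established",
    }

summary = neighbor_summary(SAMPLE)
print(summary)
```

Any state other than Established makes the "up" field False, which mirrors the rule given in the text.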
4.0 BGP and Loopback interfaces

Using a loopback interface to define neighbors is common with IBGP rather than EBGP. Normally the loopback interface is used to make sure that the IP address of the neighbor stays up and is independent of an interface that might be flaky. In the case of EBGP, most of the time the peer routers are directly connected and loopback does not apply.

If the IP address of a loopback interface is used in the neighbor command, some extra configuration needs to be done on the neighbor router. The neighbor router needs to tell BGP that it is using a loopback interface rather than a physical interface to initiate the BGP neighbor TCP connection. The command used to indicate a loopback interface is:

    neighbor ip-address update-source interface

The following example should illustrate the use of this command.

[Figure: RTA (Loopback Interface 1, 150.212.1.1) and RTB (190.225.11.1) in AS100]

RTA#
    router bgp 100
    neighbor 190.225.11.1 remote-as 100
    neighbor 190.225.11.1 update-source loopback 1

RTB#
    router bgp 100
    neighbor 150.212.1.1 remote-as 100

In the above example, RTA and RTB are running internal BGP inside autonomous system 100. RTB is using in its neighbor command the loopback interface of RTA (150.212.1.1); in this case RTA has to force BGP to use the loopback IP address as the source in the TCP neighbor connection. RTA will do so by adding the update-source configuration (neighbor 190.225.11.1 update-source loopback 1). This statement forces BGP to use the IP address of its loopback interface when talking to neighbor 190.225.11.1.
Note that RTA has used the physical interface IP address (190.225.11.1) of RTB as a neighbor, and that is why RTB does not need to do any special configuration.

5.0 EBGP Multihop

In some special cases, there could be a requirement for EBGP speakers to be not directly connected. In this case EBGP multihop is used to allow the neighbor connection to be established between two external peers that are not directly connected. The multihop is used only for external BGP and not for internal BGP. The following example gives a better illustration of EBGP multihop.

[Figure: RTA in AS100 and RTB (180.225.11.1) in AS300]

RTA#
    router bgp 100
    neighbor 180.225.11.1 remote-as 300
    neighbor 180.225.11.1 ebgp-multihop

RTB#
    router bgp 300
    neighbor 129.213.1.2 remote-as 100

RTA is indicating an external neighbor that is not directly connected. RTA needs to indicate that it will be using ebgp-multihop. On the other hand, RTB is indicating a neighbor that is directly connected (129.213.1.2) and that is why it does not need the ebgp-multihop command. Some IGP or static routing should also be configured in order to allow the neighbors that are not directly connected to reach one another.

The following example shows how to achieve load balancing with BGP in a particular case where we have EBGP over parallel lines.
5.1 EBGP Multihop (Load Balancing)

[Figure: RTA (AS 100, network 150.10.0.0, loopback 150.10.1.1) and RTB (AS 200, network 160.10.0.0, loopback 160.10.1.1) joined by two parallel links, 1.1.1.1 to 1.1.1.2 and 2.2.2.1 to 2.2.2.2]

RTA#
    int loopback 0
    ip address 150.10.1.1 255.255.255.0

    router bgp 100
    neighbor 160.10.1.1 remote-as 200
    neighbor 160.10.1.1 ebgp-multihop
    neighbor 160.10.1.1 update-source loopback 0
    network 150.10.0.0

    ip route 160.10.0.0 255.255.0.0 1.1.1.2
    ip route 160.10.0.0 255.255.0.0 2.2.2.2

RTB#
    int loopback 0
    ip address 160.10.1.1 255.255.255.0

    router bgp 200
    neighbor 150.10.1.1 remote-as 100
    neighbor 150.10.1.1 update-source loopback 0
    neighbor 150.10.1.1 ebgp-multihop
    network 160.10.0.0

    ip route 150.10.0.0 255.255.0.0 1.1.1.1
    ip route 150.10.0.0 255.255.0.0 2.2.2.1

The above example illustrates the use of loopback interfaces, update-source and ebgp-multihop. This is a workaround in order to achieve load balancing between two EBGP speakers over parallel serial lines. In normal situations, BGP will pick one of the lines to send packets on and load balancing would not take place. By introducing loopback interfaces, the next hop for EBGP will be the loopback interface. Static routes (it could be some IGP also) are used to introduce two equal cost paths to reach the destination. RTA will have two choices to reach next hop 160.10.1.1: one via 1.1.1.2 and the other one via 2.2.2.2, and the same for RTB.
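To see why the loopback next hop gives RTA two usable paths, the lookup can be sketched in Python. This is only an illustration of the idea, not how the router implements it: the table STATIC_ROUTES copies RTA's two static routes from the configuration above, and the helper name resolve_next_hop is hypothetical.

```python
import ipaddress

# RTA's static routes from the configuration: (destination prefix, gateway).
STATIC_ROUTES = [
    (ipaddress.ip_network("160.10.0.0/16"), "1.1.1.2"),
    (ipaddress.ip_network("160.10.0.0/16"), "2.2.2.2"),
]

def resolve_next_hop(bgp_next_hop, routes):
    """Return every gateway of the most specific routes covering the BGP next hop."""
    addr = ipaddress.ip_address(bgp_next_hop)
    covering = [(net, gw) for net, gw in routes if addr in net]
    if not covering:
        return []  # next hop unreachable
    best_len = max(net.prefixlen for net, _ in covering)
    return [gw for net, gw in covering if net.prefixlen == best_len]

# The EBGP next hop is RTB's loopback, reachable over both serial lines.
paths = resolve_next_hop("160.10.1.1", STATIC_ROUTES)
print(paths)  # ['1.1.1.2', '2.2.2.2']
```

Because both static routes have the same prefix length, both gateways are returned, which is the pair of equal cost paths the text describes.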
6.0 Route Maps

At this point I would like to introduce route maps because they will be used heavily with BGP. In the BGP context, a route map is a method used to control and modify routing information. This is done by defining conditions for redistributing routes from one routing protocol to another or controlling routing information when injected in and out of BGP. The format of the route map follows:

    route-map map-tag [[permit | deny] [sequence-number]]

The map-tag is just a name you give to the route map. Multiple instances of the same route map (same map-tag) can be defined. The sequence number is just an indication of the position a new route map instance is to have in the list of route map instances already configured with the same name.

For example, if I define two instances of the route map, let us call it MYMAP, the first instance will have a sequence number of 10, and the second will have a sequence number of 20.

    route-map MYMAP permit 10
    (first set of conditions goes here.)

    route-map MYMAP permit 20
    (second set of conditions goes here.)

When applying route map MYMAP to incoming or outgoing routes, the first set of conditions will be applied via instance 10. If the first set of conditions is not met then we proceed to a higher instance of the route map.

The conditions that we talked about are defined by the match and set configuration commands. Each route map will consist of a list of match and set configuration commands. The match specifies the match criteria and the set specifies the action to take if the criteria enforced by the match command are met.

For example, I could define a route map that checks outgoing updates and if there is a match for IP address 1.1.1.1 then the metric for that update will be set to 5.
The above can be illustrated by the following commands:

    match ip address 1.1.1.1
    set metric 5

Now, if the match criteria are met and we have a permit then the routes will be redistributed or controlled as specified by the set action and we break out of the list.

If the match criteria are met and we have a deny then the route will not be redistributed or controlled and we break out of the list.
If the match criteria are not met and we have a permit or deny then the next instance of the route map (instance 20 for example) will be checked, and so on until we either break out or finish all the instances of the route map. If we finish the list without a match then the route we are looking at will be neither accepted nor forwarded.

One restriction on route maps is that when they are used for filtering BGP updates (as we will see later) rather than when redistributing between protocols, you can NOT filter on the inbound when using a "match" on the ip address. Filtering on the outbound is OK.

The related commands for match are:

    match as-path
    match community
    match clns
    match interface
    match ip address
    match ip next-hop
    match ip route-source
    match metric
    match route-type
    match tag

The related commands for set include:

    set default interface
    set ip next-hop
    set ip default next-hop
    set next-hop
    set origin
    set tag
    set weight

Let's look at some route-map examples:
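Before the router configurations, the evaluation order described above (lowest sequence number first, stop at the first instance whose match criteria are met, drop the route if no instance matches) can be sketched in Python. This is a simplified model and not Cisco's implementation: the tuple layout and the function name apply_route_map are assumptions, and the deny instance for 2.2.2.2 and the test route 3.3.3.3 are hypothetical additions to the MYMAP example.

```python
def apply_route_map(instances, route):
    """Evaluate route-map instances in sequence order against one route.

    Each instance is (sequence, action, match, set_values), where match is
    a function route -> bool, or None when the instance has no match clause.
    Returns the (possibly modified) route, or None if the route is dropped.
    """
    for _seq, action, match, set_values in sorted(instances, key=lambda i: i[0]):
        if match is not None and not match(route):
            continue  # criteria not met: check the next instance
        if action == "deny":
            return None  # met with deny: break out, route not accepted
        return {**route, **set_values}  # met with permit: apply set, break out
    return None  # finished the list without a match: dropped

MYMAP = [
    (20, "deny", lambda r: r["prefix"] == "2.2.2.2", {}),
    (10, "permit", lambda r: r["prefix"] == "1.1.1.1", {"metric": 5}),
]

print(apply_route_map(MYMAP, {"prefix": "1.1.1.1"}))  # metric set to 5
print(apply_route_map(MYMAP, {"prefix": "2.2.2.2"}))  # None (denied)
print(apply_route_map(MYMAP, {"prefix": "3.3.3.3"}))  # None (no match)
```

The last call shows the default that is easy to forget: a route that matches no instance is dropped.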
[Figure: RTA and RTB in AS 100 (network 150.10.0.0) on link 3.3.3.3 / 3.3.3.4; RTA (2.2.2.2) connected to RTC (2.2.2.3) in AS 300 (network 170.10.0.0)]

Example 1:

Assume RTA and RTB are running RIP; RTA and RTC are running BGP. RTA is getting updates via BGP and redistributing them to RIP. If RTA wants to redistribute to RTB routes about 170.10.0.0 with a metric of 2 and all other routes with a metric of 5 then we might use the following configuration:

RTA#
    router rip
    network 3.0.0.0
    network 2.0.0.0
    network 150.10.0.0
    passive-interface Serial0
    redistribute bgp 100 route-map SETMETRIC

    router bgp 100
    neighbor 2.2.2.3 remote-as 300
    network 150.10.0.0

    route-map SETMETRIC permit 10
    match ip address 1
    set metric 2

    route-map SETMETRIC permit 20
    set metric 5

    access-list 1 permit 170.10.0.0 0.0.255.255
In the above example, if a route matches the IP address 170.10.0.0 it will have a metric of 2 and then we break out of the route map list. If there is no match then we go down the route map list, which says: set everything else to metric 5. It is always very important to ask what will happen to routes that do not match any of the match statements, because they will be dropped by default.

Example 2:

Suppose in the above example we did not want AS100 to accept updates about 170.10.0.0. Since route maps cannot be applied on the inbound when matching based on an ip address, we have to use an outbound route map on RTC:

RTC#
    router bgp 300
    network 170.10.0.0
    neighbor 2.2.2.2 remote-as 100
    neighbor 2.2.2.2 route-map STOPUPDATES out

    route-map STOPUPDATES permit 10
    match ip address 1

    access-list 1 deny 170.10.0.0 0.0.255.255
    access-list 1 permit 0.0.0.0 255.255.255.255

Now that you feel more comfortable with how to start BGP and how to define a neighbor, let's look at how to start exchanging network information.

There are multiple ways to send network information using BGP. I will go through these methods one by one.
7.0 Network command

The format of the network command follows:

    network network-number [mask network-mask]

The network command controls what networks are originated by this box. This is a different concept from what you are used to configuring with IGRP and RIP. With this command we are not trying to run BGP on a certain interface; rather, we are trying to indicate to BGP what networks it should originate from this box. The mask portion is used because BGP4 can handle subnetting and supernetting. A maximum of 200 entries of the network command are accepted.

The network command will work if the network you are trying to advertise is known to the router, whether connected, static or learned dynamically.

An example of the network command follows:

RTA#
    router bgp 1
    network 192.213.0.0 mask 255.255.0.0

    ip route 192.213.0.0 255.255.0.0 null 0

The above example indicates that router A will generate a network entry for 192.213.0.0/16. The /16 indicates that we are using a supernet of the class C address and we are advertising the first two octets (the first 16 bits).

Note that we need the static route to get the router to generate 192.213.0.0 because the static route will put a matching entry in the routing table.
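The dependency between the network command and the routing table can be sketched in Python. This is a simplified model under the assumption that a network statement takes effect only when the routing table holds an entry for exactly that prefix; the function name originated_networks and the component route 192.213.1.0/24 are hypothetical and chosen only for illustration.

```python
import ipaddress

def originated_networks(network_statements, routing_table):
    """Return the configured networks that have a matching routing table entry."""
    table = {ipaddress.ip_network(p) for p in routing_table}
    return [n for n in network_statements if ipaddress.ip_network(n) in table]

statements = ["192.213.0.0/16"]

# Without the static route to null 0, only a component route is known.
without_static = originated_networks(statements, ["192.213.1.0/24"])

# The static route adds the matching /16 entry, so BGP can originate it.
with_static = originated_networks(statements, ["192.213.1.0/24", "192.213.0.0/16"])

print(without_static)  # []
print(with_static)     # ['192.213.0.0/16']
```

The second call corresponds to the configuration above, where the static route to null 0 supplies the matching entry.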
7.1 Redistribution

The network command is one way to advertise your networks via BGP. Another way is to redistribute your IGP (IGRP, OSPF, RIP, EIGRP, etc.) into BGP. This sounds scary because now you are dumping all of your internal routes into BGP. Some of these routes might have been learned via BGP and you do not need to send them out again. Careful filtering should be applied to make sure you are sending to the internet only routes that you want to advertise and not everything you have. Let us look at the example below.

RTA is announcing 129.213.1.0 and RTC is announcing 175.220.0.0. Look at RTC's configuration:

[Figure: networks 175.220.0.0 and 129.213.1.0; AS200]

If you use a network command you will have:

RTC#
    router eigrp 10
    network 175.220.0.0
    redistribute bgp 200
    default-metric 1000 100 250 100 1500

    router bgp 200
    neighbor 1.1.1.1 remote-as 300
    network 175.220.0.0 mask 255.255.0.0
    (this will limit the networks originated by your AS to 175.220.0.0)

If you use redistribution instead you will have:
RTC#
    router eigrp 10
    network 175.220.0.0
    redistribute bgp 200
    default-metric 1000 100 250 100 1500

    router bgp 200
    neighbor 1.1.1.1 remote-as 300
    redistribute eigrp 10
    (eigrp will inject 129.213.1.0 again into BGP)

This will cause 129.213.1.0 to be originated by your AS. This is misleading because you are not the source of 129.213.1.0 but AS100 is. So you would have to use filters to prevent that network from being sourced out by your AS. The correct configuration would be:

RTC#
    router eigrp 10
    network 175.220.0.0
    redistribute bgp 200
    default-metric 1000 100 250 100 1500

    router bgp 200
    neighbor 1.1.1.1 remote-as 300
    neighbor 1.1.1.1 distribute-list 1 out
    redistribute eigrp 10

    access-list 1 permit 175.220.0.0 0.0.255.255

The access-list is used to control what networks are to be originated from AS200.
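The effect of access-list 1 on the outgoing updates can be sketched in Python. This is an illustrative model of wildcard-mask matching, not router code: a 0 bit in the wildcard means the bit must match and a 1 bit means it is ignored, and an address that matches no entry is denied. The helper names wildcard_match and access_list_permits are hypothetical.

```python
import ipaddress

def wildcard_match(address, base, wildcard):
    """True if address equals base on every bit where the wildcard bit is 0."""
    addr = int(ipaddress.IPv4Address(address))
    base_int = int(ipaddress.IPv4Address(base))
    care = ~int(ipaddress.IPv4Address(wildcard)) & 0xFFFFFFFF
    return (addr & care) == (base_int & care)

def access_list_permits(entries, address):
    """First matching entry wins; no match at all means an implicit deny."""
    for action, base, wildcard in entries:
        if wildcard_match(address, base, wildcard):
            return action == "permit"
    return False

# access-list 1 permit 175.220.0.0 0.0.255.255
ACL_1 = [("permit", "175.220.0.0", "0.0.255.255")]

print(access_list_permits(ACL_1, "175.220.0.0"))  # True: sent to 1.1.1.1
print(access_list_permits(ACL_1, "129.213.1.0"))  # False: filtered out
```

With this list applied outbound, 175.220.0.0 is advertised while 129.213.1.0, which belongs to AS100, is not sourced by AS200.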
7.2 Static routes and redistribution

You could always use static routes to originate a network or a subnet. The only difference is that BGP will consider these routes as having an origin of incomplete (unknown). In the above example the same could have been accomplished by doing:

RTC#
    router eigrp 10
    network 175.220.0.0
    redistribute bgp 200
    default-metric 1000 100 250 100 1500

    router bgp 200
    neighbor 1.1.1.1 remote-as 300
    redistribute static

    ip route 175.220.0.0 255.255.0.0 null0

The null 0 interface means to disregard the packet. So if I get the packet and there is a more specific match than 175.220.0.0 (which exists of course), the router will send it to the specific match; otherwise it will disregard it. This is a nice way to advertise a supernet.

We have discussed how we can use different methods to originate routes out of our autonomous system. Please remember that these routes are generated in addition to other BGP routes that BGP has learned via neighbors (internal or external). BGP passes on information that it learns from one peer to other peers. The difference is that routes generated by the network command, or redistribution or static, will indicate your AS as the origin for these networks.

Injecting BGP into IGP is always done by redistribution.
Example:

[Figure: RTA (150.10.20.1) in AS 100 with network 150.10.0.0; RTB (160.10.20.1) in AS 200 with network 160.10.0.0; RTC (150.10.20.2, 160.10.20.2) in AS 300 with network 170.10.0.0]

RTA#
    router bgp 100
    neighbor 150.10.20.2 remote-as 300
    network 150.10.0.0

RTB#
    router bgp 200
    neighbor 160.10.20.2 remote-as 300
    network 160.10.0.0

RTC#
    router bgp 300
    neighbor 150.10.20.1 remote-as 100
    neighbor 160.10.20.1 remote-as 200
    network 170.10.0.0

Note that you do not need network 150.10.0.0 or network 160.10.0.0 in RTC unless you want RTC to also generate these networks on top of passing them on as they come in from AS100 and AS200. Again the difference is that the network command will add an extra advertisement for these same networks indicating that AS300 is also an origin for these routes.

An important point to remember is that BGP will not accept updates that have originated from its own AS, that is, updates whose AS path already contains its own AS number. This is to ensure a loop free interdomain topology.
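The loop-prevention rule just stated can be sketched in Python. This is a minimal model, assuming each AS prepends its own number to the AS path when it passes a route to an external peer; the function names advertise and accept_update are hypothetical, and the direct link from AS200 back to AS100 is an assumption added to close the loop.

```python
def advertise(sender_as, as_path):
    """An AS prepends its own number before passing a route to an external peer."""
    return [sender_as] + as_path

def accept_update(local_as, as_path):
    """Ignore any update whose AS path already contains the local AS."""
    return local_as not in as_path

# AS100 originates 150.10.0.0 and sends it to AS300.
path = advertise(100, [])
assert accept_update(300, path)

# AS300 passes it on to AS200.
path = advertise(300, path)
assert accept_update(200, path)

# AS200 sends it back toward AS100 over the assumed direct link.
path = advertise(200, path)
print(path)                      # [200, 300, 100]
print(accept_update(100, path))  # False: AS100 sees its own AS and ignores it
```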
For example, assume AS200 above had a direct BGP connection into AS100. RTA will generate a route 150.10.0.0 and will send it to AS300. RTC will then pass this route to AS200 with the origin kept as AS100, and RTB will pass 150.10.0.0 to AS100 with the origin still AS100. RTA will notice that the update has originated from its own AS and will ignore it.

8.0 Internal BGP

IBGP is used if an AS wants to act as a transit system to other ASs. You might ask, why can't we do the same thing by learning via EBGP, redistributing into IGP and then redistributing again into another AS? We can, but IBGP offers more flexibility and more efficient ways to exchange information within an AS; for example, IBGP provides us with ways to control what is the best exit point out of the AS by using local preference (will be discussed later).

[Figure: RTA; AS300 with network 170.10.0.0; address 175.10.40.2; remaining labels are truncated in this copy]

RTA#
    router bgp 100
    neighbor 190.10.50.1 remote-as 100
    neighbor 170.10.20.2 remote-as 300
    network 150