BGP Monitoring RFC 7854

https://tools.ietf.org/html/rfc7854

   This document defines the BGP Monitoring Protocol (BMP), which can be
   used to monitor BGP sessions.  BMP is intended to provide a
   convenient interface for obtaining route views.  Prior to the
   introduction of BMP, screen scraping was the most commonly used
   approach to obtaining such views.  The design goals are to keep BMP
   simple, useful, easily implemented, and minimally service affecting.
   BMP is not suitable for use as a routing protocol.

What is a BGP Confederation?

In network routing, BGP confederation is a method of using Border Gateway Protocol (BGP) to subdivide a single autonomous system (AS) into multiple internal sub-ASes while still advertising as a single AS to external peers. This is done to reduce the iBGP full-mesh peering burden inside the AS.  If you are familiar with breaking OSPF domains up into areas, BGP confederations are not that much different, at least from a conceptual view.

And, much like OSPF areas, confederations were born when routers had less CPU and less RAM than they do in today’s modern networks. MPLS has superseded the need for confederations in many cases. I have seen organizations with different policies and different admin teams break their larger networks up into confederations.  This allows each group to go its own direction with routing policies and such.
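
As a rough sketch of what this looks like on a router, here is a confederation member in Vyatta/EdgeOS-style syntax. The ASNs and addresses are made up for illustration, and exact command names vary by platform and firmware:

set protocols bgp 65001 parameters confederation identifier 100
set protocols bgp 65001 parameters confederation peers 65002
set protocols bgp 65001 neighbor 10.0.0.2 remote-as 65002
set protocols bgp 65001 neighbor 192.0.2.1 remote-as 174

Here the router lives in sub-AS 65001, treats sub-AS 65002 as a confederation-internal peer, and the external peer at 192.0.2.1 only ever sees the confederation identifier, AS 100.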

If you want to read the RFC: https://tools.ietf.org/html/rfc5065

UBNT EdgeMax 1.10.3 update: route flushing

From UBNT:

New features:

Offloading – Add CLI commands to disable flow-table flushing in the offloading engine when the routing table changes:

set system offload ipv4 disable-flow-flushing-upon-fib-changes
set system offload ipv6 disable-flow-flushing-upon-fib-changes

Prior to 1.10.3 firmware, the flow table in the offloading engine was always flushed when a route was updated in the Linux routing table. Flow flushing ensured that the offloading engine got routing updates instantly, but it wasted a lot of CPU time and decreased performance if the routing table was constantly updated (for instance, in full BGP table, large OSPF, or flapping PPPoE interface scenarios).

In 1.10.3 firmware, disable-flow-flushing-upon-fib-changes is not set by default, which means the flow table in the offloading engine is still flushed on every routing table change, the same way it was in previous firmware.
 
If you have a full BGP table or a large OSPF network, you are advised to set disable-flow-flushing-upon-fib-changes; this will ensure lower CPU load and increase maximum throughput.
 
Important note for multi-WAN environments – if the nexthop interface of the default gateway changes and disable-flow-flushing-upon-fib-changes is set, it will take up to flow-lifetime seconds before all existing offloaded flows switch to the new nexthop interface (up to 12 seconds by default).
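
If you do decide to enable it, a typical EdgeOS session would look something like this (a sketch; configure enters configuration mode, and commit/save makes the change persistent):

configure
set system offload ipv4 disable-flow-flushing-upon-fib-changes
set system offload ipv6 disable-flow-flushing-upon-fib-changes
commit
save
exit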
Offloading – Add CLI command to modify flow-lifetime in the offloading engine (expressed in seconds):

set system offload flow-lifetime 24

Prior to 1.10.3 firmware, the flow-lifetime parameter was hardcoded and was not synchronized between ER platforms: 12 seconds on ER-Lite/ER-PoE, 6 seconds on ER/ER-Pro/ER-4/ER-6, and 3 seconds on ER-Infinity.

In 1.10.3 firmware, the default value of flow-lifetime is 12 seconds for all ER platforms, and it can now be modified. By modifying the flow-lifetime parameter you control how much traffic passes from the offloading engine into the Linux network stack.

If you increase flow-lifetime then:
 a) Offloaded IP flows will expire less frequently and fewer packets will be forwarded to Linux
 b) CPU load will decrease and max throughput will increase
 c) If disable-flow-flushing-upon-fib-changes is set, it will take more time for the offloading engine to detect changes in the routing table

If you decrease flow-lifetime then:
 a) Offloaded IP flows will expire more frequently and more packets will be forwarded to Linux
 b) CPU load will increase and max throughput will decrease
 c) If disable-flow-flushing-upon-fib-changes is set, it will take less time for the offloading engine to detect changes in the routing table
Offloading – Add CLI command to show flows in the offloading engine:

show ubnt offload flows

Offloading – Add CLI command to show offloading engine statistics:

show ubnt offload statistics
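
Putting the new commands together, a quick tuning session on a busy router might look like this. This is only a sketch; the flow-lifetime value of 6 is an arbitrary example, not a recommendation:

configure
set system offload flow-lifetime 6
commit
save
exit
show ubnt offload flows
show ubnt offload statistics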

 

Enhancements and bug fixes:

LDP – Fixed regression in 1.10.0 where LDP configuration failed.
LoadBalancing – Fixed regression in 1.10.1 where load balancing failed to recover if a WAN interface lost and restored link within a 3-second interval.
DHCP – Fixed bug where DHCP server configuration failed to commit with networks other than /8, /16, and /24.
TrafficControl – Fixed regression in 1.10.0 where “command not found” output was printed when running “show traffic-control …” commands.

Lab Network

I am starting an ongoing series involving a semi-static set of devices.  It will include tutorials on topics such as OSPF, Cambium configuration, VLANs, and more.  Below is the general topology I will use for this lab network.  As things progress I will be able to swap different manufacturers and device models into this scenario without changing the overall topology.  We may add a device or two here and there, but this basic setup will remain the same.  This will let you see how different things are configured in the same environment without changing the overall scheme too much.

We will start with very basic steps: how to log in to the router and how to set an IP address.  Then we will move to setting up a wireless bridge between the two routers.  Once that is done we will move on to setting up OSPF to enable dynamic routing.  After that the topics are open.  I have things like BGP planned, among others.  If there is anything you would like to see, please let me know.
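
As a small preview of those basic steps, logging into an EdgeMax router and putting an address on an interface looks roughly like this (the credentials, interface, and address here are placeholders; substitute your own):

ssh ubnt@192.168.1.1
configure
set interfaces ethernet eth0 address 10.0.0.1/24
commit
save
exit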

The problem with peering from a logistics standpoint

Many ISPs run into this problem as part of their growing pains.  This scenario usually starts happening with their third or fourth peer.

Scenario: an ISP grows beyond the single connection it has.  This can be 10 meg, 100 meg, a gig, or whatever.  They start out looking for redundancy. The ISP brings in a second provider, usually at around the same bandwidth level.  This way the network has two roughly equal paths out.

A unique problem usually develops as the network grows to the point of peaking the capacity of both of these connections.  The ISP has to make a decision: do they increase the capacity to just one provider? Most don’t have the budget to increase capacity to both providers. But if you increase one, you are favoring one provider over the other until the budget allows you to increase capacity on both. You are essentially in a state where you have to favor one provider in order to keep up with capacity.  If you fail over to the smaller pipe, things could be just as bad as being down.

This is where many ISPs learn the hard way that BGP is not load balancing. But what about padding, communities, local-pref, and all that jazz? We will get to that.  In the meantime, our ISP may have the opportunity to get to an Internet Exchange (IX) and offload things like streaming traffic.  Traffic returns to a little more balance because you essentially have a third provider with the IX connection. But the growing pains don’t stop there.

As ISPs, especially WISPs, get more resources to put toward cutting down latency, they start seeking out better-peered networks.  The next growing pain that becomes apparent is that networks with lots of high-end peers tend to charge more money.  In order for the ISP to buy bandwidth from these types of providers, they usually have to do it in smaller quantities. This reintroduces the problem of a mismatched pipe size, with a twist: the more (and better) peers a network has, the more of your traffic will want to travel toward that peer. So the more expensive peer, which you are probably buying less capacity from, now wants to handle more of your traffic.

So, the network geeks will bring up things like padding, communities, local-pref, and all the tricks BGP has.  But at the end of the day, BGP is not load balancing.  You can *influence* traffic, but BGP does not allow you to say “I want 100 megs of traffic here, and 500 megs there.”  Keep in mind BGP deals with routes to and from IP blocks, not the traffic itself.
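
To make the “influence, not control” point concrete, here is roughly what outbound AS-path prepending looks like in Vyatta/EdgeOS-style syntax. The ASN, map name, and neighbor address are invented for illustration, and exact syntax varies by platform:

set policy route-map PREPEND-PROVIDER-A rule 10 action permit
set policy route-map PREPEND-PROVIDER-A rule 10 set as-path-prepend "65000 65000 65000"
set protocols bgp 65000 neighbor 192.0.2.1 route-map export PREPEND-PROVIDER-A

This only nudges inbound traffic away from provider A by making your routes look longer through that path; how much traffic actually moves depends on how the rest of the Internet sees the two paths.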

So, how does the ISP solve this? Knowing about your upstream peers is the first thing.  BGP looking glasses, peer reports such as those from Hurricane Electric, and general news help keep you on top of things.  Things such as new peering points, acquisitions, and new data centers can influence an ISP’s traffic.  If your equipment supports tools such as NetFlow or sFlow, you can begin to build a picture of your traffic and which ASNs it is going to. This is your first major step: get tools to know what ASNs the traffic is going to.  You can then take this data and look at how your own peers are connected with those ASNs.  You will start to see things like “provider A is poorly peered with ASN 2906.”

Once you know who your peers are and have a good feel for their peering, you can influence your traffic.  If you know you don’t want to send traffic destined for ASN 2906 in or out of provider A, you can then start to implement AS padding and all the tricks we mentioned before.  But you need the bigger picture before you can do that.
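
For example, to stop sending traffic destined for ASN 2906 out provider A, one approach is to lower local-preference on routes learned from provider A whose AS path contains 2906. A rough Vyatta/EdgeOS-style sketch, with invented names, ASNs, and neighbor address, and syntax that varies by platform:

set policy as-path-list VIA-2906 rule 10 action permit
set policy as-path-list VIA-2906 rule 10 regex "_2906_"
set policy route-map PROVIDER-A-IN rule 10 action permit
set policy route-map PROVIDER-A-IN rule 10 match as-path VIA-2906
set policy route-map PROVIDER-A-IN rule 10 set local-preference 50
set policy route-map PROVIDER-A-IN rule 20 action permit
set protocols bgp 65000 neighbor 192.0.2.1 route-map import PROVIDER-A-IN

Rule 20 simply permits everything else unchanged; without it, the route-map would drop all other routes learned from that neighbor.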

One last note. Peering is dynamic.  You have to keep on top of the ecosystem as a whole.