The problem with peering from a logistics standpoint

Many ISPs run into this problem as part of their growing pains.  This scenario usually starts happening with their third or 4th peer.

Scenario.  ISP grows beyond the single connection they have.  This can be 10 meg, 100 meg, gig or whatever.  They start out looking for redundancy. The ISP brings in a second provider, usually at around the same bandwidth level.  This way the network has two pretty equal paths to go out.

A unique problem usually develops as the network grows to the point of peaking the capacity of both of these connections.  The ISP has to make a decision. Do they increase the capacity to just one provider? Most don’t have the budget to increase capacities to both providers. Now, if you increase one you are favouring one provider over another until the budget allows you to increase capacity on both. You are essentially in a state where you have to favor one provider in order to keep up capacity.  If you fail over to the smaller pipe things could be just as bad as being down.

This is where many ISPs learn the hard way that BGP is not load balancing. But what about padding, communities, local-pref, and all that jazz? We will get to that.  In the meantime, our ISP may have the opportunity to get to an Internet Exchange (IX) and offload things like streaming traffic.  Traffic returns to a little more balance because you essentially have a 3rd provider with the IX connection. But, they growing pains don’t stop there.

As ISP’s, especially WISPs, have more and more resources to deal with cutting down latency they start seeking out better-peered networks.  The next growing pain that becomes apparent is the networks with lots of high-end peers tend to charge more money.  In order for the ISP to buy bandwidth they usually have to do it in smaller quantities from these types of providers. This introduces the probably of a mismatched pipe size again with a twist. The twist is the more, and better peers a network has the more traffic is going to want to travel to that peer. So, the more expensive peer, which you are probably buying less of, now wants to handle more of your traffic.

So, the network geeks will bring up things like padding, communities, local-pref, and all the tricks BGP has.  But, at the end of the day, BGP is not load balancing.  You can *influence* traffic, but BGP does not allow you to say “I want 100 megs of traffic here, and 500 megs here.”  Keep in mind BGP deals with traffic to and from IP blocks, not the traffic itself.

So, how does the ISP solve this? Knowing about your upstream peers is the first thing.  BGP looking glasses, peer reports such as those from Hurricane Electric, and general news help keep you on top of things.  Things such as new peering points, acquisitions, and new data centers can influence an ISPs traffic.  If your equipment supports things such as netflow, sflow, and other tools you can begin to build a picture of your traffic and what ASNs it is going to. This is your first major step. Get tools to know what ASNs the traffic is going to   You can then take this data, and look at how your own peers are connected with these ASNs.  You will start to see things like provider A is poorly peered with ASN 2906.

Once you know who your peers are and have a good feel on their peering then you can influence your traffic.  If you know you don’t want to send traffic destined for ASN 2906 in or out provider A you can then start to implement AS padding and all the tricks we mentioned before.  But, you need the greater picture before you can do that.

One last note. Peering is dynamic.  You have to keep on top of the ecosystem as a whole.

Premium Content?

Folks,

I have been going over the idea of premium content within my blogs. I find myself wanting to write more and more, and writing would help one of my other projects I have going.  However, I don’t make any direct money from the blog.

In an effort to provide more regular content, I have come up with the following ideas.

1.I will be implementing a premium posts section of this blog.  For the foreseeable future, this will be a free section. All that will be required is you fill out a simple registration.  We won’t spam you, but you will have the option to be notified of updates.

2.I will be doing more sponsored posts to keep as much content as possible in the non-premium space.  As a result, I am looking for vendors with products they would like reviewed.  The idea is if I buy a product with my own money, it’s most likely going to be a premium post.  If a vendor or manufacturer wish to send a product to be reviewed this will be a public post.

I am looking for vendors and manufacturers that wish to be regular sponsors. This strategy is open and fluid as things progress.

Product Review: Sync Stop

As I get ready for my trip to Vegas to attend WISPAPALLOZA 2017 the following product becomes relevant.  Security, namely Identity Theft, is becoming more and more of something we have to deal with.  Much like pickpockets, digital Identity theft is a real thing.

This is where the SyncStop by Xipiter comes in.  This is a simple device.  It allows you to charge your phone on any USB enabled connection, but does not allow syncing by cutting off access to the data pins of the USB connection at the hardware level.

If you travel alot I would suggest investing in a few of these.  Let’s face it, we try and find an outlet anywhere we can when it comes to charging our phones.  Hackers know this.  A cleverly designed “public charge station” could be easily compromised to feed your data back to a remote server in just a few minutes and you would probably never notice.

WPA is not encrypting your customer traffic

There was a Facebook discussion that popped up tonight about how a WISP answers the question “Is your network secure?” There were many good answers and the notion of WEP vs WPA was brought up.

In today’s society, you need end-to-end encryption for data to be secure. An ISP has no control over where the customer traffic is going. Thus, by default, the ISP has no control over customer traffic being secure.  “But Justin, I run WPA on all my aps and backhauls, so my network is secure.”  Again, think about end-to-end connectivity. Every one of your access points can be encrypted, and every one of your backhauls can be encrypted, but what happens when an attacker breaks into your wiring closet and installs a sniffer on a router or switch port?What most people forget is that WPA key encryption is only going on between the router/ap and the user device.  “But I lock down all my ports.” you say.  Okay, what about your upstream? Who is to say your upstream provider doesn’t have a port mirror running that dumps all your customer traffic somewhere.  “Okay, I will just run encrypted tunnels across my entire network!. Ha! let’s see you tear down that argument!”. Again, what happens when it leaves your network?  The encryption stops at the endpoint, which is the edge of your network.

Another thing everyone hears about is hotspots. Every so often the news runs a fear piece on unsecured hotspots.  This is the same concept.  If you connect to an unsecured hotspot, it is not much different than connecting to a hotspot where the WPA2 key is on a sign behind the cashier at the local coffee shop. The only difference is the “hacker” has an easier time grabbing any unsecured traffic you are sending. Notice I said unsecured.  If you are using SSL to connect to a bank site that session is sent over an encrypted session.  No sniffing going on there.  If you have an encrypted VPN the possibility of traffic being sniffed is next to none. I say next to none because certain types of VPNs are more secure than others. Does that mean the ISP providing the Internet to feed that hotspot is insecure? There is no feasible way for the ISP to provide end to end security of user traffic on the open Internet.

These arguments are why things like SSL and VPNs exist. Google Chrome is now expecting all websites to be SSL enabled to be marked as secure. VPNs can ensure end-to-end security, but only between two points.  Eventually, you will have to leave the safety and venture out into the wild west of the internet.  Things like Intranets exist so users can have access to information but still be protected. Even most of that is over encrypted SSL these days so someone can’t install a sniffer in the basement.

So what is a WISP supposed to say about security? The WISP is no more secure than any other ISP, nor are then any less secure.  The real security comes from the customer. Things like making sure their devices are up-to-date on security patches.  This includes the often forgotten router. Things like secure passwords, paying attention to browser warnings, e-mail awareness, and other things are where the real user security lies. VPN connections to work. Using SSL ports on e-mail. Using SSH and Secure RDP for network admins. Firewalls can help, but they don’t encrypt the traffic. Does all traffic need encrypted? no.

Everything you wanted to know about NTP

Network Time Protocol (NTP) is a service that can be used to synchronize time on network connected devices.   Before we dive into what NTP is, we need to understand why we need accurate time.

The obvious thing is network devices need an accurate clock.  Things like log files with the proper time stamp are important in troubleshooting.  Accurate timing also helps with security prevention measures.  Some attacks use vulnerabilities in time stamps to add in bad payloads or manipulate data. Some companies require accurate time stamps on files and transactions as well for compliance purposes.

So what are these Stratum levels I hear about?
NTP has several levels divided into stratum. All this is the distance from the reference clock source.  A clock which relays UTC (Coordinated Universal Time) that has little to no delay (we are talking nanoseconds) are Stratum-0 servers. These are not used on the network. These are usually atomic and GPS clocks.  A Stratum-0 server is connected to time servers or stratum-1 via GPS or a national time and frequency transmission.  A Stratum 1 device is a very accurate device and is not connected to a Stratum-0 clock over a network.  A Stratum-2 clock receives NTP packets from a Stratum-1 server, a Stratum-3 receives packets from a Stratum-2 server, and so on.  It’s all relative of where the NTP is in relationship to Stratum-1 servers.

Why are there levels?
The further you get away from Stratum-0 the more delay there is.  Things like jitter and network delays affect accuracy.  Most of us network engineers are concerned with milliseconds (ms) of latency.  Time servers are concerned with nanoseconds (ns). Even a server directly connected to a Stratum-0 reference will add 8-10 nanoseconds to UTC time.

My Mikrotik has an NTP server built in? Is that good enough?
This depends on what level of accuracy you want. Do you just need to make sure all of your routers have the same time? then synchronizing with an upstream time server is probably good enough. Having 5000 devices with the same time, AND not having to manually set them or keep them in sync manually is a huge deal.

Do you run a VOIP switch or need to be compliant when it comes to transactions on servers or need to be compliant with various things like Sox compliance you may need a more accurate time source.

What can I do for more accurate time?
Usually, a dedicated appliance is what many networks use.  These are purpose built hardware that receives a signal from GPS. the more accurate you need the time, the more expensive it will become.  Devices that need to be accurate to the nanosecond are usually more expensive than ones accurate to a microsecond.

If you google NTP Appliance you will get a bunch of results.  If you want to setp up from what you are doing currently you can look into these links:

http://www.satsignal.eu/ntp/Raspberry-Pi-NTP.html

How to Build a Stratum 1 NTP Server Using A Raspberry Pi

 

Building a Stratum 1 NTP Server with a Raspberry Pi

 

The problem with speedtests

Imagine this scenario. Outside your house, the most awesome super highway has been built.  It has a speed limit of 120 Mile Per Hour.  You calculate at those speeds you can get to and from work 20 minutes earlier. Life is good.  Monday morning comes, you hop in your Nissan GT-R, put on some new leather driving gloves, and crank up some good driving music.  Your pull onto the dedicated on-ramp from your house and are quickly cruising at 120 Miles an hour. You make it into work before most anyone else. Life is good.  

Near the end of the week, you notice more and more of your neighbours and co-workers using this new highway.  Things are still fast, but you can’t get up to speed to work like you could earlier in the week.  As you ponder why you notice you are coming up on the off-ramp to your work.  Traffic is backed up. Everyone is trying to get to the same place.  As you are waiting in the line to get off the super highway, you notice folks passing you by going on down the road at high rates of speed.  You surmise your off-ramp must be congested because it is getting used more now.

Speedtest servers work the same way. A speedtest server is a destination on the information super-highway. Man, there is an oldie term.  To understand how speedtest servers work we need a quick understanding of how the Internet works.   The internet is basically a bunch of virtual cities connected together.  Your local ISP delivers a signal to you via Wireless, Fiber, or some sort of media. When it leaves your house it travels to the ISP’s equipment and is aggregated with your neighbours and sent over faster lines to larger cities. It’s just like a road system. You may get access via a gravel road, which turns into a 2 lane blacktop, which then may turn into a 4 lane highway, and finally a super-highway.  The roads you take depend on where you are going. Your ISP may not have much control over how the traffic flows once it leaves their network.

Bottlenecks can happen anywhere. Anything from fiber optic cuts, oversold capacity, routing issues, and plain old unexpected usage. Why are these important? All of these can affect your speedtest results and can be totally out of control of your ISP and you.  They can also be totally your ISP’s fault. They can also be your fault, just like your car can be.  An underpowered router can be struggling to keep up with your connection. Much like a moped on the above super-highway can’t keep up with a 650 horsepower car to fully utilize the road, your router might not be able to keep up either.  Other things can cause issues such as computer viruses, and low performing components.

Just about any network can become a speedtest.net node or a node with some of the other speedtest sites.  These networks have to meet minimum requirements, but there is no indicator of how utilized these speedtest servers are.  A network could put up one and it’s 100 percent utilized when you go running a speedtest. This doesn’t mean your ISP is slow, just the off-ramp to that speedtest server is slow.

The final thing we want to talk about is the utilization of your internet pipe from your ISP.  This is something most don’t take into consideration.  Let’s go back to our on-ramp analogy.  Your ISP is selling you a connection to the information super-highway.   Say they are selling you a 10 megabyte download connection.  If you have a device in your house streaming an HD Netflix stream, which is typically 5 megs or so, that means you only have 5 megs available for a speedtest while that HD stream is happening. Speedtest only test your current available capacity.  Many folks think a speedtest somehow stops all the traffic on your network, runs the test, and starts the traffic. It doesn’t work that way. A speedtest tests the available capacity at that point in time.  The same is true for any point between you and the speedtest server.  Remember our earlier analogy about slowing down when you got to work because there were so many people trying to get there.  They exceeded the capacity of that destination.  However, that does not mean your connection is necessarily slow because people were zooming past you on their way to less congested destinations.

This is why speedtest results should be taken with a grain of salt. They are a useful tool, but not an absolute. A speedtest server is just a destination.  That destination can have bottlenecks, but others don’t.  Even after this long article, there are many other factors which can affect Internet speed. Things we didn’t touch on like Peering, the technology used, speed limits, and other things can also affect your internet speed to destinations.

Some Random Visio diagram

Below, We have some visio diagrams we have done for customers.

This first design is a customer mesh into a couple of different data centers. We are referring to this as a switch-centric design. This has been talked about in the forums and switch-centric seems like as good as any.

This next design is a netonix switch and a Baicells deployment.

Design for a customer