Latency vs. bandwidth: pick low latency
05/19/15 Category: Performance
Most people focus on bandwidth alone when they want better performance. That is a mistake: performance is a combination of latency and throughput.
Bandwidth is a measure of the theoretical maximum that a medium or connection can deliver. ISPs sell bandwidth, but what we actually get is some portion of it in the form of throughput. So, barring any compression or caching tricks, throughput is less than bandwidth. This is very apparent in wireless. You might buy a 3x3:3 AP (three spatial streams) with a bandwidth potential of 450 Mbps (3 x 150), yet the observed throughput might be 270 Mbps, 80 Mbps or far slower. It all depends on how many streams the client can support (most mobile devices support one or two), signal strength (highly dependent on distance), signal quality, encoding techniques and background noise, not to mention end-to-end limitations. You might be "happy" to get 20 Mbps of throughput from a device capable of 450 Mbps of bandwidth. So, focus on observed throughput and stop being fascinated by bandwidth alone.
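To put the wireless numbers above in perspective, here is a small sketch that expresses each observed throughput as a share of the rated bandwidth. The function name link_efficiency is made up for illustration; the figures are the ones quoted in the paragraph.

```python
def link_efficiency(throughput_mbps: float, bandwidth_mbps: float) -> float:
    """Return observed throughput as a fraction of rated bandwidth."""
    if bandwidth_mbps <= 0:
        raise ValueError("bandwidth must be positive")
    return throughput_mbps / bandwidth_mbps


STREAMS = 3            # a 3x3:3 AP
PER_STREAM_MBPS = 150  # per-stream rate quoted in the text
rated_mbps = STREAMS * PER_STREAM_MBPS  # 450 Mbps

for observed_mbps in (270, 80, 20):
    pct = 100 * link_efficiency(observed_mbps, rated_mbps)
    print(f"{observed_mbps} Mbps of {rated_mbps} Mbps = {pct:.1f}% of rated bandwidth")
```

Even the best case here delivers 60% of the number on the box, and the "happy" 20 Mbps case is under 5%.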
Latency is a measure of delay, and delay is bounded by a speed limit. In the case of digital communications with predominantly fiber media, the limit is the speed of light (in glass, roughly two-thirds of its speed in a vacuum). Since light is so fast, we tend to measure latency in milliseconds (ms), which are one-thousandth of a second. So, 1 millisecond is 0.001 seconds. It may not seem like much, but it adds up quickly and is very noticeable in "faster" (higher bandwidth) circuits. We can easily measure latency between two IP addresses with the ping command, which reports the round trip time (RTT) in ms. Let's use an analogy: imagine we're in a car with a speed limit of 60 mph and a destination 300 miles away. The RTT latency is 10 hours; notice this is round trip (five hours in each direction).
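The car analogy and the fiber case are the same arithmetic: distance divided by speed, doubled for the round trip. The sketch below is illustrative only. It assumes light in fiber travels at about two-thirds of its vacuum speed (an approximation, not a figure from this article), and it ignores routers, queuing and the fact that real cable routes are longer than a straight line. Both function names are hypothetical.

```python
C_KM_PER_S = 299_792.458  # speed of light in a vacuum


def round_trip_hours(distance_miles: float, speed_mph: float) -> float:
    """Round trip time for the car analogy, in hours."""
    return 2 * distance_miles / speed_mph


def fiber_rtt_ms(distance_km: float, fraction_of_c: float = 2 / 3) -> float:
    """Best-case propagation RTT over fiber, in milliseconds.

    fraction_of_c is an assumed approximation for glass fiber.
    """
    one_way_s = distance_km / (C_KM_PER_S * fraction_of_c)
    return 2 * one_way_s * 1000


print(round_trip_hours(300, 60))      # 10.0 hours, as in the analogy
print(f"{fiber_rtt_ms(1000):.1f} ms")  # about 10 ms per 1000 km of fiber
```

Under these assumptions every 1000 km of fiber costs about 10 ms of RTT before any equipment adds its own delay, and no amount of bandwidth buys that back.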
The transport protocol makes a difference in latency and throughput. UDP has less overhead in header bytes and processing time. Therefore, UDP is typically faster and can deliver greater throughput. However, unlike TCP, UDP isn't reliable: it does not retransmit lost packets or guarantee ordering. It's quite effective for video and voice, but on its own it is not appropriate for file transfer.
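The header-byte part of that difference is easy to quantify. This is a rough sketch assuming a 1500-byte MTU and minimum-size headers with no options (20 bytes for IPv4, 20 for TCP, 8 for UDP). It ignores link-layer framing, and the helper name payload_bytes is made up for illustration.

```python
MTU = 1500         # assumed typical Ethernet MTU, in bytes
IPV4_HEADER = 20   # minimum IPv4 header, no options
TCP_HEADER = 20    # minimum TCP header, no options
UDP_HEADER = 8     # UDP header is always 8 bytes


def payload_bytes(transport_header: int, mtu: int = MTU,
                  ip_header: int = IPV4_HEADER) -> int:
    """Bytes left for application data in one packet."""
    return mtu - ip_header - transport_header


for name, header in (("TCP", TCP_HEADER), ("UDP", UDP_HEADER)):
    payload = payload_bytes(header)
    print(f"{name}: {payload} payload bytes per packet "
          f"({100 * payload / MTU:.2f}% of the MTU)")
```

With these assumptions the gap is 1460 versus 1472 payload bytes per packet, which is under 1%. The header savings are real but small. Most of UDP's advantage comes from skipping acknowledgements, retransmission and connection setup.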
A packet contains both headers and payload. For most transmissions the MTU (maximum transmission unit) is 1500 bytes, and the headers come out of that budget. If we go back to the car analogy, let's imagine the payload is passengers and we have a driver. In this case the unit of transfer is people and we can transfer 4 at a time, of which 3 are payload. So if we wanted to transfer 6 people over 300 miles at 60 mph with one four-person vehicle, the transfer time would be 20 hours (two 10-hour round trips). For 7 people, it's now 30 hours, because the seventh passenger needs a third trip. Payload and round trip times affect throughput.
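The passenger arithmetic above can be written out as a short sketch. The function name transfer_hours and its parameters are invented for illustration. Like the text, it counts every trip as a full round trip.

```python
import math


def transfer_hours(people: int, seats: int = 4, drivers: int = 1,
                   distance_miles: float = 300, speed_mph: float = 60) -> float:
    """Hours to move `people` passengers with one car, one round trip at a time."""
    payload_per_trip = seats - drivers          # the driver is "header"
    trips = math.ceil(people / payload_per_trip)
    rtt_hours = 2 * distance_miles / speed_mph  # 10 hours in the example
    return trips * rtt_hours


print(transfer_hours(6))  # 20.0
print(transfer_hours(7))  # 30.0
```

One extra passenger costs a whole extra round trip. The same thing happens when a transfer spills one byte past a packet or window boundary.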
Everyone is focusing on bandwidth now, but the smart folks are looking more at latency. It's latency that affects the speed and "feel" of interactive transactions. Anyone who has gone from a spinning hard disk to a solid-state disk has experienced this. While there can be a difference in transfer time (throughput), the big difference we feel is in IOPS (I/Os per second) as a result of reduced latency, which is really important on random reads. Much like networking, transfer size matters too. On a 512B transfer to/from disk, there can be a 500x difference in IOPS, while the throughput difference might be only 2x to 3x. If you've experienced the speed difference in boot times and general use, you can tell the impact of latency versus throughput. Each I/O transfers data, but because latency is far lower on an SSD than on a spinning disk (it takes time for that platter to rotate around), you can complete many more I/Os in the same amount of time, and each I/O transfers a certain payload.
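The link between latency and IOPS can be sketched for the simplest case, where each I/O must finish before the next one starts (queue depth of one). The latencies below are placeholder values chosen for illustration, not measurements of any particular drive, and the function names are hypothetical.

```python
def iops_at_queue_depth_one(latency_ms: float) -> float:
    """I/Os per second when each I/O waits for the previous one to finish."""
    return 1000 / latency_ms


def throughput_kb_per_s(iops: float, transfer_bytes: int) -> float:
    """Data moved per second: number of I/Os times the payload of each."""
    return iops * transfer_bytes / 1000


# Illustrative latencies only (assumptions, not benchmark results).
HDD_LATENCY_MS = 10.0
SSD_LATENCY_MS = 0.1
TRANSFER_BYTES = 512

for name, latency_ms in (("HDD", HDD_LATENCY_MS), ("SSD", SSD_LATENCY_MS)):
    iops = iops_at_queue_depth_one(latency_ms)
    kbps = throughput_kb_per_s(iops, TRANSFER_BYTES)
    print(f"{name}: {iops:,.0f} IOPS, {kbps:,.1f} KB/s at {TRANSFER_BYTES} B per I/O")
```

With small transfers the per-I/O latency dominates, so the IOPS ratio is simply the inverse of the latency ratio (100x with these placeholder values). On large sequential transfers the payload per I/O is big enough that raw transfer rate matters more, which is why the throughput gap there is much narrower.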
With higher speed networks, just like a long car trip, latency and round trip times wreak havoc. If a car can go 120 mph but there's a speed limit of 60 mph, the speed limit and round trip time determine how many people you can get from point A to B. You can get a fast car, but with a single car (like a single stream or session) it's all about payload and latency. In the data world, the speed limit is physics, plain and simple.
With networks, RTT (latency) and receive (RCV) buffers limit throughput for a single session/stream, because a sender can only have one receive window's worth of data in flight per round trip. For instance, with a receive window of roughly 64 KB (the classic TCP limit without window scaling), an observed RTT of 20 ms between two addresses caps throughput at about 25 Mbps. That latency does not shrink when you buy more bandwidth, which means that as the pipe/ISP bandwidth increases, single-stream throughput does not! Granted, like a freeway, you can have multiple users/sessions/streams and move more traffic, so total aggregate throughput for all users can be higher, but a single user/connection will not get more than about 25 Mbps. At 40 ms, it's 12.5 Mbps. At 150 ms, it's 3.33 Mbps.
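The 25 Mbps, 12.5 Mbps and 3.33 Mbps figures above are consistent with a fixed amount of data in flight per round trip: throughput = window / RTT. Here is a minimal sketch, assuming a receive window of 62,500 bytes. That value is backed out from those numbers rather than stated anywhere, so treat it as an assumption. It is close to the classic 64 KB TCP window. The function names are made up for illustration.

```python
ASSUMED_WINDOW_BYTES = 62_500  # inferred from the figures in the text


def max_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-stream throughput: one window per round trip."""
    window_bits = window_bytes * 8
    bits_per_second = window_bits * 1000 / rtt_ms
    return bits_per_second / 1_000_000


def window_needed_bytes(target_mbps: float, rtt_ms: float) -> float:
    """Window size needed to fill a link (the bandwidth-delay product)."""
    return target_mbps * 1_000_000 * (rtt_ms / 1000) / 8


for rtt_ms in (20, 40, 150):
    cap = max_throughput_mbps(ASSUMED_WINDOW_BYTES, rtt_ms)
    print(f"RTT {rtt_ms:>3} ms -> at most {cap:.2f} Mbps per stream")

print(f"Window to fill 1 Gbps at 20 ms: {window_needed_bytes(1000, 20):,.0f} bytes")
```

Read the other way around, the same formula says a single stream needs about 2.5 MB in flight to fill a 1 Gbps link at 20 ms. That is why bigger receive buffers or lower latency, not a bigger pipe, are what raise single-stream throughput.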
So, we have to stop focusing on bandwidth and sweat the small stuff (latency), because it REALLY adds up. Of course, many other factors come into play that can further erode throughput: buffer bloat, ISP peering problems, congestion, poor architecture, VPN fragmentation, duplex mismatch, speed mismatch, cable issues, QoS issues, etc.
When selecting ISPs, be realistic about how much latency matters for throughput, and don't get fascinated with bandwidth alone. Focus instead on the throughput that a given latency allows. That's why we focus on enterprise solutions for networking and storage that deliver low latency for maximum throughput.