Measuring broadband speed – Part 1

At netBlazr speeds, measurement is difficult!

We’re deploying a network that supports data rates well in excess of 50 Mbps at almost every location, but no broadband speed test reliably measures speeds like these, at least not for the average user. As a geek, I can locate and work around most limitations and eventually get a useful measurement. But so far, no netBlazr member has been able to make a valid measurement on the first try.

This is not a simple problem, and it’s not necessarily the fault of the speed test websites, although there are things they could do to improve the situation. For a detailed academic analysis, see Understanding broadband speed measurements by Steve Bauer, David Clark & William Lehr (39 pages). For an account of netBlazr’s real-world experiences, read on.

In the netBlazr network, the primary measurement problems have been:

  1. PC configuration issues
  2. Wireless router (Wi-Fi access point) behavior
  3. Network delay
  4. Test server limitations

Indeed, for my initial tests at one early installation (last April), I first used two network diagnostic programs that we use internally on the netBlazr network. One comes from Ubiquiti (our radio vendor); the other is iperf, an open-source testing tool. In both cases I measured well over 80 Mbps of UDP throughput. Then the new netBlazr member plugged in their Wi-Fi access point, connected their laptop wirelessly and measured 2 Mbps!  Whoops…

We unplugged the Wi-Fi AP and connected their laptop with an Ethernet cable.  Now they measured 18 Mbps. They found this impressive (as they formerly had only Verizon DSL) but I was more than a little distressed. I plugged in my own PC and tested against a Boston-based speedtest server. I got 34-39 Mbps in a series of tests. Yet I knew the access network could support 85 Mbps.

To understand what was going on, we’ll need a network diagram that shows the important elements between a netBlazr member and services on the Internet.

[Network diagram]

and we’ll need to learn a little about the TCP protocol.

TCP or Transmission Control Protocol is the most widely used Internet protocol.  The word control is key.  TCP controls the rate at which a source transmits data in response to the network’s ability to carry the data and the destination’s ability to absorb it.  Because a control protocol can only react when it gets signals (in this case, from the network or from the destination), there are time lags.  So it’s not surprising we find TCP’s behavior is influenced by the round trip time (RTT) between the source and the destination.

In fact RTT is critical because each end needs buffers large enough to hold all the data for one round trip time.  The receiver needs to be able to absorb data when it arrives and the sender needs to keep a copy until receipt is acknowledged, in case a retransmission is required.
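The buffer each end needs is the bandwidth-delay product: the data rate multiplied by the round-trip time. As a rough sketch (using the 85 Mbps access-network capacity mentioned above, and a 20 ms RTT picked purely for illustration):

```python
# Bandwidth-delay product: bytes "in flight" during one round trip.
# This is the minimum buffer each end needs to keep the link busy.

def buffer_needed_bytes(rate_bps: float, rtt_s: float) -> float:
    """Buffer needed = rate (bits/s) * RTT (s), converted to bytes."""
    return rate_bps * rtt_s / 8

# An 85 Mbps link with a 20 ms round-trip time:
needed = buffer_needed_bytes(85e6, 0.020)
print(f"{needed / 1024:.0f} KB")  # ~208 KB -- far more than a 64 KB window
```

In other words, at these speeds even a modest RTT demands a buffer several times larger than the default receive windows discussed below.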

Our most common speed test problem is limited TCP buffers on the client computer. In TCP this buffer is called the “receive window.”  With Windows XP, the default was 16 KB. With recent Macs, the default is 64 KB. But as this graph from Bauer et al., based on Measurement Labs data, shows, you can’t get 50 Mbps of TCP throughput with a 16 KB or 64 KB receive window if your RTT is greater than a few milliseconds.

[Graph: TCP throughput limits]

Referring back to the network diagram at the top of this article, the netBlazr network introduces just a few milliseconds (4-6 ms) of RTT between points 3 & 5. Speedtest.net currently has three servers in Boston, hosted by Comcast, DSCI and Towerstream respectively. Measuring at point 3 in the network diagram, the Towerstream-hosted server is closest at 4-10 ms RTT and the Comcast-hosted server is farthest away at 16-20 ms RTT. Measurement Labs doesn’t have a local server in Boston, so RTT varies depending on the M-Labs server you get connected with. I’ve never seen an M-Labs RTT below 20 ms, with 30-40 ms more typical.
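Putting the window sizes and RTTs together: the receive window caps TCP throughput at roughly window ÷ RTT. A quick sketch using the default windows mentioned earlier and illustrative midpoints of the RTT ranges above:

```python
# Upper bound on TCP throughput imposed by the receive window:
#   throughput <= window / RTT
# Window sizes are the XP (16 KB) and OS X (64 KB) defaults; the RTTs
# are illustrative midpoints of the ranges quoted in the article.

def max_throughput_mbps(window_bytes: int, rtt_s: float) -> float:
    """Window-limited TCP throughput ceiling, in Mbps."""
    return window_bytes * 8 / rtt_s / 1e6

servers = [("Towerstream ~7 ms", 0.007),
           ("Comcast ~18 ms", 0.018),
           ("M-Labs ~35 ms", 0.035)]

for label, rtt in servers:
    for window in (16 * 1024, 64 * 1024):
        print(f"{label}, {window // 1024} KB window: "
              f"{max_throughput_mbps(window, rtt):.1f} Mbps max")
```

With a 64 KB window you can only break 50 Mbps against the nearest (Towerstream) server; against an M-Labs server, even a 64 KB window tops out around 15 Mbps, regardless of how fast the access network is.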

What happened in April?

Unfortunately I didn’t record enough data to be certain at this late date but, based on subsequent experiments (which I’ll discuss in a follow-on article), it appears the difference between 18 Mbps on one laptop and 34-39 Mbps on another is at least partly due to different RTTs, because we connected to speedtest servers on different networks (Towerstream vs. Comcast). There may also have been a difference in TCP receive windows, but I don’t have that data.  My laptop was a MacBook Pro with OS X 10.6.7 and the default TCP window of 64 KB.  I don’t know the OS or buffer size on the other laptop back in April. Obviously, if its receive window was 32 KB or 16 KB, that in and of itself could account for the difference in measurements.

Stay tuned for part 2 (hopefully in a matter of days)…

1 Comment

  1. Brough on August 3, 2011 at 8:41 am

    Here’s a useful dialog about this post that happened over on Google Plus:

    Herman Wagter – Brough, does this suggest that pure maximum data throughput is aided by bigger buffers? While at the same time we worry about bufferbloat as causing instability? Or that the right buffer size depends (dynamically) on RTT and size of the stream?
    6:42 AM

    Brough Turner – No. You have to separate the buffers that are the TCP receive window from the buffers in routers, switches and the endpoints’ I/O drivers. The former are in support of, but outside of, the TCP control loop. The latter are within the TCP control loop and thus add to the RTT if they are large (so-called buffer bloat) and thus fill up during congestion (preventing the sending of a congestion signal, i.e. a dropped packet).
    8:37 AM