SUNET Internet2 Land Speed Record: 122.367 Pbmps (multiple stream).

From San Jose, CA, USA to Luleå, Sweden
Swedish University Network
Börje Josefsson 2004-09-12
Background:
SUNET is the organization for the national research and higher education network (NREN) of Sweden. SUNET operates the GigaSunet network, which is built with 10 Gbit/sec DWDM connections in a redundant infrastructure, connecting PoPs in 22 cities nationwide, and using redundant 2.5 Gbit/sec connections as access towards the universities. It is used by researchers, teachers, students, and administrative personnel at 32 universities and colleges nationwide. In addition, some central government museums and external organizations are also connected to the network.

Internet Land Speed Record:
On September 12, 2004, SUNET transferred around 492 Gigabytes of data in about 16.7 minutes, using multiple TCP streams between one host at the Luleå University of Technology (LTU) in Sweden (close to the Arctic Circle) and one host connected to a Sprint PoP in San Jose, CA, USA. The network path used is the GigaSunet backbone, shared with other users at the Swedish universities, and the SprintLink core network, used by all SprintLink customers.

The transfer was done with the iperf program, available for many different platforms. We have chosen to use NetBSD for our tests, due to the scalability of the TCP code.

Network setup:

Network path:


The path spans two continents, Europe and North America, as shown in the following traceroute and ping output:
traceroute to 130.242.94.246 (130.242.94.246), 64 hops max, 40 byte packets
 1 sl-bbh-sj.sprintlink.net (198.67.129.33) 2.554 ms
 2 sl-bbh-stk-1-0.sprintlink.net (144.232.8.30) 3.927 ms
 3 sl-bb22-stk-13-0.sprintlink.net (144.232.4.45) 2.362 ms
 4 sl-bb21-stk-15-0.sprintlink.net (144.232.4.241) 2.496 ms
 5 sl-bb22-kc-2-0.sprintlink.net (144.232.20.168) 42.365 ms
 6 sl-bb21-kc-6-0.sprintlink.net (144.232.2.133) 42.429 ms
 7 sl-bb26-fw-13-0.sprintlink.net (144.232.8.63) 53.072 ms
 8 sl-bb27-fw-15-0.sprintlink.net (144.232.11.86) 53.176 ms
 9 sl-bb20-pen-13-0.sprintlink.net (144.232.8.64) 103.180 ms
10 sl-bb21-pen-14-0.sprintlink.net (144.232.16.34) 103.314 ms
11 sl-bb23-rly-0-0.sprintlink.net (144.232.20.32) 105.730 ms
12 sl-bb27-rly-10-0.sprintlink.net (144.232.14.142) 105.512 ms
13 sl-bb23-chi-12-0.sprintlink.net (144.232.20.184) 125.224 ms
14 sl-bb24-chi-15-0.sprintlink.net (144.232.26.101) 125.547 ms
15 sl-bb25-nyc-5-0.sprintlink.net (144.232.9.157) 148.753 ms
16 sl-bb24-nyc-10-0.sprintlink.net (144.232.13.181) 148.658 ms
17 sl-bb23-nyc-4-0.sprintlink.net (144.232.13.169) 148.627 ms
18 sl-bb20-par-11-0.sprintlink.net (144.232.20.44) 219.872 ms
19 sl-bb21-fra-13-0.sprintlink.net (213.206.129.66) 369.752 ms
20 sl-bb20-fra-15-0.sprintlink.net (217.147.96.33) 229.642 ms
21 sl-bb21-ham-14-0.sprintlink.net (213.206.129.62) 403.593 ms
22 sl-bb20-ham-15-0.sprintlink.net (217.147.96.45) 239.534 ms
23 sl-bb21-ams-14-0.sprintlink.net (213.206.129.49) 254.658 ms
24 sl-bb20-ams-15-0.sprintlink.net (217.149.32.33) 245.388 ms
25 sl-bb21-bru-14-0.sprintlink.net (213.206.129.46) 248.710 ms
26 sl-bb20-bru-15-0.sprintlink.net (80.66.128.41) 261.456 ms
27 sl-bb22-lon-13-0.sprintlink.net (213.206.129.41) 456.846 ms
28 sl-bb23-lon-15-0.sprintlink.net (213.206.128.161) 253.476 ms
29 sl-bb21-lon-13-0.sprintlink.net (213.206.128.55) 390.525 ms
30 sl-bb21-tuk-10-0.sprintlink.net (144.232.19.69) 322.420 ms
31 sl-bb20-tuk-15-0.sprintlink.net (144.232.20.132) 322.506 ms
32 sl-bb20-msq-10-0.sprintlink.net (144.232.20.172) 324.196 ms
33 sl-bb21-msq-15-0.sprintlink.net (144.232.9.110) 324.173 ms
34 sl-bb20-cop-14-0.sprintlink.net (144.232.19.30) 401.493 ms
35 sl-bb21-olo-13-0.sprintlink.net (213.206.129.72) 409.121 ms
36 sl-bb20-olo-15-0.sprintlink.net (80.77.104.32) 409.234 ms
37 sl-bb20-sto-14-0.sprintlink.net (213.206.129.30) 416.407 ms
38 sl-tst1-sto-0-0.sprintlink.net (213.206.131.10) 416.737 ms
39 stockholm1.POS14.sunet.se (130.242.94.221) 416.692 ms
40 vasteras1-pos4.sunet.se (130.242.82.10) 418.342 ms
41 gavle1-pos4.sunet.se (130.242.81.49) 421.155 ms
42 lulea1-pos0.sunet.se (130.242.81.42) 436.836 ms
43 dino.dc.ltu.se (130.242.94.246) 436.656 ms

PING 130.242.94.246 (130.242.94.246): 56 data bytes
64 bytes from 130.242.94.246: icmp_seq=0 ttl=213 time=436.593 ms
64 bytes from 130.242.94.246: icmp_seq=1 ttl=213 time=436.647 ms
64 bytes from 130.242.94.246: icmp_seq=2 ttl=213 time=436.626 ms

----130.242.94.246 PING Statistics----
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 436.593/436.622/436.647/0.027 ms

All routers in the path are Cisco high-end routers, including two of the new CRS-1 routers. Note that this is a path shared with other users of the GigaSunet and Sprint networks! The following graph shows one of the links in the path during the day (several transmissions were done) - showing the record-traffic shared with the normal usage.

Results:


According to the Internet2 LSR contest rule #5A, IPv4 TCP single stream, we achieved the following results, using the upcoming 2.0 version of the NetBSD operating system and an MTU of 4470 bytes:

492 Gbytes in 1001 seconds = 4222 Mbit/sec
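
The rate can be checked against the byte count and duration given in the summary (528280977408 bytes, 1001 seconds). The following is a minimal Python sketch of that arithmetic, not part of the original measurement; the function name is ours, and it assumes that "Gbytes" means 2^30 bytes and "Mbit" means 10^6 bits, which is the combination under which the published numbers agree.

```python
def throughput_mbit_per_s(total_bytes: int, seconds: float) -> float:
    """Average throughput in Mbit/s, taking 1 Mbit = 10**6 bits."""
    return total_bytes * 8 / seconds / 1e6


TOTAL_BYTES = 528_280_977_408  # byte count from the summary
DURATION_S = 1001              # duration of the test run in seconds

rate = throughput_mbit_per_s(TOTAL_BYTES, DURATION_S)
gbytes = TOTAL_BYTES / 2**30   # assumption: binary gigabytes

# prints: 492 Gbytes in 1001 s = 4222 Mbit/sec
print(f"{gbytes:.0f} Gbytes in {DURATION_S} s = {rate:.0f} Mbit/sec")
```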

The complete output from iperf during the transmission is available, as seen from the transmitter and the receiver. The test run lasted 1001 seconds (16 minutes and 41 seconds). Note that this means that we didn't lose a single bit on this path for the duration of the transmission!
A tcpdump output is available for the first few Mbytes of the transmission, both in raw tcpdump format and as readable tcpdump output.

Internet2 Land speed record submission:


According to contest rule #7, the distance should be calculated as the terrestrial distance between the cities where we do router hops. Referring to the Great Circle Mapper, the distance is 28,983 km (18,013 miles). For each city in question, we have used its airport as the location.

The record submitted for the IPv4 multiple stream class is 122.367 Petabit-meters/second (a 17% increase over the previous record).
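
The record metric is the product of average throughput and network distance. As a hedged illustration, the sketch below recomputes it from the figures on this page (528280977408 bytes, 1001 seconds, 28,983 km). The function name is ours, and the use of decimal peta (10^15) is an assumption that matches the published value.

```python
def lsr_petabit_meters_per_s(total_bytes: int, seconds: float,
                             distance_km: float) -> float:
    """Throughput times distance, in petabit-meters per second."""
    bits_per_s = total_bytes * 8 / seconds
    meters = distance_km * 1000
    return bits_per_s * meters / 1e15  # assumption: 1 peta = 10**15


record = lsr_petabit_meters_per_s(528_280_977_408, 1001, 28_983)

# prints: 122.367 Pbm/s
print(f"{record:.3f} Pbm/s")
```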

Most notable is perhaps that our result was achieved on the normal GigaSunet and Sprintlink production infrastructures, shared by millions of other users of those networks.

End system hardware and configuration:


The end hosts are off-the-shelf Dell PCs (see details below), each with only a single Intel Xeon processor (2.0 GHz in the sender, 2.8 GHz in the receiver), 1024 and 512 Mbyte of RAM respectively, and an Intel PRO/10GbE LR network adapter. Note that these hosts are fairly modest in performance compared to any top-of-the-line server of today, which makes this record even more impressive!
NetBSD operating system configuration (apart from default settings):

Kernel compile-time parameters:

  • options NMBCLUSTERS=8192 # Increase number of network buffers.
  • options MAX_KMAPENT=3000 # Need more kmap entries due to extensive use of kernel virtual memory
  • DGE_BUFFER_SIZE=8192 # Size of NIC received pages used in private pool
dge* at pci? dev ? function ? # Intel PRO/10GbE network adapter

Sysctl parameters:

  • net.inet.tcp.init_win=131000 # Tune TCP startup time
  • kern.sbmax=300000000 # Max memory a socket can use, 300MB
  • kern.somaxkva=300000000 # Max memory for all sockets together, 300MB
  • net.inet.tcp.sendspace=250000000 # Size of transmit window, 250MB
  • net.inet.tcp.recvspace=250000000 # Size of receive window, 250MB
  • net.inet.ip.ifq.maxlen=20000 # Max length of interface queue

Ifconfig settings:
ifconfig dge0 10.0.0.1/30 ip4csum tcp4csum udp4csum link0 link1 mtu 4470 up

  • ip4csum, tcp4csum, udp4csum # Enable hardware checksums
  • link0, link1 # Set PCI-X burst size to 4k.
Observation: We noted that it is the PC hardware (excluding the Intel PRO/10GbE network adapter) that is the limiting factor in our setup. The operating system, the network adapter, and the network itself, including the routers, are capable of handling more traffic than this, but the PCI-X bus and the memory bandwidth in the end hosts are currently the bottlenecks.
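
To put the PCI-X observation in numbers: the receiver is listed in the summary as having a 100 MHz PCI-X bus. The sketch below estimates the theoretical peak of such a bus, assuming the standard 64-bit PCI-X width (the page does not state the width) and ignoring all protocol overhead, so the real ceiling is lower. The function name is ours.

```python
def pcix_peak_mbit_per_s(clock_mhz: float, width_bits: int = 64) -> float:
    """Theoretical peak transfer rate: one bus-width transfer per clock cycle."""
    return clock_mhz * width_bits  # MHz * bits = Mbit/s


ACHIEVED_MBIT_PER_S = 4222  # rate reported on this page

peak = pcix_peak_mbit_per_s(100)  # receiver: 100 MHz PCI-X
share = ACHIEVED_MBIT_PER_S / peak

# prints: peak 6400 Mbit/s, achieved rate is 66% of peak
print(f"peak {peak:.0f} Mbit/s, achieved rate is {share:.0%} of peak")
```

Since every received byte must cross this bus and then be copied through memory, sustaining about two thirds of the raw bus peak is in line with the observation that the bus and memory bandwidth are the limiting factors.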

Summary, according to Internet2 standards:
12 September 2004

  • Record Set: IPv4 Multiple Stream
  • I2-LSR Record: 122.367 petabit-meters/second
  • Team Members
SUNET (Swedish University Network)
Sprint
  • Network Distance: 28,983 kilometers
  • Data transferred: 492 Gigabytes (528280977408 bytes)
  • Time: 1001 seconds
  • Software notes:
operating system: NetBSD, upcoming version 2.0.
application: iperf
  • Hardware notes:
Sender:
Dell 2650, with one single Intel Xeon 2.0 GHz CPU and 1024 Mbytes of RAM
Receiver:
Dell Precision 650, with one single Intel Xeon 2.8 GHz CPU and 512 Mbytes of RAM. NOTE that this host only has a 100 MHz PCI-X bus(!)
Network interfaces (both sender and receiver): Intel® PRO/10GbE LR

Special thanks to:

Peter Löthberg for ideas, help, debugging, US coordination etc. etc. His help has been invaluable!

Thanks to:


The Sprint staff in San Jose, Reston and Stockholm, for outstanding support and help.

Also special thanks to Sprint for providing bandwidth, and for facilities and housing for the host in San Jose!

Contact information:
Hans Wallberg CEO, SUNET Hans.Wallberg@sunet.se
Börje Josefsson CTO, SUNET. LSR test coordinator bj@sunet.se
Peter Löthberg Sprintlink LSR coordinator roll@sprint.net
Anders Magnusson LSR technical test manager. ragge@ltu.se

NOTE WELL - This is a historical archive. Contents are no longer being maintained.