LinuxSA Mailing list archives

  From: Glen Turner <glen.turner@aarnet.edu.au>
  To  : John Edwards <isplist@adam.com.au>
  Date: Mon, 14 Jul 2003 14:07:59 +0930

Re: Anyone going to meeting with laptop with GbE?

John Edwards wrote:

> Sounds interesting, is there a web site somewhere detailing the project 
> and the problems it needs to solve?

Not yet, it's on the todo list.

Basically the problem is that the current
ethernet MTU is too small.  It really should be 64KB,
but IEEE won't do this as they want interop from GbE
to thick coax.

The second problem is that IPv4 path MTU discovery is
broken.  On a fat pipe with big frame sizes you can't
have a range of MTUs and have good TCP performance.
I've got models for this, but rather than inflict the
math just consider that:

  - if a GbE host uses an MTU 10% lower than the real MTU
    then it is blowing a fast ethernet's worth of bandwidth.
    That's what machines do today if they implement IPv4's
    "plateau" path MTU algorithm.

  - if a GbE host uses the more obvious "binary search"
    algorithm to discover the MTU then finding the last
    5% of the bandwidth leads to a 50% packet loss.  TCP
    hates that, so the TCP performance hit can be worse
    than the correct MTU performance gain.
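To make the binary search point concrete, here is a minimal sketch (not the tuned algorithm mentioned in this post) that simulates a naive bisection for the path MTU and counts how many probes are dropped for being too big. The function name, the 68..65535 search bounds, and the assumption that every oversized probe is simply lost are all illustrative.

```python
def binary_search_pmtu(path_mtu, lo=68, hi=65535):
    """Find path_mtu by bisection; return (found, probes_sent, probes_lost).

    Assumes lo <= path_mtu <= hi and that an oversized probe is dropped.
    """
    sent = lost = 0
    # Invariant: a packet of size lo fits the path; nothing above hi does.
    while lo < hi:
        probe = (lo + hi + 1) // 2   # round up so the loop always makes progress
        sent += 1
        if probe <= path_mtu:
            lo = probe               # probe got through, search upwards
        else:
            lost += 1                # probe was too big and is lost
            hi = probe - 1
    return lo, sent, lost


if __name__ == "__main__":
    found, sent, lost = binary_search_pmtu(9000)
    print(f"found MTU {found} using {sent} probes, {lost} of them lost")
```

Each probe sits in the middle of the remaining range, so for an unknown MTU it has roughly even odds of being too big.  Every one of those losses looks like congestion to TCP.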

So a sneaky, complex algorithm tuned to TCP is needed
to discover the Path MTU.  I've got one of these, but
getting it deployed probably won't happen without a lot
of work, as vendors need a high degree of assurance that
the algorithm won't fail in some obscure way.

A lot of switch and router vendors support "jumbo frames",
which are 9000 bytes.  With that installed base, an
alternative approach is for ISPs to decide that 9000
will be the new gigabit MTU and deploy it where possible.
The ISP having a bigger MTU on their backbone doesn't hurt
existing normal-MTU ethernet users.

Then the 9000 value can be added as a new plateau to the
"plateau" path MTU discovery algorithm used in Win, BSD,
Linux.
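As a rough sketch of what the extra plateau buys, the snippet below steps down the plateau table from RFC 1191 until a packet fits the path, with and without a 9000 entry.  The function name and the 16000-byte first-hop MTU are hypothetical, chosen only so that the sender starts above 9000; real stacks only fall back to the table when the ICMP message carries no next-hop MTU.

```python
# Plateau values from the table in RFC 1191.
RFC1191_PLATEAUS = [65535, 32000, 17914, 8166, 4352, 2002, 1492, 1006, 508, 296, 68]


def plateau_pmtu(first_hop_mtu, path_mtu, plateaus):
    """Step down the plateau table until the estimate fits the path.

    Assumes path_mtu >= 68, the IPv4 minimum, so a plateau always exists.
    """
    estimate = first_hop_mtu
    while estimate > path_mtu:
        # "Packet too big": fall to the next plateau strictly below the estimate.
        estimate = max(p for p in plateaus if p < estimate)
    return estimate


if __name__ == "__main__":
    with_jumbo = sorted(RFC1191_PLATEAUS + [9000], reverse=True)
    print(plateau_pmtu(16000, 9000, RFC1191_PLATEAUS))  # existing table
    print(plateau_pmtu(16000, 9000, with_jumbo))         # table with 9000 added
```

With the existing table the sender settles on 8166 over a 9000-byte path, about 9% short of the real MTU; with the extra entry it lands on 9000.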

Then, as user networks move to the bigger MTU (at least
for machines doing serious file transfer), the
infrastructure for a bigger end-to-end MTU will be in place.

I'd expect big MTU to be a differentiation point for gigabit
ISP networks.  A host running a file transfer on 1500-byte
MTU GbE will get about 690Mbps, a host running 9000-byte MTU
ethernet will get 1Gbps.  So for users with file transfer
applications there's a significant performance difference.
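For a feel of where the difference comes from, the sketch below works out the frame rate and the best-case TCP payload rate on a 1Gbps link for a given MTU.  It only accounts for wire framing (preamble, ethernet header, FCS, inter-frame gap) and option-less IPv4 and TCP headers; the names are illustrative.  Framing alone does not reproduce the 690Mbps figure, which presumably also reflects per-packet cost in the host; that reading is an assumption, not something stated above.

```python
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble+SFD, header, FCS, inter-frame gap (bytes)
IP_TCP_HEADERS = 20 + 20         # IPv4 + TCP headers, no options (bytes)


def wire_stats(mtu, line_rate_bps=1_000_000_000):
    """Return (frames per second, TCP payload bits per second) at line rate."""
    frame_on_wire = mtu + ETH_OVERHEAD        # bytes each full frame occupies
    payload = mtu - IP_TCP_HEADERS            # TCP payload bytes per frame
    frames_per_sec = line_rate_bps / (frame_on_wire * 8)
    goodput_bps = frames_per_sec * payload * 8
    return frames_per_sec, goodput_bps


if __name__ == "__main__":
    for mtu in (1500, 9000):
        fps, goodput = wire_stats(mtu)
        print(f"MTU {mtu}: {fps:,.0f} frames/s, {goodput / 1e6:.0f} Mbps payload")
```

The frame rate is the number to watch: at a 1500-byte MTU the host handles roughly six times as many packets per second as at 9000 bytes for the same amount of data.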

I'm also working on making DHCP do the right thing for
plug-and-play operation of GbE networks with large MTUs:

   - allow the DHCP response to set the MTU

   - hosts which are asked to set a big MTU and can't
     (eg, NIC won't support it) return the DHCP allocation
     and don't bring up the interface.

   - ping tests use small packets or full-MTU packets
     as appropriate (the current spec allows any packet
     size in ping tests, which won't do the desired thing
     with some exotic misconfigurations).
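The first two points could look something like the sketch below on the client side.  It reads the Interface MTU option (option 26 in RFC 2132, a 16-bit value) from a raw DHCP options field and decides whether to accept the lease.  The function names, the "return_lease" action, and the nic_max_mtu parameter are hypothetical; which DHCP message is used to hand the lease back is left open here.

```python
import struct

DHCP_OPT_PAD = 0
DHCP_OPT_INTERFACE_MTU = 26   # RFC 2132: Interface MTU, 2-byte value
DHCP_OPT_END = 255


def offered_mtu(options):
    """Return the Interface MTU from raw DHCP option bytes, or None if absent."""
    i = 0
    while i < len(options):
        code = options[i]
        if code == DHCP_OPT_END:
            break
        if code == DHCP_OPT_PAD:      # pad is a single byte with no length
            i += 1
            continue
        if i + 1 >= len(options):
            break                     # truncated option, stop parsing
        length = options[i + 1]
        value = options[i + 2 : i + 2 + length]
        if code == DHCP_OPT_INTERFACE_MTU and len(value) == 2:
            return struct.unpack("!H", value)[0]
        i += 2 + length
    return None


def decide(options, nic_max_mtu, default_mtu=1500):
    """Hypothetical client policy: refuse a lease whose MTU the NIC can't do."""
    mtu = offered_mtu(options)
    if mtu is None:
        return ("accept", default_mtu)
    if mtu > nic_max_mtu:
        return ("return_lease", None)   # hand the lease back, leave interface down
    return ("accept", mtu)


if __name__ == "__main__":
    opts = bytes([53, 1, 5, 26, 2]) + struct.pack("!H", 9000) + bytes([255])
    print(decide(opts, nic_max_mtu=1500))
    print(decide(opts, nic_max_mtu=9000))
```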

-- 
  Glen Turner         Tel: (08) 8303 3936 or +61 8 8303 3936
  Network Engineer          Email: glen.turner@aarnet.edu.au
  Australian Academic & Research Network   www.aarnet.edu.au
-- 
  linux.conf.au 2004, Adelaide          lca2004.linux.org.au
  Main conference 14-17 January 2004   Miniconfs from 12 Jan

-- 
LinuxSA WWW: http://www.linuxsa.org.au/ IRC: #linuxsa on irc.freenode.net
To unsubscribe from the LinuxSA list:
  mail linuxsa-request@linuxsa.org.au with "unsubscribe" as the subject

