[Gllug] socket buffer overrun
Peter Grandi
pg_gllug at gllug.for.sabi.co.UK
Wed Oct 19 18:00:07 UTC 2005
>>> On Wed, 19 Oct 2005 14:57:42 +0100, Tethys <sta296 at astradyne.co.uk>
>>> said:
[ ... ]
>>> Are you aware of the existence of TCP/IP accelerators?
>> no. I will google.
sta296> Also known as TOE -- TCP Offload Engines.
Well, acceleration comes in two major forms, partial as in TSO and
full as in TOE. Also, there are nice little accelerations like
checksumming that help. Most chipsets do checksum and segmentation
acceleration, but that is not necessarily used by the software even
if available in the hardware.
sta296> Also widely debunked by the Linux networking guys as
sta296> unnecessary.
It all depends on context, and for people like the «Linux
networking guys» who nowadays operate in ''cost-no-object''
environments at vendors who care most about mega-system
performance things can be different from the guy trying to do
service over a gb network (switches and cards have become very
cheap) with old stuff...
sta296> There's some debate about whether they actually make a
sta296> difference at all, but wide agreement that if they do, it
sta296> doesn't come into effect until you're using 10Gb/s cards
sta296> and above.
Ha! It is not as simple as that.
I would qualify the «come into effect» above with ''if you got
TSO support, and a 2.6 kernel, and a fast CPU, and not much else
is running, then there is little point in full offloading,
because there is not much to offload and not much need to.''.
Indeed I would not be at all surprised if some TOE cards were
just a fast CPU with a Linux 2.6 kernel and a TSO capable
commodity gb chipset, obviously doing nothing else but TCP
processing (the price of TOE cards is high enough that the above
is quite possible).
But as to both forms, TSO and TOE, and the OP's question, here
are some notes, which go towards explaining why I wrote as a hint
«Are you aware of the existence of TCP/IP accelerators?»:
* Not knowing whether the OP's system was beefy or not, unloaded
or not, either TSO or TOE might well have been a help. Some
servers at gb rates do become CPU bound, especially if they are
slow and some other CPU-heavy stuff is running (hey, my puny 0.8GHz
laptop becomes CPU bound at around 3-4MiB/s on Fast Ethernet
if other things are running on its CPU). Now that it is known
(I asked specifically) that it is beefy and unloaded, probably
TOE is unnecessary.
* Not knowing what kind of gb card the system had, it was
unclear whether TSO was available and/or enabled. Now that it
is known (I asked specifically) that it is a 64 bit card, odds
are it is relatively recent/high end and has TSO support, so
it likely has significant TCP/IP acceleration, and TSO does
quite help (and I mentioned things like interrupt coalescing
etc. that relate to that).
* However it was known that kernel 2.4.21 was being used, and
this does not have the same TSO support that 2.6 has (and I
did mention in passing a 2.6 kernel had some improvements that
might be relevant). So perhaps just because of that a TOE
might help, because a lot of the ''TOE not necessary on a
beefy unloaded system'' story depends on TSO and 2.6 being
available.
* But 2.4.21 can be patched, and the distribution was not
mentioned. Now that it is known (I specifically asked) that it
is RHAS 3 (as I sort of had guessed) there is a chance that
among the many backports in RHAS 3 there is TSO support. As I
stated before, I don't know. If there is not, but a TOE driver
is available, then TOE might help.
* Since 2.4.21 was likely to mean RHAS 3, and RHAS 4 has been
out for a while, it could have been an old slower system, not
a newer faster system that probably as hinted uses RHAS 3 as a
matter of policy, not be
* Still, TSO mostly does help on transmit, and there were some
indications that it was a receive issue, so perhaps even TSO
was not an issue.
* But even if in theory one has a system that properly tuned
does not need a TOE, just buying and plugging in a TOE might
solve the problem at one stroke and save time and aggro with
tuning. Since it was my guess (later confirmed...) that the OP
was in a time panic, that was something worth mentioning.
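
As an aside on the "is TSO available and/or enabled" point above, here
is a rough sketch of how one might check, assuming a system where
`ethtool -k` is available and prints one `name: on|off` line per
feature. The interface name `eth0` and both function names are just
illustrative, and feature names differ between ethtool versions (older
ones print them with spaces), so the parser keeps whatever name it
finds rather than looking for a fixed one.

```python
import subprocess


def parse_offload_features(text):
    """Parse 'name: on|off' lines into a dict of name -> bool.

    Lines without an on/off value (such as the 'Features for ...'
    heading) are skipped; trailing notes like '[fixed]' are ignored.
    """
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if not sep:
            continue
        state = value.split()
        if state and state[0] in ("on", "off"):
            features[name.strip()] = (state[0] == "on")
    return features


def query_offload_features(iface="eth0"):
    """Run 'ethtool -k IFACE' and parse its output.

    Assumes ethtool is installed and the interface exists; 'eth0' is
    only a placeholder.
    """
    result = subprocess.run(
        ["ethtool", "-k", iface],
        capture_output=True, text=True, check=True,
    )
    return parse_offload_features(result.stdout)
```

Calling `query_offload_features()` and looking at the segmentation and
checksumming entries would show whether the driver has them switched
on, which is a separate question from whether the chipset supports
them.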
So, all this said, I wrote «Are you aware of the existence of
TCP/IP accelerators?» as a hint to investigate a possible
shortcut, as it was not clear whether the system was fast
or slow, or much else (as I have complained).
And yes, I did in a flash think about all those issues above
(and below) when writing tersely that rhetorical question.
But if I have to write out explicitly all the reasoning behind
what I write, because somebody snipes on little notes, I end up
writing ten page articles like those I occasionally provide the
links to, not brief memos.
Well, it took me longer to write this wittering explanation of
the flow of my guesses than to write the original reply...
However I hope that by showing how complicated situations can be,
somebody may benefit. Also, some random links:
http://WWW.CS.Duke.edu/~jaidev/papers/ispass03.pdf
Nice paper on how much TCP costs to process on relatively
recent systems. Quote on CPU vs. bandwidth:
«The generally accepted rule of thumb is that 1bps of network
link requires 1Hz of CPU processing. [ ... ] It had held up
remarkably well over the years, albeit only for bulk data
transfer at large sizes.
For smaller transfers, we found the processing requirement
to be 6-7 times as expected.
Moreover, the figures show that network processing is not
scaling with CPU speeds. The processing needs per byte
increase when going from 800MHz to 2.4GHz.»
which may yet again illustrate why I asked about CPU speed,
and type of transfers, as sustaining 1gbps requires 100% of a
1000MHz CPU, or about 30% of a 3GHz CPU, and much more than that
for smaller transfers; 6-7GHz (and more because it scales
sublinearly) per gbps is steep.
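
The arithmetic behind those figures can be sketched as follows. This
is only the 1Hz-per-bps rule of thumb from the quote turned into code;
the function names and the `small_transfer_factor` parameter are mine,
and the sketch deliberately ignores the sublinear scaling with CPU
speed that the paper reports, so real requirements would be higher.

```python
def cpu_hz_needed(link_bps, small_transfer_factor=1.0):
    """CPU cycles per second needed under the 1Hz-per-bps rule of thumb.

    small_transfer_factor is 1.0 for bulk transfers; the paper quoted
    above reports roughly 6-7 for smaller transfers.
    """
    return link_bps * small_transfer_factor


def cpu_fraction(link_bps, cpu_hz, small_transfer_factor=1.0):
    """Fraction of one CPU consumed; values above 1.0 mean CPU bound."""
    return cpu_hz_needed(link_bps, small_transfer_factor) / cpu_hz


GBPS = 1e9

bulk_on_1ghz = cpu_fraction(GBPS, 1e9)          # all of a 1000MHz CPU
bulk_on_3ghz = cpu_fraction(GBPS, 3e9)          # about a third of a 3GHz CPU
small_on_3ghz = cpu_fraction(GBPS, 3e9, 6.0)    # twice what a 3GHz CPU has
```

So even before allowing for anything else running, small transfers at
1gbps would need more cycles than a single 3GHz CPU can supply.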
http://IT.SlashDot.org/article.pl?sid=05/06/21/014243&tid=230&tid=218&tid=137&tid=106
A whole SlashDot thread with some interesting comments
on the history of TCP/IP accelerators. Who remembers what
were the comms accelerators of a DEC-10/20? :-) An entry
says:
http://IT.SlashDot.org/comments.pl?sid=153391&cid=12869212
«I expect you can get an Intel 1000/Pro for around $30;
full TCP/IP checksum offloads in both directions,
interrupt moderation, jumbo frames, and Intel even write
their own open source drivers.
Heh, my on-board Realtek GigE chip has checksum offloads
too, but even with them on, 300Mbps would have me up to
70% system/interrupt CPU load (and I hear the checksumming
is a bit.. broken); I barely scrape 30% with a PRO/1000GT.»
Even a 30% CPU load (unknown CPU speed) at 300mbps looks like
something that on some systems could cause trouble.
http://WWW.LinuxJournal.com/article/4896
Article on a TOE card from a few years ago, with interesting
details. It is an embedded system based on a ''proprietary
kernel'' with the BSD TCP/IP stack, and with an I2O API to the
rest of the system. The most impressive benefit seems to be
IRQ reduction (as it should).
http://WWW.AARnet.edu.AU/engineering/networkdesign/mtu/nic.html
Comparison of two popular gb chipsets, with a very short, obvious
list of what matters at the beginning.