[Gllug] Prefetch on Pentium4

Per Gregers Bilse bilse at networksignature.com
Mon Dec 22 18:42:24 UTC 2003

I wonder if I could ask a couple of P4 owners to lend me a hand.

I need to determine the effect (beneficial or otherwise) of explicitly
coded memory prefetch on Pentium 4 CPUs in a particular application
likely to have a high cache miss ratio.  I already have data for a
selection of AMD processors, and results of the test I've constructed
range from a speed decrease of 26% to an increase of over 50%, depending
on chip and strategy.  Hence, mileage is highly variable, and the fact
that Intel and AMD have gone in opposite directions in terms of cache
philosophy means there's no way to guess what Intel CPUs will do.
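
For anyone unsure what "explicitly coded prefetch" looks like in practice, here is a minimal sketch. It is not the code in the test package, just an illustration of the general idea: a gather loop with poor locality, once plain and once with a prefetch hint issued a few iterations ahead. It assumes GCC 3.1 or later for __builtin_prefetch; the function names and the PREFETCH_AHEAD distance are made up for the example, and the right distance (if any) is exactly the sort of thing that varies from chip to chip.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tuning knob: how many iterations ahead to prefetch. */
#define PREFETCH_AHEAD 8

/* Baseline: sum table entries selected by idx[], no explicit prefetch. */
unsigned long sum_plain(const unsigned *table, const size_t *idx, size_t n)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += table[idx[i]];
    return sum;
}

/* Same loop, but hint the cache about the entry needed a few
 * iterations from now.  Arguments to __builtin_prefetch: address,
 * 0 = prefetch for read, 0 = low temporal locality. */
unsigned long sum_prefetch(const unsigned *table, const size_t *idx, size_t n)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&table[idx[i + PREFETCH_AHEAD]], 0, 0);
        sum += table[idx[i]];
    }
    return sum;
}

/* Prefetch is only a hint, so both variants must return the same sum. */
void check_variants_agree(const unsigned *table, const size_t *idx, size_t n)
{
    assert(sum_plain(table, idx, n) == sum_prefetch(table, idx, n));
}
```

Timing the two variants over a table much larger than the cache would show whether the hint helps or hurts on a given CPU; the sketch itself only shows where the hint goes.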

I've packaged the test up together with test data and a Makefile at:


All that's needed to run it is to download, untar, and then 'make mail';
the test runs for six minutes, and prints results as it goes along.

I don't need a lot of different results, and in particular I don't need
results from other CPUs, but anybody's welcome to play, of course.  Also,
I'm primarily interested in code produced by GCC v3.  If your compiler
barfs on the SPECIALS in the Makefile, it's because you're using gcc
2.something; gcc 3 adds a lot of architecture-specific detail.

Of course, the chipset is important too, but my working assumption is
that it won't make much difference to which prefetch strategy is best
(if indeed any is).

I'll summarise to the list.


  -- Per

Gllug mailing list  -  Gllug at gllug.org.uk
