[Gllug] code optimisations

Thu Mar 24 10:35:04 UTC 2005

On Wed, 23 Mar 2005, Richard Jones suggested tentatively:
> On Tue, Mar 22, 2005 at 10:42:12AM +0000, Nix wrote:
>> On Mon, 21 Mar 2005, Pete Ryland uttered the following:
>> > And by Cache efficiency, I mean things like organising memory usage to be
>> > more sequencial if possible, e.g.:
>> > 
>> > for (i=0; i<1000000; i++)
>> >   for (j=1; j<10000000; j++)
>> >     a[i][j]++;
[snip]
>> > I don't think a compiler would optimise that type of thing.
>> 
>> Optimizations that can do that sort of code motion in some cases are on
>> the cards for GCC-4.1. :)
> 
> Really?  I did my MSc thesis on this very subject :-)

How convenient :)

> We probably probably had a different approach to detecting the loops
> though - we ran the program and examined a trace of memory access,
> then used image recognition technology to "see" bad "patterns" of
> memory access

!!! That's... unusual. (Sounds like a case of `when all you have is a
hammer...' to me --- also likely to be rather slow).

GCC's profile feedback might be useful there, but I'd hope that a lot of
these optimizations can be flipped on automatically by knowledge of the
behaviour of the architecture's cache; the only thing we can't handle
there is things that require knowledge of the size of the L1/L2 cache
(as opposed to its structure), because that varies too much between
CPUs, at least on x86.

It might also sometimes be good to partially unroll tight loops into N
pieces where N is the number of cachelines, so you keep the cache
optimally full...

-- 
This is like system("/usr/funky/bin/perl -e 'exec sleep 1'");
   --- Peter da Silva
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug