[Gllug] code optimisations

Nix nix at esperi.org.uk
Wed Mar 23 11:20:07 UTC 2005


On Tue, 22 Mar 2005, Rich Walker stipulated:
> Also, on some of the Athlon-and-later systems, the memory controller can
> support multiple fetch streams, making
> 
>  for (i=0; i<1000000/2; i++)
>    for (j=1; j<10000000; j++) {
>      a[i][j]++;
>      a[i+1000000/2][j]++;
>    }
> perform faster than the naive version.

I think that needs loop peeling and the value range optimization to
work, at the very least. Even those don't provide an optimization that
will split the range of i like that, although it's an interesting idea,
related to the autovectorization stuff that's going into GCC 4.1.

Again, this is all GCC 4.1 material at the earliest.
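
For concreteness, here is a minimal sketch of the two loop forms being
compared. The names bump_naive and bump_split and the tiny ROWS/COLS
sizes are made up for illustration (the quoted code uses much larger
bounds and starts j at 1), and ROWS is assumed to be even. It only shows
that the hand-split loop touches the same elements; it says nothing
about which form is faster on a given machine.

```c
#include <stddef.h>

enum { ROWS = 8, COLS = 16 };   /* illustrative sizes; ROWS must be even */

/* Naive form: walk the rows one after another. */
void bump_naive(int a[ROWS][COLS])
{
    for (size_t i = 0; i < ROWS; i++)
        for (size_t j = 0; j < COLS; j++)
            a[i][j]++;
}

/* Hand-split form: each iteration touches two rows that are ROWS/2
 * apart, so there are two independent sequential access streams. */
void bump_split(int a[ROWS][COLS])
{
    for (size_t i = 0; i < ROWS / 2; i++)
        for (size_t j = 0; j < COLS; j++) {
            a[i][j]++;
            a[i + ROWS / 2][j]++;
        }
}
```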

(I bet the Sun compiler had a special-purpose speed-up-SPEC
optimization...)

> But it's ... interesting ... to communicate to a C compiler that the
> second optimisation is valid. If you did:
> 
>   void foo(int ** __restrict__ a) { }

As long as i and j are locals, I think that is acceptable. The compiler
knows that a[i][j] and a[i+1000000/2][j] cannot alias :)

> then you might expect it to happen, but I'm not sure it would. The use
> of __attribute__ ((vector_size(16))) applied to the type of a might

You shouldn't need to use this; vector_size has no effect with respect
to arrays anyway, and even if it did, the (ISO C99) parameter
declaration would be something like

  int *a [restrict __attribute__((vector_size(16)))]

only GCC doesn't (yet) support the use of __attribute__ there.
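
Leaving the attribute out, the plain C99 form compiles as it stands. A
minimal sketch, in which bump_rows and its rows/cols parameters are
names invented for illustration, and the caller is assumed to pass row
pointers that refer to non-overlapping rows:

```c
#include <stddef.h>

/* In a parameter list, "int *a[restrict]" declares the same type as
 * "int **restrict a": the restrict qualifier applies to a itself. */
void bump_rows(int *a[restrict], size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            a[i][j]++;   /* assumes every a[i] points at cols ints */
}
```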

-- 
This is like system("/usr/funky/bin/perl -e 'exec sleep 1'");
   --- Peter da Silva
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



