[Gllug] code optimisations
Nix
nix at esperi.org.uk
Wed Mar 23 11:20:07 UTC 2005
On Tue, 22 Mar 2005, Rich Walker stipulated:
> Also, on some of the Athlon-and-later systems, the memory controller can
> support multiple fetch streams, making
>
> for (i=0; i<1000000/2; i++)
>   for (j=1; j<10000000; j++) {
>     a[i][j]++;
>     a[i+1000000/2][j]++;
>   }
> perform faster than the naive version.
That requires loop peeling and the value range optimization to work, I
think. Even then, neither of those will re-index i like that, although
it's an interesting idea, related to the autovectorization stuff that's
going into GCC 4.1.
So this is all 4.1 material again, at the earliest.
(I bet the Sun compiler had a special-purpose speed-up-SPEC
optimization...)
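
For anyone who wants to time the two access patterns by hand, here is a
minimal sketch of both. It assumes a flat row-major int array rather than
the a[i][j] form quoted above, and the function names and the rows/cols
parameters are made up for illustration. It says nothing about which
version wins on a given machine.

```c
#include <stddef.h>

/* Naive version: a single sequential stream through the whole array. */
void increment_naive(int *a, size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            a[i * cols + j]++;
}

/* Hand-split version: row i and row i + rows/2 are walked together,
   giving the memory system two independent streams.
   Assumption: rows is even, otherwise the last row is skipped. */
void increment_two_streams(int *a, size_t rows, size_t cols)
{
    size_t half = rows / 2;

    for (size_t i = 0; i < half; i++)
        for (size_t j = 0; j < cols; j++) {
            a[i * cols + j]++;
            a[(i + half) * cols + j]++;
        }
}
```

Both functions leave the array in the same state; only the order of the
memory accesses differs.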
> But it's ... interesting ... to communicate to a C compiler that the
> second optimisation is valid. If you did:
>
> void foo(int ** __restrict__ a) { }
As long as i and j are locals, I think that is enough: the compiler
then knows that a[i][j] and a[i+1000000/2][j] cannot alias :)
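
If you would rather spell the promise out than rely on what the compiler
infers, one option is to bind each row to its own restrict-qualified
local. This is only a sketch: it uses the ISO C99 spelling restrict, the
function name and the rows/cols parameters are hypothetical, and whether
any particular GCC release exploits the hint is not something it
demonstrates.

```c
#include <stddef.h>

/* Increment every element, walking row i and row i + rows/2 together.
   Assumption: rows is even and no two row pointers overlap. */
void increment_halves(int **restrict a, size_t rows, size_t cols)
{
    size_t half = rows / 2;

    for (size_t i = 0; i < half; i++) {
        /* Each row gets its own restrict pointer, which states directly
           that lo[j] and hi[j] never refer to the same int. */
        int *restrict lo = a[i];
        int *restrict hi = a[i + half];

        for (size_t j = 0; j < cols; j++) {
            lo[j]++;
            hi[j]++;
        }
    }
}
```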
> then you might expect it to happen, but I'm not sure it would. The use
> of __attribute__ ((vector_size(16))) applied to the type of a might
You shouldn't need to use this; vector_size has no effect on arrays
anyway, and even if it did, the (ISO C99) parameter declaration would
be something like

  int *a[restrict __attribute__((vector_size(16)))]

except that GCC doesn't (yet) support the use of __attribute__ there.
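
Without the attribute, the C99 array-style qualifier is fine. A minimal
sketch (sum_rows and its rows/cols parameters are invented for the
example) showing that the array form and the pointer form declare the
same thing:

```c
#include <stddef.h>

/* C99 array-style declaration: a qualifier written inside [] lands on
   the pointer that the array parameter is adjusted to. */
long sum_rows(int *a[restrict], size_t rows, size_t cols);

/* Pointer-style definition of the same function. The two declarations
   are compatible; both give a the type int ** restrict. */
long sum_rows(int **restrict a, size_t rows, size_t cols)
{
    long total = 0;

    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            total += a[i][j];
    return total;
}
```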
--
This is like system("/usr/funky/bin/perl -e 'exec sleep 1'");
--- Peter da Silva