[Gllug] code optimisations
Nix
nix at esperi.org.uk
Thu Mar 24 10:37:55 UTC 2005
On Wed, 23 Mar 2005, Chris Ball whispered secretively:
> On Tue, Mar 22, 2005 at 01:16:25PM +0000, Rich Walker wrote:
>> Also, on some of the Athlon-and-later systems, the memory controller can
>> support multiple fetch streams, making
>>
>> for (i=0; i<1000000/2; i++)
>>   for (j=1; j<10000000; j++) {
>>     a[i][j]++;
>>     a[i+1000000/2][j]++;
>>   }
>> perform faster than the naive version.
>
> This is standard autovectorization, and working in gcc-4.0 when you use
> -O2 and -ftree-vectorize.
Is it?
Wow. I hadn't been paying much attention to the autovectorization branch:
obviously I should pay more :)
---- ah, of course, I'm blind: there are provably no aliasing problems
here, because the loop bound for i is halved, so row i and row
i+1000000/2 can never be the same row, and the compiler knows this
because it adjusted the loop bound in the first place.
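
To make the no-overlap argument concrete, here is a minimal sketch in C.
The sizes ROWS and COLS and the function names are hypothetical and chosen
only so the example is small; none of them come from the thread. Because
i stays below ROWS/2, the two rows touched in each iteration are always
distinct, so the two-stream loop should leave the array in the same state
as the naive loop. It says nothing about which version is faster.

```c
#include <assert.h>
#include <string.h>

enum { ROWS = 8, COLS = 16 };   /* hypothetical small sizes, ROWS must be even */

/* Naive version: walk every row once, one access stream. */
void bump_naive(int m[ROWS][COLS])
{
    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            m[i][j]++;
}

/* Two-stream version: row i and row i + ROWS/2 are handled together.
 * Since i < ROWS/2, the two row indices can never be equal, so the
 * two increments in the loop body never touch the same element. */
void bump_two_streams(int m[ROWS][COLS])
{
    assert(ROWS % 2 == 0);
    for (int i = 0; i < ROWS / 2; i++)
        for (int j = 0; j < COLS; j++) {
            m[i][j]++;
            m[i + ROWS / 2][j]++;
        }
}

/* Returns 1 if both versions produce identical array contents. */
int versions_agree(void)
{
    static int x[ROWS][COLS], y[ROWS][COLS];

    for (int i = 0; i < ROWS; i++)
        for (int j = 0; j < COLS; j++)
            x[i][j] = y[i][j] = i * COLS + j;

    bump_naive(x);
    bump_two_streams(y);
    return memcmp(x, y, sizeof x) == 0;
}
```

Calling versions_agree() is enough to check the equivalence; timing the
two loops on a particular machine is a separate experiment.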
>> But it's ... interesting ... to communicate to a C compiler that the
>> second optimisation is valid. If you did:
>
> Quite. The compiler can't always know that there's no data dependency
> inside the loop. The current plan for autovect-branch is to control
> the vectorizer with a #pragma (which is apparently how icc does it).
Eeew. I can't help feeling that there must be a better way :(
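
For the pointer case, C99 already offers one way to make this promise
in the source without a pragma: the restrict qualifier. The sketch below
is only an illustration under that assumption. add_into is a hypothetical
helper, not something from the thread, and whether any particular compiler
version actually vectorizes the loop is not claimed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper. restrict is the caller's promise that dst and
 * src do not overlap during the call, so the compiler may assume there
 * is no data dependency between the loads from src and the stores to
 * dst. Passing overlapping arrays is undefined behaviour. */
void add_into(float *restrict dst, const float *restrict src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```

The promise lives in the function signature rather than next to the loop,
which is the main difference from a per-loop #pragma.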
--
This is like system("/usr/funky/bin/perl -e 'exec sleep 1'");
--- Peter da Silva
--
Gllug mailing list - Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug