[Gllug] Patterns of locality of reference in xterm
nix at esperi.org.uk
Mon Jun 20 12:26:48 UTC 2011
(<adrian at newgolddream.dyndns.info> bounces: removed from Cc.)
On 20 Jun 2011, James Courtier-Dutton spake thusly:
> For CPU performance, most people just throw hardware at the problem.
> i.e. 1 Server is not quick enough, lets just have a cluster instead or
> use the Cloud to have as many CPUs as we need, when we need them.
> People now work on implementing algorithms so that they can be run on
> multiple CPUs/threads at the same time, so that they scale well as the
> number of CPUs increases, rather than implementing algorithms that work
> well on one CPU. So, as people move to optimize for a multi-CPU
> environment, not so much effort is being spent on single-CPU optimization.
GCC's saved from that, mostly, because 'make -j' does the parallelism,
but it did need to gain the ability to parallelize link-time
optimization.
Reducing programs' dcache footprint will improve their performance
regardless of how multithreaded they are, not least because many CPUs
now have caches shared across cores.
> Even I was surprised how much quicker my optimizations made it,
First optimizations are always fun. There's so much low-hanging fruit :)
> Regarding code localization optimizations, one has to somehow work out
> what the cost of an extra JUMP instruction is against the cost of a
> cache miss.
You may also need to consider the (huge) cost of a branch misprediction
and pipeline stall.
> I.e. To get a bunch of code together in memory that is executed
> in a loop, one might need to add a JUMP between the loops so that you
> can fit the code used by the loop into the local area.
> I don't think there is anything in GCC that even attempts this apart
> from the "inline" option.
Inlining, if anything, does the opposite. You probably want
-freorder-functions and -freorder-blocks-and-partition, though both of
these require profile feedback, so they are relatively rarely used.
NULL && (void)
Gllug mailing list - Gllug at gllug.org.uk