<div>I found this at the weekend, it's quite interesting:</div>
<div> </div>
<div><a href="http://tech.slashdot.org/comments.pl?sid=1277771&cid=28433901">http://tech.slashdot.org/comments.pl?sid=1277771&cid=28433901</a></div>
<div> </div>
<div>To quote:</div>
<div> </div>
<div>
<div id="comment_body_28433901">Mozilla does comparative performance testing for the best GCC compiler flags constantly. There are several reasons why our Linux builds are slower than Windows:
<ol>
<li>The Windows ABI is cheaper: every relocated symbol in Linux is resolved at runtime by loading the PIC register and going a GOT lookup. Windows avoids PIC code by loading the code at a "known" address and relocating it at startup only if it conflicts with another DLL.</li>
<li>Mozilla code runs fastest when 99% of it is compiled for space savings, not "speed". Because of the sheer amount of code involved in a web browser, most of the code will be "cold". Tests have shown that at least on x86, processor caches perform much better if we compile 99% of our code optimizing for codesize and not raw execution time: this is very different than most compiler benchmarks. The MSVC profile-guided optimization system allows us to optimize that important 1% at very high optimization levels; the GCC profile-guided optimization system only really works within the confines of a particular optimization level such as -Os or -O3. In many cases using PGO with Linux produced much *worse* code!</li>
<li>The GCC register allocator sucks, at least on register-starved x86: we've examined many cases where GCC does loads and saves that are entirely unnecessary, thus causing slowdowns.</li></ol>
<p>Believe me, we'd really love to make Linux perform as well as Windows! We spent a lot of time in Firefox 3 with libxul reducing startup time by making symbols hidden and reducing the number of runtime relocations...</p>
</div></div>