Expressive programming language? Was: [Gllug] Speaker Suggestion

Fri Jan 9 22:58:24 UTC 2004

On Fri, 9 Jan 2004, Doug Winter yowled:
> http://www.computer.org/computer/thompson.htm
>> In addition, the language implementation - and again I don't want to
>> take any credit - doesn't have big mark-and-sweep type garbage
>> collection. It has reference counting: If you open something and then
>> return, it's gone by reference count.
[snip]
>> So, again, it's pragmatic. It's not the theoretical top-of-the-line
>> garbage collection paper. It's just a way of doing it that seems to be
>> very, very effective.
> 
> I vote for simple GC every time.  The Java guys can go back to writing
> papers for each other - which they must do far better than writing
> operational software.

Simple? I'll say. I really hope the author (not Ken, as he's careful to
say) had actually considered other GC methods. Unadorned ark and sweep
and reference counting are horrible stone-age GC algorithms; indeed
the latter, while indeed useful, barely merits the name GC at all,
as it misses bits without extra code to detect cycles. It's a component
of a GC algorithm, but certainly not a GC algorithm in itself.

If you're writing a language implementation and so can keep track of
type information at execution time, and particularly if you have any
kind of decent global optimizer, a decent type-accurate GC system using
liveness info from that can *immediately* deallocate a lot of storage
*without needing reference counts*, and rely on more complex GC only for
data structures with more complex allocation patterns. You can't beat
O(1).

The `more complex algorithm' should almost certainly not be mark+sweep
collection, and probably not refcounting either. I'm partial to
incremental generational techniques, not least because there are ways to
do generational GC which avoid copying, so you don't even smash the
dcache the way naive generational GCs or mark+sweep do.

The last time I wrote an allocator, for a language without
type-accuracy, I used a double-collector: the allocator allocated
objects of very similar sizes in slabs, and ran Baker's treadmill on
each slab, spinning the treadmill for slabs with more items on more
often than that for ones with fewer. Most programs have huge numbers of
a very few objects, so this turned out to be much more effective than
I'd expected.

A simpler variant of this is discussed in brief in Jones & Lins, s8.8 (a
damned good book if you're interested in GC algorithms).

Combining this with type-accuracy and flow feedback from the compiler
would keep the slabs quite small to boot.

-- 
As they say, build a better mousetrap and the world will beat a
path to your door. But nobody ever got anywhere outlawing mice.
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug