[Sussex] Re: C programming help - again!

Geoffrey J. Teale tealeg at member.fsf.org
Wed May 11 04:57:36 UTC 2005


On Tue, 2005-05-10 at 20:17 +0100, Steve Dobson wrote:
> There is an old rule in software programming: 90% of the
> work takes 10% of the time.  If you feel you're "almost
> there" then you probably think you've done 90% of the work.
> So now you can guesstimate how long it will take to do the
> remaining 10%.

From the Jargon File:

Ninety-Ninety Rule n. 

"The first 90% of the code accounts for the first 90% of the development
time. The remaining 10% of the code accounts for the other 90% of the
development time." 

Attributed to Tom Cargill of Bell Labs, and popularized by Jon Bentley's
September 1985 "Bumper-Sticker Computer Science" column in
"Communications of the ACM". It was there called the "Rule of
Credibility", a name which seems not to have stuck. Other maxims in the
same vein include the law attributed to the early British computer
scientist Douglas Hartree: 

"The time from now until the completion of the project tends to become
constant."


:-)

BTW, Redbeard - don't be too disheartened by the number of problems
that Valgrind finds - I recently ran it over some code that had been in
use for several years and it found close to 400 problems, none of which
were the issue I was trying to solve!
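
To make that concrete, here's a toy example of the kind of thing
Valgrind will flag (the file name and strings are just mine for
illustration, nothing from your code):

/* leak.c - build with "gcc -g -o leak leak.c", then run
   "valgrind --leak-check=full ./leak" and it will report the
   32 bytes below as "definitely lost". */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *buf = malloc(32);   /* allocated...               */
    if (buf == NULL)
        return 1;
    strcpy(buf, "hello");
    printf("%s\n", buf);
    return 0;                 /* ...but never freed - a leak */
}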

There is real value in getting a decent understanding of memory
allocation, reallocation and freeing, so don't be discouraged - it's a
fight worth fighting.
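
If it helps, here's a minimal sketch of the allocate/grow/free cycle
(my own toy code again), including the classic realloc trap that
catches most of us at least once:

/* grow.c - the basic allocate / grow / free cycle. */
#include <stdlib.h>

int main(void)
{
    size_t n = 16;
    int *buf = malloc(n * sizeof *buf);
    if (buf == NULL)
        return 1;

    /* Grow via a temporary pointer: if realloc fails it returns
       NULL but leaves the original block allocated, so assigning
       straight back to buf would leak it. */
    int *tmp = realloc(buf, 2 * n * sizeof *buf);
    if (tmp == NULL) {
        free(buf);            /* the old block is still ours to free */
        return 1;
    }
    buf = tmp;

    free(buf);                /* every allocation gets exactly one free */
    return 0;
}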

Now there are those people who would argue that in these days of
languages like Java, Perl and Python we don't need to think about, let
alone understand, this kind of stuff.  Those people are tragically
wrong.

Firstly, people who don't think about memory tend to end up using
rather a lot of it.  In a virtual memory environment like your average
Linux desktop or server this may not seem like much of an issue, but it
has performance impacts, and even VM has its limits.

More important than that, though, is that blithe reliance on a garbage
collection system can cause real problems.  I've spent a lot of time
recently trying to track down a memory leak in a Python program (yes,
you heard me correctly).  It's not a small leak either.

Crawling through the code checking that all references are removed,
tweaking the GC attributes and calling gc.collect() in all manner of
strategically chosen places had almost no impact on the leak (I maybe
freed 0.2MB of a roughly 60MB leak!).

I started theorising about the interaction of Python and Gtk (both have
their own memory management implementations that might not play well
together), but that route proved fruitless.

... and then I came across this thread on the python developers mailing
list:

http://mail.python.org/pipermail/python-dev/2004-October/049480.html

If you can't be bothered to read it I'll summarise it simply: Python
_never_ frees memory used by objects.  I'll say that again just so it
sinks in: Python never frees object memory (!!!!!!!).

The reasons are obscure (principally to do with the pymalloc code not
always holding the global interpreter lock, and possible problems
interacting with multi-threaded systems) but the effect is important.
If you write a Python program that you expect to run for a long time
then its total allocated memory will reach a peak and pretty much stay
there, no matter what you do.  Combine that with a memory reuse policy
that can result in a fair bit of fragmentation and it's
swap-till-you-drop time.
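
To illustrate the effect in C terms (this is a deliberately crude
sketch of the behaviour described above, emphatically not pymalloc's
actual code - the names are mine and it even skips block reuse),
imagine an allocator built like this:

/* ratchet.c - an allocator that takes memory from the OS in big
   arenas and never gives any back.  It ignores alignment and block
   reuse - it's an illustration, not a real allocator. */
#include <stdlib.h>

#define ARENA_SIZE (256 * 1024)

static char  *arena = NULL;   /* current arena            */
static size_t used  = 0;      /* bytes handed out from it */

void *ratchet_alloc(size_t size)
{
    if (arena == NULL || used + size > ARENA_SIZE) {
        arena = malloc(ARENA_SIZE);  /* a full arena is simply */
        used  = 0;                   /* abandoned, never freed */
        if (arena == NULL)
            return NULL;
    }
    void *p = arena + used;
    used += size;
    return p;
}

void ratchet_free(void *p)
{
    /* A no-op: the memory stays with the process and the OS never
       sees it again, so the total footprint can only ever grow. */
    (void)p;
}

int main(void)
{
    int i;
    for (i = 0; i < 1000000; i++)
        ratchet_free(ratchet_alloc(64));   /* watch it swell */
    return 0;
}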

There is a patch under development to try and fix this problem in
Python 2.5 (or at least alleviate it a little), but it's a long way
from being accepted, and it seems the Python developers haven't viewed
this as a priority because they think of Python as a "scripting"
language - i.e. they don't expect people to write long-running
processes in it.




-- 
Geoffrey J. Teale <tealeg at member.fsf.org>
Free Software Foundation




