[Gllug] Odd (but apparently correct) behaviour from du

Nix nix at esperi.org.uk
Thu Jul 17 08:43:23 UTC 2008


On 16 Jul 2008, John Winters told this:
> The results which came back were startlingly variable.  Instead of a
> consistent 30G or so (slowly increasing) in each snapshot I got wildly
> varying values.  One night's snapshot would be 31G and the next night's
> apparently 2G.  Looking at them manually however they all seemed to be
> complete.

That seems very strange.

du's process_file() handles hard links by hashing the (inode, dev) pair
of every file it finds with a link count >1, and then not accumulating
sizes or names for any inode it has already seen (unless --count-links
is active).
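
To make that concrete, here is a rough Python sketch of the same idea. It
is not du's actual code: the function name du(), the `seen` parameter and
the use of st_size (apparent size) instead of allocated blocks are all my
own simplifications for illustration.

```python
import os
import stat

def du(root, seen=None, count_links=False):
    """Total apparent size in bytes of the regular files under root.

    `seen` holds the (st_dev, st_ino) pairs already counted. Passing the
    same set to several calls mimics a single run over several arguments.
    """
    if seen is None:
        seen = set()
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, devices and so on
            if st.st_nlink > 1 and not count_links:
                key = (st.st_dev, st.st_ino)
                if key in seen:
                    continue  # this inode has been charged already
                seen.add(key)
            total += st.st_size
    return total
```

With a shared `seen` set, a file hard-linked into two directories is
charged to whichever directory is walked first and contributes nothing to
the second one. With count_links=True it is charged every time it turns up.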

Therefore you shouldn't see *varying* output unless your filesystem
is returning readdir() results in a different order every time du runs
(which is possible but really rather unlikely).

> I got the right answer.  Now hands up all those who knew that du behaved
> this way.

Er, I thought it was obvious that it had to do something like this from
the moment I first met du. I don't see how else it could possibly give
the right totals in the presence of hard links.

(Now I think about it I suppose it could accumulate sizes including all
hard links for subdirs and then eliminate the duplicates when the
parent's size is shown... but finding the relevant duplicates without
making du scale as O(nm^2) in the number of hard links and non-leaf
directories might be tricky.)
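
For what it's worth, one naive way to get that behaviour is to have every
directory hand a map of its inodes up to its parent, so duplicates are
only collapsed where the maps are merged. This is purely a sketch under my
own assumptions (the name du_tree() is made up, sizes are st_size, POSIX
inode numbers are assumed), and it buys simplicity with memory: every
level holds the inode map for its whole subtree.

```python
import os

def du_tree(root):
    """Return {directory: bytes}, where each directory's figure counts
    every inode below it exactly once, independently of its siblings."""
    report = {}

    def visit(path):
        inodes = {}  # (st_dev, st_ino) -> size for everything below path
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    # Merging the child's map is where duplicates vanish.
                    inodes.update(visit(entry.path))
                elif entry.is_file(follow_symlinks=False):
                    st = entry.stat(follow_symlinks=False)
                    inodes[(st.st_dev, st.st_ino)] = st.st_size
        report[path] = sum(inodes.values())
        return inodes

    visit(root)
    return report
```

Here two sibling directories sharing a hard-linked file each report its
full size, while their parent still reports it only once.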
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



