[Gllug] Odd (but apparently correct) behaviour from du
C. Cooke
ccooke-gllug at gkhs.net
Thu Jul 17 09:14:02 UTC 2008
On Thu, Jul 17, 2008 at 10:09:33AM +0100, John Winters wrote:
> Nix wrote:
> > On 16 Jul 2008, John Winters told this:
> >> The results which came back were startlingly variable. Instead of a
> >> consistent 30G or so (slowly increasing) in each snapshot I got wildly
> >> varying values. One night's snapshot would be 31G and the next night's
> >> apparently 2G. Looking at them manually however they all seemed to be
> >> complete.
> >
> > That seems very strange.
> >
> > du's process_file() handles hard links by hashing every (inode, dev) it
> > finds for inodes with a link count >1, then not accumulating sizes or
> > names for inodes it's seen already (unless --count-links is active).
> >
> > Therefore you shouldn't see *varying* output unless your filesystem
> > is returning readdir() results in a different order every time du runs
> > (which is possible but really rather unlikely).
>
> Possible misconception here. When I said "varying" I didn't mean
> "varying from run to run" - just "varying from snapshot to snapshot".
> If you run it again you get the same result.
On the other hand, it's doing exactly the right thing here - it's
telling you how much extra space each snapshot takes on top of the ones
listed before it. After all, using hard links means you're taking
incremental backups; it's expected that subsequent snapshots of a
largely unchanged tree will be smaller.
> >
> >> I got the right answer. Now hands up all those who knew that du behaved
> >> this way.
> >
> > Er, I thought it was obvious that it had to do something like this from
> > the moment I first met du. I don't see how else it could possibly give
> > the right totals in the presence of hard links.
>
> I suppose it depends what you're expecting it to do. When I invoked it
> with a list of directories I rather expected it to tell me the disc
> usage for each of those directories separately. The author on the other
> hand clearly expected it to provide values which could subsequently be
> added together to give a correct total for all the directories together.
> Once you're working towards the latter objective then clearly it does
> need to work in the way which it does.
>
du is supposed to report disk usage; if it could, by default, tell you
that you had used more disk space than you *have*, that would quite
rightly be considered a bug.
If it helps, you might like the -l (--count-links) switch, which tells
du to count hard-linked files multiple times.
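
To make the difference concrete, here's a rough Python sketch of the
bookkeeping Nix described. It is not du's actual code, just an
illustration under a few assumptions: it sums st_size rather than
allocated blocks, ignores the directories' own sizes, and the names
du_bytes() and demo() are made up for this example.

```python
import os
import tempfile


def du_bytes(roots, count_links=False):
    """Return {root: bytes}, remembering (st_dev, st_ino) across all roots.

    Simplification: sums st_size of files only, whereas the real du
    counts allocated blocks and includes directories.
    """
    seen = set()  # (device, inode) pairs already counted
    totals = {}
    for root in roots:
        total = 0
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                st = os.lstat(os.path.join(dirpath, name))
                # Only multiply-linked inodes need remembering.
                if not count_links and st.st_nlink > 1:
                    key = (st.st_dev, st.st_ino)
                    if key in seen:
                        continue  # already charged to an earlier path
                    seen.add(key)
                total += st.st_size
        totals[root] = total
    return totals


def demo():
    """Build two tiny 'snapshots' sharing one hard-linked file."""
    with tempfile.TemporaryDirectory() as tmp:
        snap1 = os.path.join(tmp, "snap1")
        snap2 = os.path.join(tmp, "snap2")
        os.mkdir(snap1)
        os.mkdir(snap2)
        with open(os.path.join(snap1, "big"), "wb") as f:
            f.write(b"x" * 1000)
        # Unchanged file: hard-linked into the second snapshot.
        os.link(os.path.join(snap1, "big"), os.path.join(snap2, "big"))
        with open(os.path.join(snap2, "new"), "wb") as f:
            f.write(b"y" * 10)

        def by_name(totals):
            return {os.path.basename(k): v for k, v in totals.items()}

        return (
            by_name(du_bytes([snap1, snap2])),
            by_name(du_bytes([snap2, snap1])),
            by_name(du_bytes([snap1, snap2], count_links=True)),
        )


if __name__ == "__main__":
    for result in demo():
        print(result)
```

Whichever snapshot is listed first gets charged for the shared file, and
the later one only for what is new; with count_links=True (the analogue
of -l) each snapshot is charged for everything it contains.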
--
d=(1 0 6 0 1 0 5 5 41 5 3 12 4 5 15 1 4 -2 5 5 0 5 4 24 3 5 27 1 3 -2 1 3 6)
a=0;while :;do ((v=(c=a)+3));((x=d[d[a]]-d[d[a+1]]));d[d[a]]=$x;((a=d[d[a]]\
<0?${d[a+2]}:v));case $a in -1)read d[d[c]];a=$v;;-2)echo ${d[d[c+1]]};a=$v\
;;0)exit;;esac;done 2>&- # Charles Cooke, Sysadmin.