[Gllug] Difference between "buffers" and "cached"?

Nix nix at esperi.org.uk
Fri Jan 27 11:41:43 UTC 2006


On Fri, 27 Jan 2006, Daniel P. Berrange moaned:
> Buffers basically refers to the kernel's caching of writes to block
> devices.

In effect it is a cache keyed by (dev, block), so *anything* which does
not have an associated memory page lands in here.

In particular, filesystem metadata stays in the buffer cache.

Most dirty pages never go near it, AIUI: the dirty pages in the page
cache get flushed directly to the block device when a pdflush thread
gets around to it.
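
If you want to watch the two numbers on your own box, they are the
`Buffers' and `Cached' lines in /proc/meminfo. A minimal sketch of pulling
them out follows; the function names are mine and purely illustrative, and
it assumes the usual `Name:   value kB' layout of that file.

```python
import os


def parse_meminfo(text):
    """Return {field: value} for every 'Name:   123 kB' line in meminfo-style text."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Skip anything that is not 'Name: <number> [unit]'.
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def buffers_and_cached(path="/proc/meminfo"):
    """Return (Buffers, Cached) in kB, or None for a field that is missing."""
    with open(path) as f:
        info = parse_meminfo(f.read())
    return info.get("Buffers"), info.get("Cached")


# Only touch /proc when it is actually there (i.e. on Linux).
if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    buffers, cached = buffers_and_cached()
    print("Buffers: {} kB, Cached: {} kB".format(buffers, cached))
```

Run it before and after reading a large file and `Cached' should be the
one that moves.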

> Buffers are in the block device layer of the kernel; the cache meanwhile
> is higher up in the filesystem (VFS) layer, and refers to caching of the
> contents & metadata of files in RAM, so that read requests can be satisfied
> immediately rather than having to pull the data in from disk every time.

Semi-sort-of. *Anything* which has an inode number and is or was
recently needed in RAM is kept in here, which is a lot more than you
might expect: not just open files, but files that were open that haven't
been discarded yet, mmap()ed files, anonymous mappings, files in tmpfs,
even ridiculous cases like a file which is `open' only in that an open
handle to it is in transit over a unix-domain socket to another process
(this can of course be recursive in that that socket is only open
because *it's* in transit over another socket to another process).

Notably, because mmap()ed stuff is kept in there, all executables run
out of the page cache. So space spent on page cache is not even slightly
`wasted', and you can expect to see a good bit of space used on it even
on very low-memory machines.
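
One way to see the mmap()ed-executable point for yourself is to look at
/proc/<pid>/maps and pick out the mappings that are backed by a file, i.e.
those with a nonzero inode. The sketch below assumes the usual
`address perms offset dev inode pathname' layout; the function name is
hypothetical.

```python
def file_backed_mappings(maps_text):
    """Return the distinct pathnames of mappings that have a nonzero inode."""
    paths = []
    for line in maps_text.splitlines():
        # address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # no pathname at all: an anonymous mapping
        inode, path = fields[4], fields[5].strip()
        if inode != "0" and path not in paths:
            paths.append(path)
    return paths


if __name__ == "__main__":
    try:
        with open("/proc/self/maps") as f:
            for path in file_backed_mappings(f.read()):
                print(path)
    except (IOError, OSError):
        pass  # no /proc here, nothing to show
```

Run against any process, this should list the binary itself plus every
shared library it has mapped.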

>                            The cache data will be kept around pretty much
> forever, until an application runs which wants memory for its own use 
> at which point the kernel will release some of the cache.

Well, the kernel tries to balance page cache, buffer cache, memory
available for userspace and memory for more arcane things like the inode
cache and so on. It's a vast balancing act which is impossible to get
right without foretelling the future, and works much more often than I'd
expect. (2.6's memory balancing really is vastly better than in 2.4; I can
have some processes thrashing the system to death yet everything else is
still responsive. I think this may be due to the swap token stuff...)

>                                                        Basically under
> normal operation you should expect the kernel to use pretty much all your
> free RAM for buffers & cache, only releasing it when an application needs
> it for 'real work'

A few percent will be kept free for atomic allocations (you can't always
fall asleep waiting for someone to swap something out: sometimes you
need memory *right now*).

> There are all sorts of thresholds & tunables which control how aggressively
> the kernel allocs & releases caches, to change the balance between good
> throughput and good interactivity. If you have a 2.6 kernel (and particularly
> if it's RHEL-4), then Neil Horman has written a very good document talking
> about how the VM works & how you can tune it
> 
> http://people.redhat.com/nhorman/papers/papers.html

What good papers.

The RHEL4 one will need updating at some point for RHEL5: it describes
3-level page tables rather than 4, the DMA32 zone, the swap token and
eventually CLOCK-PRO, and so on. And of course some of the things
described have never got into the mainstream kernel, e.g. kscand.

But it's a good introduction.
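
Before tuning anything it helps to know what your kernel is currently
set to. The tunables live as small text files under /proc/sys/vm; a rough
sketch of dumping a handful of them is below. The selection of names is my
own (min_free_kbytes being the knob behind the reserve kept for atomic
allocations), and anything your kernel lacks simply comes back as None.

```python
import os

VM_DIR = "/proc/sys/vm"

# A hand-picked selection, not an exhaustive list.
TUNABLES = (
    "swappiness",
    "dirty_ratio",
    "dirty_background_ratio",
    "min_free_kbytes",
    "vfs_cache_pressure",
)


def read_tunables(names=TUNABLES, vm_dir=VM_DIR):
    """Return {name: value-as-string}, with None for entries that can't be read."""
    values = {}
    for name in names:
        try:
            with open(os.path.join(vm_dir, name)) as f:
                values[name] = f.read().strip()
        except (IOError, OSError):
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in sorted(read_tunables().items()):
        print("{:<24} {}".format(name, value))
```

This only reads; changing a value means writing to the same file as root,
or using sysctl.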

> [1] If the app writes huge amounts of data, it can eventually get blocked
>     during writes when the kernel runs out of room for buffers and has to
>     actually write some out immediately.

In 2.4, if you're writing lots of data to a slow block device (e.g. a
packet-written CD-RW), then I/O to other block devices is backed up
behind it. The result is frequent apparent freezes when doing such
writes, because the system's trying to swap a page out or read a
page in, and it can't.

In 2.6, each block device has its own request queue, so this problem
dissolves.
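
To make the difference concrete, here is a toy model of the contrast as
described above. It is emphatically not how the real block layer or its
I/O schedulers work: the device names and service times are made up, and
each request is assumed to take a fixed time on its device.

```python
def finish_times_shared(requests, service_time):
    """One FIFO for everything: each request waits for all requests ahead of it."""
    clock = 0
    finished = []
    for device in requests:
        clock += service_time[device]
        finished.append((device, clock))
    return finished


def finish_times_per_device(requests, service_time):
    """One FIFO per device: a request waits only for earlier ones on the same device."""
    clocks = {}
    finished = []
    for device in requests:
        clocks[device] = clocks.get(device, 0) + service_time[device]
        finished.append((device, clocks[device]))
    return finished


if __name__ == "__main__":
    # Hypothetical numbers: the CD-RW is 50 times slower than the disk.
    service_time = {"cdrw": 50, "disk": 1}
    requests = ["cdrw", "cdrw", "cdrw", "disk"]
    print("shared:    ", finish_times_shared(requests, service_time))
    print("per-device:", finish_times_per_device(requests, service_time))
```

With these numbers the lone disk request completes at time 151 behind the
shared queue, but at time 1 when the disk has a queue to itself, which is
the swap-in that no longer has to wait for the CD writer.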

-- 
`I won't make a secret of the fact that your statement/question
 sent a wave of shock and horror through us.' --- David Anderson
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



