[Nottingham] I knew I'd forget something last night (filesystems and partitions)

Martin martin at ml1.co.uk
Sat May 23 00:52:11 UTC 2020

On 22/05/2020 15:53, Andy Smith via Nottingham wrote:
> Hello,
> On Fri, May 22, 2020 at 02:03:54AM +0100, Martin via Nottingham wrote:
>> Good luck!
> Apart from having to remount (I chose actually to reboot) it's going
> okay so far…

> May 22 12:20:57 specialbrew kernel: [22232153.867485] ata6.06: ATA-11: Samsung SSD 860 QVO 2TB, RVQ01B6Q, max UDMA/133
> May 22 12:20:57 specialbrew kernel: [22232153.871817] ata6.06: 3907029168 sectors, multi 1: LBA48 NCQ (depth 31/32)

> (After reboot this became sdf; the faulty device appeared again as sdj)
> $ sudo btrfs replace start -B -r /dev/sdj /dev/disk/by-id/ata-Samsung_SSD_860_QVO_2TB_serialnohere /srv/tank
> $ sudo sh -c 'while true; do btrfs replace status -1 /srv/tank; sleep 600; done' | ts

> May 22 14:45:48 23.1% done, 0 write errs, 0 uncorr. read errs

Good stuff!

> If I keep it mounted degraded all the time, does that mean I can
> remove/replace devices at will?


The only gotchas are:

When removing a btrfs raid disk, take care that there is enough space on
the remaining disks to rebalance all the data from that (to be) removed
disk (especially so if you are using unequal sized disks);

For any disk changes, once the system has settled it is always a good
idea to manually start a scrub and a rebalance;

Allow enough time for any rebalance/remove!
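
As a rough sketch of those gotchas (not a tested recipe): the mount point
/srv/tank is taken from Andy's commands above, /dev/sdX is only a
placeholder for the disk to be removed, and the function names are made
up for illustration. The space check is deliberately crude; it ignores
the raid profile and how the free space is spread across the remaining
disks.

```shell
#!/bin/sh
# Sketch only. Function names and /dev/sdX are illustrative assumptions.

# Succeeds only if the bytes allocated on the outgoing disk fit into the
# unallocated bytes on the remaining disks. Take both figures by hand
# from the 'btrfs device usage -b' output.
fits_on_remaining() {
    used_on_device=$1
    free_on_others=$2
    [ "$used_on_device" -le "$free_on_others" ]
}

# Nothing below runs until the function is called (as root).
remove_and_tidy() {
    mnt=$1
    dev=$2
    # Show overall and per-device allocation in raw bytes first.
    btrfs filesystem usage -b "$mnt"
    btrfs device usage -b "$mnt"
    # Migrates the data off the disk, then drops it. Can take hours.
    btrfs device remove "$dev" "$mnt"
    # Verify all checksums; -B keeps the scrub in the foreground.
    btrfs scrub start -B "$mnt"
    # Rewrite every chunk so data is spread over the remaining disks.
    btrfs balance start --full-balance "$mnt"
}
```

For example: fits_on_remaining 1500000000000 1800000000000 && remove_and_tidy /srv/tank /dev/sdX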


Obvious reminder, *first* always have an up-to-date tested backup!

> Other storage innovations going on:
> Red Hat's Stratis: https://stratis-storage.github.io/StratisSoftwareDesign.pdf
> This uses XFS and device mapper underneath but a userland daemon to
> manage it all so you only interact with that. Will support
> encryption, integrity (checksums), cache tier, snapshots, device
> add/remove, and typical volume management. Though much of that is
> "in a later version".
> mdraid on LUKS on dm-integrity:
> https://gist.github.com/MawKKe/caa2bbf7edcc072129d73b61ae7815fb
> Integrity and encryption.
> Or use LVM on top of (or instead of) md to get full volume management, snapshots etc.
> You could add a cache tier with dm-cache.
> All of this is basically what Stratis is doing, but you do it by
> hand using the lower level tools.

Thanks for those. Sounds like I *was* a 'hyper-Stratis' in my LVM days!


Tested bcache back in the day but didn't get enough of an improvement to
be worth adding an extra point of failure. Since then, bcachefs has been
developing steadily...

Still, a very nice idea to wrap up managing all the stack layers in a
user-friendly tool. A little too late for my usage!

That stack is good if encryption is needed.

LVM is still very good, but much of its functionality has now moved
directly into the newer filesystems. For example, LVM /snapshots/ were
excellent back in their day. However, today you get atomic snapshots
with btrfs inherently 'for free' thanks to the CoW (RoW) design, whereas
with LVM you have to reserve a 'big enough' LV as snapshot space to hold
a brute force copy of the original data for every modified data
block... Indeed, with btrfs, the problem is more the tendency/laziness
of distros to automagically pile up bewildering numbers of snapshots,
just because 'you can'!
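
To make that comparison concrete, here is a hedged sketch. The volume
group vg0, the LV home, the 10G reservation, the .snapshots directory
and all the function names are assumptions for illustration only. The
last helper is one simple way to keep snapshot counts in check, assuming
the snapshots are named by ISO date.

```shell
#!/bin/sh
# Sketch only. vg0/home, 10G and the function names are hypothetical.

# LVM: space for the snapshot must be reserved up front. If more than
# that gets modified on the origin, the snapshot becomes invalid.
lvm_snapshot() {
    lvcreate --snapshot --size 10G --name home_snap /dev/vg0/home
}

# btrfs: a read-only snapshot of a subvolume, no space reserved.
# $2 must be an existing directory on the same filesystem.
btrfs_snapshot() {
    src=$1
    dest_dir=$2
    btrfs subvolume snapshot -r "$src" "$dest_dir/$(date +%Y-%m-%d)"
}

# Reads snapshot names (one per line, ISO dates) on stdin and prints
# all but the newest $1 of them, i.e. the candidates for deletion.
prune_candidates() {
    keep=$1
    sort | awk -v keep="$keep" '
        { name[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print name[i] }'
}
```

For example: ls /srv/tank/.snapshots | prune_candidates 7
would print the names to hand to 'btrfs subvolume delete' after a look.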

Initially I was leery of amalgamating raid, disk management, partition
management, snapshots and backup syncing all within the one filesystem
layer. At the time, we very nicely had clean, well layered, discrete,
bombproof implementations: disk partitions, mdraid for the raid, LVM for
disk management and snapshots (there's also LVM raid...), and a plethora
of userspace utilities for any type of backup you might want. So why
reinvent all that in btrfs?...

Following the development ideas, I was soon won over to CoW and the new
greater holistic functionality, and the performance and flexibility
gains that came with that.

(Nope, special note, that is most definitely NOT an endorsement of the
fragile monolithic poorly implemented bloated overcomplication that is
systemd... ;-) )

> I'd be really interested to see benchmarks of this versus just
> LVM-on-MD, with and without LUKS, and against ZFS.

That just has to have already been benchmarked on Phoronix! ;-)

Well... Nearly... Not your full stack but see:

Benchmarking The Experimental Bcachefs File-System Against Btrfs, EXT4,

Also note the recent news:

Oracle Talks Up Btrfs Rather Than ZFS For Their Unbreakable Enterprise
Kernel 6

(OK... flame-fest warning, just ignore the name Oracle!...)

Hopefully some of that is of interest.

Good luck!

CoW: Copy on Write


Broken by design: systemd

The Verdict On systemd Is In

No systemd
