[Wolves] Storage advice.

Chris Ellis chris at intrbiz.com
Mon Jan 15 23:46:39 UTC 2024


On Mon, Jan 15, 2024 at 9:31 PM Simon Burke via Wolves
<wolves at mailman.lug.org.uk> wrote:
>
> Hi,
>
> Odd position I'm in at day job.
>
> I have a VM (VMWare) running Oracle Linux 9, with two 50tb VMDKs attached.
>
> The issue is that I need to span them, and it'll store a lot of tiny files (circa 80Tb in 2-3mb files), with a high daily rate of change.
>
> Choosing the right method to achieve this is proving a challenge, when I've got to get this going this week. So testing is limited.

What is proving challenging?

>
> I can't have a single large VMDK due to a 62Tb limit in VMWare.

Can you go around that and just iSCSI direct into the VM?

>
> I can just use ext4 and LVM, or Oracle Linux likes Btrfs. ZFS is provisionally out unless I get a third 50Tb disk, for ZRAID+1.

How come XFS is not an option?

IMHO it would likely be the best fit for this situation: great
performance, especially for parallel allocations, metadata split into
allocation groups, and good handling of small files.  In my experience
it is very robust, has good recovery options, and there is no need for
periodic fsck.

Also XFS has really good trim and debug tracing support and you can
align the allocation groups to the underlying RAID stripes.
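
For a sense of scale, here is a rough back-of-envelope sketch of the
file count implied by the figures quoted above (about 80 TB in 2-3 MB
files).  It assumes decimal units and uniform file sizes, so treat the
result as an order of magnitude, not a measurement.

```python
# Back-of-envelope file count for ~80 TB stored as 2-3 MB files.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes).
TB = 10**12
MB = 10**6


def file_count(total_bytes: int, file_bytes: int) -> int:
    """Approximate number of files of one size that fit in a capacity."""
    return total_bytes // file_bytes


total = 80 * TB
low = file_count(total, 3 * MB)   # larger files -> fewer files
high = file_count(total, 2 * MB)  # smaller files -> more files

print(f"roughly {low / 1e6:.0f} to {high / 1e6:.0f} million files")
```

That is tens of millions of inodes, which is why metadata handling
matters as much as raw throughput here.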

>
> Opinions?

I assume the VMDKs are backed onto a SAN or similar?

In which case they are essentially within the same failure domain,
therefore I would:
  a) Just stripe them in RAID 0 with MDRAID
  b) Just use LVM to do a PV span
  c) Use BTRFS's built-in RAID and map both devices into a single
     BTRFS filesystem

If you don't really trust the virtual disks, I'd suggest you attach
more, smaller volumes and run RAID 5 across them inside the VM.

We used to run our databases on AWS ephemeral disks, all on XFS +
MDRAID 0, treating the VM as a single failure domain.

BTRFS does at least have checksumming, which can be helpful to detect
errors.  However, running out of space is a PITA, and recovering from
data corruption (caused by bad RAM) has been 50-50 for me.

If performance is your main problem, I'd go XFS + MDRAID0, increase
the number of allocation groups and stripe align them.
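
As a minimal sketch of what that could look like, the snippet below
only builds and prints the two command lines; it does not run them.
The device names (/dev/sdb, /dev/sdc, /dev/md0), the 512 KiB chunk size
and the agcount of 128 are illustrative assumptions, not tested
recommendations.  For RAID 0 the XFS stripe unit (su) matches the MD
chunk size and the stripe width (sw) is the number of member disks.

```python
import shlex

# All values below are hypothetical placeholders for illustration.
MD_DEVICE = "/dev/md0"
MEMBERS = ["/dev/sdb", "/dev/sdc"]
CHUNK_KIB = 512   # MD chunk size in KiB (assumed, tune for the backend)
AGCOUNT = 128     # number of XFS allocation groups (assumed)


def mdraid0_cmd(md_dev, members, chunk_kib):
    """Build an mdadm command line that stripes the members as RAID 0."""
    return [
        "mdadm", "--create", md_dev,
        "--level=0",
        f"--raid-devices={len(members)}",
        f"--chunk={chunk_kib}",
        *members,
    ]


def mkfs_xfs_cmd(md_dev, n_members, chunk_kib, agcount):
    """Build a mkfs.xfs command line aligned to the RAID 0 geometry."""
    # su = stripe unit (one chunk), sw = number of data disks in the stripe
    data_opts = f"su={chunk_kib}k,sw={n_members},agcount={agcount}"
    return ["mkfs.xfs", "-d", data_opts, md_dev]


print(shlex.join(mdraid0_cmd(MD_DEVICE, MEMBERS, CHUNK_KIB)))
print(shlex.join(mkfs_xfs_cmd(MD_DEVICE, len(MEMBERS), CHUNK_KIB, AGCOUNT)))
```

mkfs.xfs can usually work out the stripe geometry of an MD device on
its own, so setting su/sw by hand mostly matters when the geometry is
hidden from it; check the result with xfs_info after creating the
filesystem.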

>
> Or I can do some basic tests and report results if anyone is interested?
>

Chris


