[Gllug] Combining SMB and NFS - was Embedded Linux & 1Gbps?

Nix nix at esperi.org.uk
Thu Oct 18 20:26:18 UTC 2007


On 18 Oct 2007, John Hearns uttered the following:

> Nix wrote:
>> On 17 Oct 2007, John Hearns spake thusly:
>>> Panasas - parallel, object-based filesystem. Runs on BSD hardware, over 
>>> an iSCSI transport.
>> 
>> A hardware-specific filesystem?! Or is this not a proper filesystem (a
>> spec and/or sample implementation) but a `we sell you a box' sort of
>> deal? That hardly counts in my book :/
>
> Don't knock it till you have tried it.

I'm not knocking it, as such: I just consider a distributed filesystem
that's tied to one vendor a complete and utter waste of time. It's like
anything else that's tied to one vendor: useful knowledge until the
vendor goes bust, and then you might as well not have learned anything
about it.

They may have cool ideas, but until those ideas have multiple
reimplementations or are in some other way proof against death (as free
software provides), then it's not worth considering. (If patents are
involved, there's an active reason to avoid going anywhere near them, at
least if you plan to sell anything based on those ideas in the US,
because you might be contaminated.)

> The core of their product is the ActiveScale filesystem - object-based, 
> parallel filesystem.

Another object-based store. I'm not sure what to think about these:
disks are starting to incorporate them, and they seem at first like a
neat idea, but it seems to me that they take a lot of useful degrees of
freedom away from the kernel. How can you do sensible readahead or
things like XFS's guaranteed-contiguous, guaranteed-bandwidth file
streaming if you don't know how files are intermingled, or even if they
are?

> Yes, the filesystem comes with supported/certified hardware to run it 
> on. In my book, as someone who has to support this stuff, that's a big 
> plus.

Yeah, it is, until you want to use those nifty features *without* paying
insane amounts of money. (These things *always* cost insane amounts of
money: one vendor, oh, and they're competing against NetApp, and
probably marketing exclusively to corporations. Thus very likely they're
hugely expensive, oh and also not something I'm ever going to get to lay
my hands on, seeing as how I'm on the programming and not sysadmin
side.)

>       You don't get into the 'well, that motherboard has driver issues 
> under XXX kernel, well that array has a faulty doo-dah, those particular 
> drives have firmware which needs an update' stuff.

Personally I've never had such problems with any drives I've bought in
the last fifteen years, since SCSI stopped being a takes-yer-luck
compatibility-hellfest. (But then I'm only buying drives for my own use
--- but I'm buying them with my own money, so I've got good reason to be
careful.)

As for the hardware RAID arrays, I've stayed away from them for
*exactly* these reasons. Linux md may have theoretically lower peak
performance due to PCI bus saturation but in practice it's incredibly
reliable, the source is available so if the array should go south I can
cook up my own repair tools (and I've done it once after an unfortunate
disk scribbling incident caused by a housemate's buggy code running as
root, *sigh*), I can never have `incompatible drives', I can RAID
*anything* together even if it's remote or not a disk at all, Neil Brown
is involved who is an incredible wizard and fantastically helpful... no,
hardware RAID loses on almost all fronts as far as I'm concerned. Its
sole advantage is corporate CYA (`if it goes wrong we have someone to
blame / a third party to fob off the recovery onto'), and that's
something I don't give a jumping damn about. I'm a control freak with
regard to my own systems: if something goes wrong it had better be *my*
fault so I can fix it and learn from it.
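
The `cook up my own repair tools' point is less exotic than it sounds:
RAID5's redundancy is plain XOR parity, so rebuilding a lost chunk of a
stripe is a few lines of code. Here's a toy Python sketch of the idea
(a deliberate simplification: real md also does chunking, parity
rotation across disks, and superblock handling):

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# A toy 4-disk RAID5 stripe: three data chunks plus one parity chunk.
data = [b"chunk-one!", b"chunk-two!", b"chunk-3333"]
parity = xor_blocks(data)

# Lose any one chunk; XORing the survivors (parity included) rebuilds it,
# because each byte cancels out of the sum except the missing one's.
lost = data[1]
survivors = [data[0], data[2], parity]
assert xor_blocks(survivors) == lost
```

The same property is why a degraded RAID5 array keeps serving reads:
the kernel just does this reconstruction on the fly for the dead disk.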

> Panasas certify and support things end-to-end.

See, that's the exact opposite of what I'm interested in. You don't
learn anything useful from black boxes.

> Let's make a comparison - would you expect to buy an enterprise storage 
> array like EMC or a Pillar (etc.) and then go out and slap onto it RAID 
> arrays you bought from anywhere you fancy, just because they were cheaper?

I don't expect ever to have the money to buy an enterprise storage
array, and given that they don't actually perform all that much better
than a simple cluster of commodity machines except under utterly insane
load (in which case you sling in more commodity machines), I don't
expect ever to buy such things.

The Google Lesson is that the way of the future is lots of commodity
machines, widely-replicated. Expensive storage stuff is a completely
ass-backwards way to go about things, IMNSHO.

>>>                     The director blades take care of metadata serving,
>>> and also act as pretty good NFS and Samba servers should you have other 
>>> boxes, ie. Unix boxes or Windows PCs wishing to access the data.
>> 
>> i.e. it inherits all the failures of those protocols? 
> They simply give you NFS and CIFS connectivity for free, as the director 
> blades are already running BSD.

Yeah, but the *protocol* is no better, is what I mean. I'm interested in
a decent distributed filesystem protocol without NFS's single points of
failure or everything else's extreme non-POSIXness. An expensive storage
box talking NFS and SMB really, *really* doesn't provide that.

>> Or does it provide
>> some layer which maps the object-based data store to something
>> POSIX-like? (Complete POSIX compatibility would be nice: support for
>> cross-directory hardlinks and not violating the atomicity guarantees is
>> a must on any modern distributed FS, I'd say.)
> ActiveScale is a POSIX compliant filesystem.

You said it talked NFS. NFS is not POSIX-compliant.
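
(The two guarantees I mean are easy to spell out. A Python sketch, run
against a local filesystem; plenty of distributed filesystems fail one
or both of these, whether by refusing the cross-directory link() outright
or by letting a reader glimpse an in-between state during the rename:)

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "a"))
    os.mkdir(os.path.join(root, "b"))

    # Cross-directory hard links: one inode, two names in different
    # directories, and the link count reflects both immediately.
    src = os.path.join(root, "a", "file")
    with open(src, "w") as f:
        f.write("data")
    os.link(src, os.path.join(root, "b", "link"))
    assert os.stat(src).st_nlink == 2

    # Atomic replace: rename() over an existing target must be atomic,
    # so concurrent readers see the old file or the new one, never
    # a missing file or a half-written mixture.
    tmp = os.path.join(root, "a", "file.new")
    with open(tmp, "w") as f:
        f.write("new data")
    os.rename(tmp, src)
    with open(src) as f:
        assert f.read() == "new data"
```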

Or do you mean it provides source for a kernel module that talks some
other protocol? (Binary-only lumps are, obviously, not interesting. The
Sun Lesson is that if you want your distributed FS to take off, open the
specs and the source. NFS was quite crappy but it took off nonetheless,
where AFS did not.)

> I can't find a reference, but as I recall Lustre and Panasas grew out of 
> requirements for the same project.

Not surprising. Sometimes I think there are only about fifty FS hackers
on earth... :)

-- 
`Some people don't think performance issues are "real bugs", and I think 
such people shouldn't be allowed to program.' --- Linus Torvalds
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



