[Gllug] Raw Partitions

Steve Nelson sanelson at gmail.com
Thu Nov 17 11:40:12 UTC 2005


On 11/16/05, Peter Grandi <pg_gllug at gllug.for.sabi.co.uk> wrote:

> It is quite recent, only from the end of May 2005, for 2.6:

Right - so irrelevant to me, since I am running 2.4.21-37.ELsmp.

<snip>

> The 'raw'(8) man page says more precisely and technically:

<snip>

I know.  I read it.  Reproducing it here is more noise.  Your constant
nit-picking helps no-one on the list.  I clearly meant "reading a
combination of the raw and sg_dd manual pages, together with other
documentation, leads me to believe that for performing read/write
operations on a raw device sg_dd is a recommended tool".

> Indeed the 'raw'(8) page does not mention 'sg_dd' at all (even
> if instead the 'sg_dd'(8) page does mention 'raw').

Which it does.

> Now, that's not a «'bugs'», just a limitation

It is under the section 'Bugs'.  And I placed it in quotes to indicate
the bug//feature//limitation sense of the word.

> I have
> provided several links to versions of 'dd' that do not have
> that same limitation, and are not 'sg_dd'.

So what?  I didn't want them.

> As to 'sg_dd' its primary purpose like the rest of 'sgutils'
> is to do IO issuing direct commands ('SGIO') to SCSI[-like]
> devices, which also requires aligned buffers:
>
>   http://sg.Torque.net/sg/sg_dd.html

Thank you - that looks interesting.  I must confess not to know a
great deal about the internals of device I/O.

 <snip lot of stuff that the interested reader could look up>

> In other words, there is no necessary relationship between
> 'sg_dd' and 'raw'(8) and not being telepathic I was uncertain
> whether your question was more about 'sg_dd'/'SGIO' as such or
> really about 'raw'(8), as they are entirely different subjects.

My question was very clear, notwithstanding my imprecise use of 'raw
partition' - the question was clearly: How do I create a raw device
without repartitioning and binding a raw device to the new partition?

> sanelson> My aim was to attempt to perform read/write
> sanelson> operations in a like manner to those performed when
> sanelson> starting clumanager on a cluster node.
>
> Using something like 'dd'? Of course you know better...

Sarcasm is never appreciated.

> , but I reckon that the IO patterns of a cluster system and those of
> something like 'dd' are nowhere alike

So do I.  Nor are *starting clumanager* and *IO patterns of a cluster
system*.  Please read what I wrote.

> and any crashes in the
> former case are likely due to timing-dependent bugs that
> sequential reading is unlikely to uncover.

Interesting assertion - I would be pleased to understand why you make
it.  And as an aside, why do you spam us with vast amounts of
information when we don't want it, only to make an assertion without
citing any evidence to support it?

> Also, it transpires below that the 'raw' device is most likely
> bound to a GFS pool, and is perhaps used with AIO, which is
> quite different from binding it to another type of block device
> and doing sequential sync IO to it...

You really are displaying that you have no clue now.  I am fully aware
of what a GFS pool is.  Had this question been about a GFS pool I
would have mentioned it.

> sanelson> I'm not especially keen on repartitioning my test
> sanelson> machine so I can create a raw device. [ ... ]
>
> sanelson> If by this you are (somewhat pedantically) arguing
> sanelson> over whether these are 'real' raw devices, or simply
> sanelson> wrapper-like bindings,
>
> Note that in the above you used «repartitioning» thus sort of
> implying that one has to create a special ''raw partition'',
> when someone who had read 'raw'(8) would have known instead that
> one could bind a raw device to _any_ existing block device.

No.  I absolutely understand that I can bind a raw device to any block
device, *because* I have read the manual page.  My question was how to
create some kind of self-contained device within a partition.

> So, for example, if the test system has a swap partition, you
> can experiment with the raw device wrapper by binding one raw
> device to such a partition (if you want to write to it too, just
> disable swapping on that swap block device).

AT LAST!  Something genuinely helpful.  Thank you.
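
For anyone reading this in the archive, the swap-partition suggestion
can be sketched roughly as below.  This is only a dry run, written in
Python for convenience: it builds and prints the command sequence and
executes nothing.  The device names /dev/sda2 and /dev/raw/raw1 are
hypothetical placeholders, and the closing mkswap/swapon step assumes
that writes through the raw device will have destroyed the swap
signature.

```python
def raw_on_swap_plan(swap_dev, raw_dev):
    """Return the commands (as argv lists) for borrowing a swap partition.

    Nothing is executed here; running them (as root) is left to the caller.
    """
    return [
        ["swapoff", swap_dev],       # stop the kernel using the partition
        ["raw", raw_dev, swap_dev],  # bind the raw device to the block device
        ["raw", "-q", raw_dev],      # query the binding to confirm it
        # ... read/write tests against raw_dev would go here ...
        ["mkswap", swap_dev],        # recreate the swap signature after writes
        ["swapon", swap_dev],        # hand the partition back to the kernel
    ]


if __name__ == "__main__":
    # Hypothetical device names, for illustration only.
    for argv in raw_on_swap_plan("/dev/sda2", "/dev/raw/raw1"):
        print(" ".join(argv))
```

If the test only reads from the raw device, the mkswap step should not
be needed and swapon alone ought to suffice.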

> >> Also, the 'raw' command and raw device wrappers are deprecated,
> >> depending on the kernel release (and did not seem to have much
> >> of an effect when I tried them, and seem to be little tested).

It is clear by now I am using RHEL 3 with a 2.4 kernel.  Raw devices
are *not* deprecated under this system.

> sanelson> Ah - ok - they may be unused on 2.6 kernels - I am
> sanelson> using 2.4 kernels at present, and GFS 6.0 relies upon
> sanelson> them. This is a heavily used and well-supported
> sanelson> configuration, so perhaps deprecated is a slightly
> sanelson> strong word.
>
> Well, the administrator manual for GFS 6.0 is here:

We know.  I use it regularly.  I have referred to it several times. 
You however, appear to have cherry-picked from it, and display a lack
of understanding of how GFS works.

<snip>

> But probably I am missing something here... Indeed, as it seems
> likely that the quorum is the _Oracle_ quorum, as one can read
> in section 2.6, page 16 of «Installing and Configuring Oracle9i
> RAC with GFS 6.0»:

NO! Please go and read the documentation before you make more of a fool
of yourself.

>   http://WWW.RedHat.com/docs/manuals/csgfs/pdf/rh-gfsico-en-6_0.pdf
>
> where the raw devices are bound _on top_ of GFS pools:

This is not always the case.

<more snippage>

> BTW, interesting article about these issues, from the design and
> performance point of view, here:
>
>   http://WWW.VLDB2005.org/program/paper/wed/p1116-hall.pdf

Thank you - amazing how you manage to sprinkle little gems in your
emails.  I shall read this over lunch.

> As far as ''raw devices'' are concerned, the only relevant idea
> in 'sg_dd' is that it does aligned buffer IO, and that's just a
> minor side effect of that being required for 'SGIO' too.

Ok.  As I say - I need to do more reading about device IO.  Thanks for
the links.
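
As a note for the archive, here is a minimal sketch of what "aligned
buffer IO" amounts to.  It is not how sg_dd is implemented, and it
deliberately runs against an ordinary file rather than a raw device or
O_DIRECT, so it only illustrates the constraints: a buffer that starts
on an aligned address, and offsets and lengths that are multiples of
the sector size.  The 512-byte BLOCK value and all function names are
assumptions made for the example.

```python
import ctypes
import mmap

BLOCK = 512  # assumed sector size; a real device may need a larger multiple


def aligned_buffer(size):
    """Return an anonymous mmap, which the kernel hands out page-aligned."""
    return mmap.mmap(-1, size)


def buffer_address(buf):
    """Memory address of the first byte of a writable buffer."""
    return ctypes.addressof(ctypes.c_char.from_buffer(buf))


def read_aligned(path, length, offset=0):
    """Read `length` bytes at `offset` into an aligned buffer.

    Offset and length must both be multiples of BLOCK, mirroring the
    restrictions that raw devices typically impose.
    """
    if offset % BLOCK or length % BLOCK:
        raise ValueError("offset and length must be multiples of BLOCK")
    buf = aligned_buffer(length)
    # buffering=0 gives an unbuffered file object, so readinto() issues
    # the read straight into our buffer with no intermediate copy in Python.
    with open(path, "rb", buffering=0) as f:
        f.seek(offset)
        n = f.readinto(buf)
    data = buf[:n]
    buf.close()
    return data
```

An ordinary dd that allocates its buffer without regard to alignment
fails the first of these conditions, which is the limitation the raw
manual page describes.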

> Then this seems your actual problem, not 'sg_dd' and 'raw'(8),

No.  My 'problem' has never been sg_dd or raw.  I had a simple
question, which I have restated for your edification on several
occasions.

> and I am not that surprised -- '/dev/raw/' are a relatively
> obscure part of the kernel and I would not expect them to be as
> thoroughly exercised as they should be; never mind stacking
> 'raw'(8) on top of GFS pools...

I have no idea what you mean.

> You can lead yourself to believe what you prefer -- I reckon
> however that it is useful to include small and obviously
> irrelevant details like distribution, the version of that
> distribution, kernel edition and minor version, applications
> causing issues, and symptoms doing specific operations, when
> asking for help about devices and drivers, as there are many
> device and driver issues that are rather specific to such
> details.

Yes - if I was asking for help with troubleshooting my cluster -
which, incidentally, I am not.

> Probably you should have asked something more useful like:
>
>   ''I have a RHES 3 system with the usual heavily patched RH
>     kernel 2.4.21 and when I run Oracle 10 on it over GFS 6.0
>     and there are crashes in the Oracle quorum manager using
>     'raw'(8) devices wrapping GFS pools, as described in this
>     link: [link omitted]. What can I do?

No - because that bears absolutely no resemblance to what I actually
wanted - which is what I asked in the first email.  It also shows that
you don't appear to understand how quorums work in Redhat
Clustersuite.

>     I am also considering checking whether there are issues with
>     'raw'(8) using another PC using some form of 'dd', how can I
>     set up that?''

Also not what I was asking.

> No mention of something very special purpose like 'sg_dd', which
> hints at a completely different set of issues related to 'SGIO'
> which are irrelevant here.

If you say so, Peter.

> The answer would have been:

>    * The 'raw' driver is likely to be somewhat unreliable, as
>      well as being deprecated.

Wrong - I am using a 2.4 kernel - which you even seem to have
understood, yet still insist raw is deprecated.

>    * 'O_DIRECT' is an alternative to 'raw'(8), so one could
>      check if Oracle can use that instead for quorum volume,

Wrong - we are not interested in the oracle quorum.

>      but there is the small matter of AIO. Situation murky.

Irrelevant.

>    * Since both 'raw'(8)/'O_DIRECT' and AIO are optimizations,
>      and sometimes of dubious value, using neither is probably
>      a good fallback option if one has problems.

Incomprehensible (at least to me)

>    * This is really a very particular situation, and I suspect
>      that is one of the cases where having an Oracle and/or
>      RedHat support account is pretty useful.

I have Redhat support, and it was on the basis of Redhat support that
I began doing tests reading and writing to the raw devices.

>    * Testing 'raw'(8) using 'dd' is probably pointless, as the
>      usage pattern is going to be too different

Possibly true.

>      but if you really want to do that, bind a 'raw'(8) to a newly created
>      temporary GFS pool

Why?  That bears no resemblance to my system.

>      and use a 'dd' [list omitted] that does
>      aligned buffer IO.

Such as sg_dd.

> You could use a 'losetup' block device
> over a file to simulate a block device, but that would make
> the test even more unrealistic and risky.

Thanks to those who already suggested using losetup.  Yes, I agree it
might, however, be an unrealistic test.
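
For completeness, the losetup approach might look something like the
sketch below.  Again this is Python used as a dry run: it creates a
sparse backing file but only lists the losetup and raw commands rather
than running them.  The file path, /dev/loop0 and /dev/raw/raw1 are
hypothetical, and the raw binding would presumably need releasing
before the loop device is detached (see the raw manual page for how).

```python
import os


def make_backing_file(path, size_bytes):
    """Create a sparse file of the given size to stand in for a partition."""
    with open(path, "wb") as f:
        f.truncate(size_bytes)  # extends the file without writing data blocks
    return os.path.getsize(path)


def loop_raw_plan(backing_file, loop_dev, raw_dev):
    """Commands (argv lists) to put a raw device on top of a loop device.

    Nothing is executed here; running them (as root) is left to the caller.
    """
    return [
        ["losetup", loop_dev, backing_file],  # attach the file as a block device
        ["raw", raw_dev, loop_dev],           # bind the raw device to it
        # ... read/write tests against raw_dev would go here ...
        ["losetup", "-d", loop_dev],          # detach the loop device afterwards
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    make_backing_file("/tmp/rawtest.img", 64 * 1024 * 1024)
    for argv in loop_raw_plan("/tmp/rawtest.img", "/dev/loop0", "/dev/raw/raw1"):
        print(" ".join(argv))
```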

> I think that the above is more the level and style of direct,
> detailed technical discourse that is useful to have when
> discussing nontrivial kernel + application issues.

Which was absolutely not what I was discussing in the first place.

S.
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



