[Gllug] Raw Partitions

Peter Grandi pg_gllug at gllug.for.sabi.co.UK
Tue Nov 22 22:59:14 UTC 2005


>>> On Thu, 17 Nov 2005 11:40:12 +0000, Steve Nelson
>>> <sanelson at gmail.com> said:

[ ... ]

sanelson> You constant nit-picking helps no-one on the list.

Well, someone's nitpicking is someone else's attempt to avoid
ambiguities that cost dozens of lines of pointless exchange.

This discussion is about some fairly technical issues, and there
are so many parameters and possibilities and combinations that
_some_ degree of precision (and context) helps greatly.

sanelson> I clearly meant "reading a combination of the raw and
sanelson> sg_dd manual pages, together with other documention
sanelson> leads me to believe that for performing read/write
sanelson> operations on a raw device sg_dd is a recommended
sanelson> tool.

What you clearly meant may not be so clear to someone who is a
poor untelepath... To me, relying only on your written words,
the mention of 'sg_dd' clouded the issue as to whether this was
about SGIO to SCSI discs or about 'raw'(8), also because there
was initially not enough mention of the context (Oracle/GFS)
that would have swung the odds in favour of 'sg_dd' being
incidental and 'raw'(8) essential, and not vice versa...

And this is not nitpicking, just explaining why it is best,
when asking for help, to ask with clarity and detail.

>> I have provided several links to versions of 'dd' that do not
>> have that same limitation, and are not 'sg_dd'.

sanelson> So what?  I didn't want them.

Those were merely to illustrate that, given that there are now
many tools that do aligned-buffer IO, mentioning the only one of
them that does SGIO gave a misleading hint as to the
troubleshooting you wanted to do on 'raw'(8).
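.
For the avoidance of yet more ambiguity: the alignment business
that those tools take care of looks roughly like the following.
This is a minimal, untested sketch in C, assuming '/dev/raw/raw1'
has already been bound to some scratch partition (the device
name is just an example):

  /* Read one block from a bound raw device using a suitably
     aligned buffer; the 2.4 raw driver returns EINVAL for
     buffers, offsets or counts that are not sector-aligned. */
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  #define BLK 4096   /* page-aligned, a multiple of the 512-byte sector */

  int main(void)
  {
      void *buf;
      int fd;
      ssize_t n;

      /* posix_memalign() gives the alignment that malloc() does
         not guarantee for small buffers. */
      if (posix_memalign(&buf, 4096, BLK) != 0) {
          perror("posix_memalign");
          return 1;
      }

      fd = open("/dev/raw/raw1", O_RDONLY);
      if (fd < 0) {
          perror("open /dev/raw/raw1");
          return 1;
      }

      n = read(fd, buf, BLK);
      if (n < 0)
          perror("read");
      else
          printf("read %zd bytes\n", n);

      close(fd);
      free(buf);
      return 0;
  }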

[ ... ]

sanelson> [ ... ] Had this question been about a GFS pool I
sanelson> would have mentioned it.

Let me say that your record so far, as to asking questions with
relevant context and details, seems to me a bit below
''clueless''.

But you did indirectly:

  sanelson> As I understand it, it is possible to bind a raw
  sanelson> device [ ... ] This is consquently used to perform
  sanelson> cache-bypassing IO operations, and forms part of the
  sanelson> design of Redhat's GFS 6.0

And where would one do cache-bypassing for RH GFS? Usually for
pools (though with 'O_DIRECT' rather than 'raw'(8))...
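.
To be concrete about the 'O_DIRECT' route: a minimal, untested
sketch, where the path is merely a placeholder and not anything
taken from the GFS manual; as with 'raw'(8), the buffer and the
transfer size must be suitably aligned:

  /* Open a file bypassing the page cache and read one aligned
     block from it; O_DIRECT is a GNU extension, hence _GNU_SOURCE. */
  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <stdio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main(void)
  {
      void *buf;
      int fd;

      if (posix_memalign(&buf, 4096, 4096) != 0) {
          perror("posix_memalign");
          return 1;
      }

      /* Placeholder path: in the GFS/Oracle case this would be
         whatever file or pool the application keeps its data in. */
      fd = open("/gfs/some/datafile", O_RDONLY | O_DIRECT);
      if (fd < 0) {
          perror("open O_DIRECT");
          return 1;
      }

      if (read(fd, buf, 4096) < 0)
          perror("read");

      close(fd);
      free(buf);
      return 0;
  }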

[ ... ]

>> So, for example, if the test system has a swap partition, you
>> can experiment with the raw device wrapper by binding one raw
>> device to such a partition [ ... ]

sanelson> AT LAST!  Something genuinely helpful.  Thank you.

But this confuses me -- it is a trivial, obvious point, one to
which I was braced for an ''of course I know that'' reply.
Since it is so obvious I was wondering whether you had any
special reason for not doing it already, and that's the only
reason I mentioned it.
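.
For the record, the binding itself is nothing exotic: it is what
'raw'(8) does for you via the 'RAW_SETBIND' ioctl on
'/dev/rawctl'. A minimal, untested sketch, with '/dev/sda3'
standing in for whatever unused (swap or scratch) partition is
to hand -- roughly the equivalent of 'raw /dev/raw/raw1 /dev/sda3':

  /* Bind /dev/raw/raw1 to a block device by major/minor number. */
  #include <fcntl.h>
  #include <stdio.h>
  #include <sys/ioctl.h>
  #include <sys/stat.h>
  #include <sys/sysmacros.h>
  #include <sys/types.h>
  #include <linux/types.h>
  #include <linux/raw.h>
  #include <unistd.h>

  int main(void)
  {
      struct stat st;
      struct raw_config_request rq;
      int ctl;

      /* Example partition; substitute an unused one of your own. */
      if (stat("/dev/sda3", &st) < 0) {
          perror("stat /dev/sda3");
          return 1;
      }

      ctl = open("/dev/rawctl", O_RDWR);
      if (ctl < 0) {
          perror("open /dev/rawctl");
          return 1;
      }

      rq.raw_minor   = 1;                  /* i.e. /dev/raw/raw1 */
      rq.block_major = major(st.st_rdev);  /* block device to wrap */
      rq.block_minor = minor(st.st_rdev);

      if (ioctl(ctl, RAW_SETBIND, &rq) < 0) {
          perror("RAW_SETBIND");
          return 1;
      }

      close(ctl);
      return 0;
  }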

sanelson> [ ... ] It is clear by now I am using RHEL 3 with a
sanelson> 2.4 kernel. Raw devices are *not* deprecated under
sanelson> this system.

But you omitted to mention both RH and 2.4 initially; that's why
I mentioned it was deprecated. And thanks a lot for blasting
people when it was you who were too clueless to give enough
context.

Also please use "deprecated" in a less confused way, as it is
not the same as "unsupported"; in a technical context its
meaning is:

  ''it is still in the current and previous version, but it will
    be removed in some future version''

  or: http://WWW.Hacker-Dictionary.com/terms/deprecated

  In Linux kernel terms, something is deprecated if it appears
  at some point in 'Documentation/feature-removal-schedule.txt'.

In any case, the fact that it will be removed soon in the 2.6
series is still relevant, even for RH and 2.4: deprecated stuff
tends to get used less, and whether stuff actually works,
especially in corner cases, depends on how much use it gets; if
something is ''deprecated'', developers avoid it, or don't test
it as much.

[ ... the GFS 6.0 manual and O_DIRECT ... ]

sanelson> [ ... ]  You however, appear to have cherry-picked
sanelson> from it, [ ... ]

No, I simply did a string search for "raw" or "quorum" in the
GFS manual and nothing relevant came up, but "O_DIRECT", the
direct alternative to 'raw'(8), did come up. This suggests to
me that it is the standard way to bypass the cache on GFS.

>> But probably I am missing something here... Indeed, as it
>> seems likely that the quorum is the _Oracle_ quorum, as one
>> can read in section 2.6, page 16 of «Installing and
>> Configuring Oracle9i RAC with GFS 6.0»: [ ... ] where the raw
>> devices are bound _on top_ of GFS pools:

sanelson> This is not always the case.

But it is the case described in the manual for the products you
have mentioned, and since you seem to like playing the ''not
telling you'' game, that's all I have to go on.

As to the guessing game, I am now guessing that you are using a
third product, the RH CM, which _also_ has a quorum and which
also aims to bypass the cache, and with 'raw'(8) like GFS.

>> and I am not that surprised -- '/dev/raw/' are a relatively
>> obscure part of the kernel and would not expect them to be as
>> thoroughly exercised as they should be; never mind stacking
>> 'raw'(8) on top of GFS pools...

sanelson> I have no idea what you mean.

Let's try again: practically minded people use software that is
widely used, so that it has been thoroughly exercised (and even
so...). Combinations of software that are not widely used (e.g.
on several hundred thousand desktops) are to be regarded as
''brave''.

In particular where there are parallel accesses, and multiple
layers of wrapping and virtualization. That's basically why only
very specific combinations of kernel/CM/CFS/DBMS versions and
types are ''qualified'' (which sometimes means just ''it worked
here for 5 minutes'' :->) by vendors.

Also, RH EL 3 comes with a kernel (if you are using the standard
kernel) which has an optimization with a well known limitation
that usually manifests itself when there are several complicated
software subsystems running concurrently. Unfortunately this
limitation has different impacts on different systems and
architectures.

[ ... detailed questions ... ]

sanelson> Yes - if I was asking for help with troubleshooting my
sanelson> cluster - which, incidentally, I am not.

But you are asking for help in setting up a troubleshooting
environment for crashes related to 'raw'(8), and how 'raw'(8) is
used, by which app, and in which context is highly relevant.

In particular, many issues happen only because of combinations
of usage patterns, and in order to troubleshoot devices one
should try to recreate an environment as similar as possible
to the production one.

  Naturally you can believe otherwise, as you indeed seem to, but
  then ''clueless is as clueless does'' :-).

And how it is used, and in which context, might suggest
fallbacks: if 'raw'(8) crashes, the product using it might well
be able to use alternatives, or not use 'raw'(8) at all; not
using something that crashes is also a way to fix it, and it
sidesteps any issues as to setting up 'raw'(8) troubleshooting.

[ ... ]

sanelson> [ ... ] It also shows that you don't appear to
sanelson> understand how quorums work in Redhat Clustersuite.

No, it is rather that you seem foolishly keen to show off just
how thoroughly confused you are, as this is the first time you
mention that product, while so far you have been stating
unambiguously (yet confusedly) that you have had problems with
GFS 6.0 raw devices and/or with Oracle:

  sanelson> As I understand it, it is possible to bind a raw
  sanelson> device [ ... ] This is consquently used to perform
  sanelson> cache-bypassing IO operations, and forms part of the
  sanelson> design of Redhat's GFS 6.0

  sanelson> The current situation is a GFS 6.0 cluster, and raw
  sanelson> devices, wrapped as described above, are used for
  sanelson> the quorum data.

  sanelson> concerning an Oracle 10 Cluster with a failing node,
  sanelson> which my investigations thus far lead be to believe
  sanelson> is crashing when it attempts to read quorum data
  sanelson> from a raw device,

The combination of these statements and the lack of previous
mention of RH CS seems fairly unambiguous to me...

Well, RH CS is yet another cluster-related package, but it is a
completely different product from RH GFS, as you could have
gleaned from reading its presentation here:

  http://WWW.RedHat.com/en_us/USA/home/solutions/clustersuite/

   «For applications that require maximum uptime, a Red Hat
    Enterprise Linux cluster with Red Hat Cluster Suite is the
    answer. Specifically designed for Red Hat Enterprise Linux,
    Red Hat Cluster Suite provides two distinct types of
    clustering:

    * Application/Service Failover - Create n-node server clusters
      for failover of key applications and services

    * IP Load Balancing - Load balance incoming IP network
      requests across a farm of servers

    [ ... ] For high-volume open source applications, such as
    NFS, Samba, and Apache, Red Hat Cluster Suite provides a
    complete ready-to-use failover solution.»

Now, I understand that to someone really clueless it is easy to
confuse thingies that are all called ''cluster-something'' even
if they do completely different things like RH CS (network app
load balancing and failover service), RH GFS (network file
system) and Oracle Cluster (semi-distributed DBMS).

  Looking back it seems that the confusion has been in this
  thread for a while, and I should have noticed that, while
  initially you mentioned 'clumanager' (from RH CM), you
  switched immediately thereafter, as if they were related, to
  talking about RH GFS and raw devices.

But these are indeed completely different packages, independent
of each other, and it can make a big difference exactly which
one is being used: if you are trying to create a raw-device
troubleshooting setup, what kind of usage patterns the raw
devices get subjected to matters a great deal; and working
around a raw-device problem, if any, requires a different
configuration for each.

So while I don't like nit-picking, your confusion of ideas and
terminology has wasted quite a bit of your time and mine.

However, going back to the ''combination of factors'' note
above, I would not be surprised by mishaps on a system that has
got all three of RH CM, Oracle DBMS and RH GFS running (and who
knows what else), as several of them may be using raw devices
concurrently, especially if it is not exactly the combination of
versions and system setup that has been ''qualified''.

So I'd look more at combination issues than at troubleshooting
'raw'(8) in isolation, and [guessing wildly] in particular at
that RH EL 3 kernel limitation, which has bitten quite a few
people doing complicated RH EL 3 deployments...

[ ... ]

-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



