[sclug] Writing files to 2 places at once?

Roland Turner SCLUG raz.fpyht.bet.hx at raz.cx
Sat Mar 4 13:23:34 UTC 2006


On Fri, 2006-03-03 at 19:07 +0000, Sapan Ganguly wrote:

>         Trying not to top-post, see below.

The objection to top-posting is typically not just that the reply is at
the top, but that it isn't in context. Full marks for effort though :-)

>         Roland, I didn't build the current Oracle server and I'm not a
>         DBA so I'm interested in the tablespace replication facilities

Regrettably my personal contact with Oracle has been as a developer, not
as a DBA, so I'm not able to offer specifics, but I'm pretty certain
that I've seen this done, without dependence upon LVM or similar.

>          you mentioned.  We have a new DBA, he seems to think that we
>         only have two options -
>         
>         1. We take a big downtime hit, shut down the database, copy all
>         the datafiles across and start up again
>         2. We do each tablespace separately, put the tablespace into
>         backup mode, copy the datafiles across and then apply the redo
>         logs that build up in the meantime.
>         
>         Both these solutions will end up with the same amount of
>         downtime whether it be all in one go or split up over a few
>         days.

I'm not clear on why the second option involves downtime; will the
server not process queries while applying redo logs? Even if so, doesn't
breaking the downtime into bite-sized pieces, incurred over days or
weeks during your lowest-load periods, beat taking all of the downtime
at once?

What is the policy for scheduled maintenance windows? If you don't have
one, getting an agreed weekly maintenance window is probably a
worthwhile first step! Zero-downtime (well, five nines) systems are
possible, but require enormous investments in redundant equipment
(assume 4-8 times what's required for a non-redundant setup),
facilities, bandwidth and personnel (DR planning, DR drills, ...). From
what you've said, that kind of investment is out of the question.
Consequently so is zero downtime. Consequently scheduled maintenance
windows are a no-brainer, and therefore "chop it into bite-sized chunks
that will fit into the maintenance windows" is feasible, once you have
realistic maintenance policies in place.
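
To put a number on "five nines", here is a minimal arithmetic sketch. It
assumes a 365-day year and that every second of unavailability counts
against the budget; the function name is mine, purely for illustration.

```python
# Downtime budget implied by an availability target of N nines.
# Assumption: a 365-day year, with all unavailability counted.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000


def downtime_budget_seconds(nines: int) -> float:
    """Seconds of downtime per year allowed at 'nines' nines of availability.

    Two nines is 99%, three nines is 99.9%, and so on, so the allowed
    unavailable fraction of the year is 1 / 10**nines.
    """
    if nines < 1:
        raise ValueError("nines must be at least 1")
    return SECONDS_PER_YEAR / 10 ** nines


for n in (2, 3, 4, 5):
    seconds = downtime_budget_seconds(n)
    print(f"{n} nines: {seconds / 60:.2f} minutes of downtime per year")
```

At five nines the budget is a little over five minutes a year, which is
why a single ordinary server restart can consume most of it.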

You may also have a concern with some time pressure (current disk
filling). Bear in mind that, if you do take an incremental approach,
large chunks of space on the existing disk are freed up at each
increment.
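
As a rough sketch of what planning the incremental approach might look
like: given an estimated copy time per tablespace and the length of a
maintenance window, a simple first-fit-decreasing pass groups the
tablespaces into windows. The tablespace names, the copy times and the
two-hour window below are all invented for illustration; real figures
would have to come from your DBA.

```python
def plan_windows(copy_minutes, window_minutes):
    """Group tablespaces into maintenance windows (first-fit decreasing).

    copy_minutes maps a tablespace name to its estimated copy time in
    minutes. Returns a list of windows, each a list of tablespace names.
    This is a heuristic, not an optimal packing.
    """
    windows = []  # each entry: {"remaining": minutes left, "names": [...]}
    largest_first = sorted(copy_minutes.items(),
                           key=lambda item: item[1], reverse=True)
    for name, minutes in largest_first:
        if minutes > window_minutes:
            raise ValueError(f"{name} does not fit in a single window")
        for window in windows:
            if window["remaining"] >= minutes:
                window["names"].append(name)
                window["remaining"] -= minutes
                break
        else:
            # No existing window has room, so schedule another one.
            windows.append({"remaining": window_minutes - minutes,
                            "names": [name]})
    return [window["names"] for window in windows]


# Hypothetical estimates, in minutes, with a two-hour weekly window.
estimates = {"users": 90, "orders": 60, "audit": 45, "index": 30}
plan = plan_windows(estimates, window_minutes=120)
print(plan)  # [['users', 'index'], ['orders', 'audit']]
```

With these made-up numbers the whole move fits into two windows, and the
space held by the first group is freed before the second window arrives.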

In any event, this wasn't what I was proposing. I have the distinct
(although conceivably mistaken) impression that Oracle is capable of
maintaining duplicate copies of tablespaces on its own, that you should
be able to add new tablespaces to a running server and that you should
be able to alter the constellation of duplicates maintained for a given
table at will. That said, if your DBA doesn't know how to do this, isn't
confident about it, or indeed, is confident that it's not possible,
don't try it :-)

Someone else has pointed out that you can get LVM up and running with
existing volumes with minimal downtime (how long does a server restart
take you?) and just let LVM fill in the mirror while it's in use. If you
can work out how to do this, then this may be your best option.

>         Oh, and yes, the short window does reflect a technical
>         challenge I have taken on voluntarily. Although we could
>         probably get a whole day of downtime, this wouldn't look
>         good for us techies, and our customers have been sold a
>         24x7x365 service and I would like to do my best to provide
>         this even though I don't have the infrastructure for
>         it.....yet.

A 24*365 service doesn't (shouldn't) mean zero downtime; it just means
that support is available around the clock. Naturally, salespeople are
wont to tell customers whatever will make a sale, and some customers
have even been known to exaggerate what they were sold on occasion.

{{ from a later post }}

> Whoever designed this whole setup did think about fault
> tolerance/resilience but I think they were in a rush

As you fully appreciate, thinking about it isn't enough!

You aren't by chance working for a company with a data centre in Whitley
are you?

- Raz


