[Watford] Watford Digest, Vol 212, Issue 1

Rob Jefferis rob at letchmore.co.uk
Thu Jun 14 09:05:43 UTC 2012


Thanks for the replies on this so far, guys.

So a bit more detail:

The plan is for the boxes to be located at different sites to allow for DR; they could be in the same site temporarily to allow for the initial sync.

The current boxes are Synology 3412s, which have an option to replicate shares to another box, but we have had problems getting this to complete on large shares.

There is a setting in one of the config files, mentioned in the forums, that seems to relate to the maximum number of subfolders in a share that can be synced. The setting is in the synoinfo.conf file in /etc.defaults:
s2s_watches_max = "102400"
I am pretty sure that the folder sync and backup options built into the box both use rsync as the underlying technology.
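For anyone curious, checking and raising that limit would look something like the following. This is a sketch worked against a throwaway copy of the file; on the real box the file is /etc.defaults/synoinfo.conf, the new value is only an illustration, and I'd take a backup first:

```shell
# Stand-in for the real /etc.defaults/synoinfo.conf, for illustration only.
conf=./synoinfo.conf
printf 's2s_watches_max = "102400"\n' > "$conf"

# Show the current limit.
grep 's2s_watches_max' "$conf"

# Raise the limit, keeping a .bak copy of the original file.
# The new value here is an example, not a tested recommendation.
sed -i.bak 's/^s2s_watches_max.*/s2s_watches_max = "1024000"/' "$conf"
grep 's2s_watches_max' "$conf"
```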

The OS is BusyBox v1.16.1.


The boxes have 22 x 2TB Enterprise SATA disks split into multiple RAID 6 arrays. There is no option to add any kind of flash acceleration.

The link between sites is 100Mbit/s.

My concern over any kind of file-based sync is that it has to inspect every file for changes on every run.

Steve, am I right in thinking that it defaults to timestamps rather than MD5? I don't think we have specified MD5 in our testing. It would be nice to find the underlying code that Synology are using in the background for their options.

Thanks


Rob




________________________________________
From: watford-bounces at mailman.lug.org.uk [watford-bounces at mailman.lug.org.uk] on behalf of Mat Sutton [mat at matsutton.co.uk]
Sent: 13 June 2012 18:48
To: watford at mailman.lug.org.uk
Subject: Re: [Watford] Watford Digest, Vol 212, Issue 1

Rob

First thoughts:

Do you actually need to replicate?  Why? What are you trying to achieve?

Spindles and connectivity are going to affect throughput. (Are both in the same location, or could they be temporarily?)

Has any data 'expired', i.e. is ready for deletion, so it can be omitted from the copy?

Can the NAS being copied from have its cache expanded to aid filesystem traversal?

Mat

Sent from my BlackBerry® smartphone

-----Original Message-----
From: watford-request at mailman.lug.org.uk
Sender: watford-bounces at mailman.lug.org.uk
Date: Wed, 13 Jun 2012 12:00:02
To: <watford at mailman.lug.org.uk>
Reply-To: watford at mailman.lug.org.uk
Subject: Watford Digest, Vol 212, Issue 1

Send Watford mailing list submissions to
        watford at mailman.lug.org.uk

To subscribe or unsubscribe via the World Wide Web, visit
        https://mailman.lug.org.uk/mailman/listinfo/watford
or, via email, send a message with subject or body 'help' to
        watford-request at mailman.lug.org.uk

You can reach the person managing the list at
        watford-owner at mailman.lug.org.uk

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Watford digest..."


Today's Topics:

   1. help replicating large amounts of data (Rob Jefferis)
   2. Re: help replicating large amounts of data (Steven Acreman)
   3. Re: help replicating large amounts of data (M Fernandes)


----------------------------------------------------------------------

Message: 1
Date: Wed, 13 Jun 2012 07:33:09 +0000
From: Rob Jefferis <rob at letchmore.co.uk>
Subject: [Watford] help replicating large amounts of data
To: "watford at mailman.lug.org.uk" <watford at mailman.lug.org.uk>
Message-ID:
        <6B8A078BD8A259429C8A783A109D3DC52A1E9927 at VMEXCH2.letchmoresystems.co.uk>

Content-Type: text/plain; charset="iso-8859-1"

Hi all.

Has anyone got any ideas on the best / quickest way to replicate large volumes of data between servers?

We currently have 2 NAS boxes and would like to replicate one to the other. The underlying technology for the sync is rsync, and I don't think this is really going to handle the volume of data.

I have just done a count on one of the larger shares; it is currently around 13TB, with 1.5 million folders and around 100 million files.

We have toyed with building some large Windows file servers and using DFSR, but I don't think this would handle it either.

Anyone got any ideas on software / hardware that might help? I would estimate the total required storage at primary to be around 50TB, so a guesstimate would put that at around 350 million files.

The data needs to be available to Windows machines.

Thanks

Rob

------------------------------

Message: 2
Date: Wed, 13 Jun 2012 08:39:02 +0100
From: Steven Acreman <sacreman at gmail.com>
Subject: Re: [Watford] help replicating large amounts of data
To: "watford at mailman.lug.org.uk" <watford at mailman.lug.org.uk>
Message-ID: <623B23CA-07A3-4654-B72E-0FD7FDCBB491 at gmail.com>
Content-Type: text/plain; charset="us-ascii"

I don't think there is anything better than rsync. It was written by a genius.

Just use the faster options, like timestamps over MD5.

Sent from my iPhone

On 13 Jun 2012, at 08:33, Rob Jefferis <rob at letchmore.co.uk> wrote:

> Hi all.
>
> Has anyone got any ideas on the best / quickest way to replicate large volumes of data between servers.
>
> We currently have 2 NAS boxes and would like to replicate 1 to the other. The underlying technology for the sync is rsync and I dont think this is really going to handle the volume of data.
>
> I have just done a count up on one of the larger shares and it is currently around 13TB with 1.5 million folders and around 100 million files.
>
> We have toyed with building some large Windows file servers and using DFSR but I dont think this would handle it either.
>
> Anyone got any ideas on software / hardware that might help? I would estimate the total required storage at primary to be around 50TB so a guestimate would put that at around 350 million files.
>
> The data needs to be available to Windows machines.
>
> Thanks
>
> Rob
> _______________________________________________
> Watford mailing list
> Watford at mailman.lug.org.uk
> https://mailman.lug.org.uk/mailman/listinfo/watford

------------------------------

Message: 3
Date: Wed, 13 Jun 2012 10:44:44 +0100
From: M Fernandes <myitpartneruk at gmail.com>
Subject: Re: [Watford] help replicating large amounts of data
To: watford at mailman.lug.org.uk
Message-ID:
        <CAD_qgec8xSfXsDAzp8x6a7C+vTPBDPXFVD58cJeZyswUtM+CHg at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hello Rob / Steven,

Whilst I am in no way as experienced as either of you in Linux (you may
have seen one or two of my comparatively basic questions recently), I've
come across the following. Rob, I have no visibility of your setup or its
needs, of course, but a-sharing we should be! There's a review here:
http://www.linuxjournal.com/article/7712 and the actual site is here:
http://www.cis.upenn.edu/~bcpierce/unison/
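From what I've read, Unison is driven by a profile kept in ~/.unison/; a minimal sketch of one (both roots and the hostname are made-up placeholders, not anything from Rob's setup):

```
# ~/.unison/nas.prf -- hypothetical profile; both roots are placeholders
root = /volume1/share
root = ssh://backup-nas//volume1/share
batch = true        # run without interactive prompts
times = true        # propagate modification times
fastcheck = true    # trust size + mtime, much like rsync's quick check
```

With that in place it would be run as `unison nas`.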

I hope that helps. Anyway, back to my own Gentoo dilemmas. I never knew
pain could be so good!  ;-)

Mike

On 13 June 2012 08:39, Steven Acreman <sacreman at gmail.com> wrote:

> I don't think there is anything better than rsync. It was written by a
> genius.
>
> Just use the faster options like timestamps over md5.
>
> Sent from my iPhone
>
> On 13 Jun 2012, at 08:33, Rob Jefferis <rob at letchmore.co.uk> wrote:
>
> Hi all.
>
>  Has anyone got any ideas on the best / quickest way to replicate large
> volumes of data between servers.
>
>  We currently have 2 NAS boxes and would like to replicate 1 to the
> other. The underlying technology for the sync is rsync and I dont think
> this is really going to handle the volume of data.
>
>  I have just done a count up on one of the larger shares and it is
> currently around 13TB with 1.5 million folders and around 100 million files.
>
>  We have toyed with building some large Windows file servers and using
> DFSR but I dont think this would handle it either.
>
>  Anyone got any ideas on software / hardware that might help? I would
> estimate the total required storage at primary to be around 50TB so a
> guestimate would put that at around 350 million files.
>
>  The data needs to be available to Windows machines.
>
>  Thanks
>
>  Rob
>
> _______________________________________________
> Watford mailing list
> Watford at mailman.lug.org.uk
> https://mailman.lug.org.uk/mailman/listinfo/watford
>
>

------------------------------

_______________________________________________
Watford mailing list
Watford at mailman.lug.org.uk
https://mailman.lug.org.uk/mailman/listinfo/watford


End of Watford Digest, Vol 212, Issue 1
***************************************


