[GLLUG] Transferring high volumes of data.
Christopher Walker
C.J.Walker at qmul.ac.uk
Wed Jun 11 09:11:31 UTC 2014
On 11/06/14 09:07, Martin A. Brooks wrote:
> On Wed, June 11, 2014 01:15, JLMS wrote:
>> I would appreciate any ideas, pointers, etc that may make possible to
>> transfer such amounts of data in an efficient manner as possible.
> Fedex.
Some particle physicists use phedex[1]. Any similarity in name is, I'm
sure, entirely intended.
Seriously though, fasterdata.es.net is a very good place to start.
For fast links with high latency, you need to increase the maximum tcp
window size - and use software that can make use of that in order to
take full advantage of the link. Notably, scp cannot.
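
To make the window-size point concrete, here is a rough sketch of the
bandwidth-delay product calculation. The link speed, round-trip time and
64 KiB window are illustrative assumptions, not measurements from any
real path, and the function names are made up for this example.

```python
def bdp_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the link."""
    return bandwidth_bits_per_s * rtt_s / 8


def window_limited_rate(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on throughput (bits/s) when at most one window is sent per RTT."""
    return window_bytes * 8 / rtt_s


# Assumed example path: 1 Gbit/s link with a 100 ms round-trip time.
link_bits_per_s = 1e9
rtt_s = 0.100

needed = bdp_bytes(link_bits_per_s, rtt_s)
capped = window_limited_rate(64 * 1024, rtt_s)

print(f"window needed to fill the link: {needed / 1e6:.1f} MB")
print(f"rate with a 64 KiB window: {capped / 1e6:.1f} Mbit/s")
```

With these assumed numbers the window needs to be on the order of
12.5 MB, while a 64 KiB window caps a single stream at a few Mbit/s
regardless of how fast the link is.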
In addition, you need to eliminate bottlenecks - 100Mbit links in
supposedly Gbit connections, firewalls holding things up, disk speed at
source and destination etc.
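
The effect of one slow stage can be sketched as follows: the end-to-end
rate is bounded by the slowest element in the path. The stage names and
rates below are hypothetical, chosen only to illustrate the idea.

```python
def slowest_stage(stage_rates_bits_per_s: dict) -> tuple:
    """Return (name, rate) of the stage that limits end-to-end throughput."""
    name = min(stage_rates_bits_per_s, key=stage_rates_bits_per_s.get)
    return name, stage_rates_bits_per_s[name]


# Hypothetical path: every figure here is an assumption for illustration.
path = {
    "source disk": 1.2e9,        # about 150 MB/s sequential read
    "campus uplink": 1e9,        # nominal Gbit link
    "old switch port": 100e6,    # a 100Mbit hop hiding in the path
    "destination disk": 0.8e9,   # about 100 MB/s sequential write
}

name, rate = slowest_stage(path)
print(f"bottleneck: {name} at {rate / 1e6:.0f} Mbit/s")
```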
It is perfectly possible to do this - we do so routinely, but it has
taken a nontrivial amount of my time. We use the recommended settings at
fasterdata.es.net, and have done lots of testing with iperf.
We use globus gridftp to transfer data, and transfers are scheduled by
the File Transfer Service (FTS) - which schedules multiple files to be
transferred at the same time.
Globusonline may be an easier alternative for you. Aspera sell something
in this space - which AIUI uses UDP to transfer data, rather than TCP,
so is less sensitive to packet loss. I've no particular experience with
either of these.
A saturated Gbit link can transfer roughly 10TB in 24h. Achieving link
saturation is difficult - and likely to annoy other users of the link.
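
As a sanity check on that figure, the arithmetic can be sketched like
this. The efficiency parameter is an assumption standing in for protocol
overhead and less-than-perfect saturation; it is not a measured value.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_day(link_bits_per_s: float, efficiency: float = 1.0) -> float:
    """Bytes moved in 24h at the given link rate and fraction of line rate."""
    return link_bits_per_s / 8 * efficiency * SECONDS_PER_DAY


def hours_to_transfer(total_bytes: float, link_bits_per_s: float,
                      efficiency: float = 1.0) -> float:
    """Hours needed to move total_bytes at the given effective rate."""
    rate_bytes_per_s = link_bits_per_s / 8 * efficiency
    return total_bytes / rate_bytes_per_s / 3600


gbit = 1e9
print(f"per day at line rate: {bytes_per_day(gbit) / 1e12:.1f} TB")
print(f"10 TB at line rate: {hours_to_transfer(10e12, gbit):.1f} h")
print(f"10 TB at 50% of line rate: {hours_to_transfer(10e12, gbit, 0.5):.1f} h")
```

At full line rate this gives a little under 11 TB per day, so 10TB in
24h leaves only a small margin for overhead.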
Chris
[1] https://cmsweb.cern.ch/phedex/