[Gllug] Re-access broken remote connection to process?

John Hearns hearnsj at googlemail.com
Sun Jun 20 08:20:18 UTC 2010


On 20 June 2010 09:14, John Hearns <hearnsj at googlemail.com> wrote:
> On 20 June 2010 08:34, James Courtier-Dutton <james.dutton at gmail.com> wrote:
>>
>>
>> Just out of curiosity, what sort of application needs human
>> intervention every few hours?
>> Can the job be broken into multiple smaller jobs?
>
> I have helped people set up "computational steering"

Here's a better explanation of computational steering:

http://code.google.com/p/computational-steering/wiki/ComputationalSteering

It is also quite common to examine the output logs of long-running jobs
to check that the solution is not diverging.
Checkpointing is important for long-running jobs too - you might want
to stop the job and re-run it from a given point with different
parameters, or guard against hardware failure. If your job
runs for N days you would be unhappy if the system crashed when it was
five minutes from the end!
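
As a rough sketch of the checkpoint/restart idea (not taken from any
particular application; the file name, the state layout and the run()
helper are all made up for illustration), a Python job might save its
state every so often and pick up from the last saved point on the next
start:

```python
import json
import os
import tempfile

CHECKPOINT = "checkpoint.json"  # hypothetical file name


def save_checkpoint(path, state):
    """Write the state to a temp file, then rename it over the old checkpoint.

    The rename is atomic, so a crash during the write leaves the previous
    checkpoint intact rather than a half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if there is no checkpoint."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "value": 0.0}


def run(path, total_steps, every=100):
    """Run up to total_steps, checkpointing every `every` steps."""
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        state["value"] += 1.0  # stand-in for one unit of real work
        state["step"] += 1
        if state["step"] % every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)  # final state, even if not on a boundary
    return state


if __name__ == "__main__":
    print(run(CHECKPOINT, total_steps=1000))
```

If the job dies part-way through, running it again loses at most the
last `every` steps of work. Changing parameters before the restart is
then just a matter of editing them between runs; a real application
would have to decide which parameters are safe to change mid-run.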
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



