[Sderby] shell script for retrieving an image file from a remote http server

Paul Grosse paul-grosse at ntlworld.com
Sun Jul 4 01:15:26 BST 2004


----- Original Message ----- 
From: "Mike Hemstock" <hemstock at tiscali.co.uk>
To: "South Derby LUG General Mailing List" <sderby at mailman.lug.org.uk>
Sent: Saturday, July 03, 2004 6:21 PM
Subject: Re: [Sderby] shell script for retrieving an image file from a remote http server


> On Saturday 03 July 2004 10:34, Ashley Heath wrote:
> > On Sat, 3 Jul 2004 00:12:45 +0100
> >
> > "Paul Grosse" <paul-grosse at ntlworld.com> wrote:
> > > Folks,
> > >
> > > Does anybody know of a way (using commands that are likely to be found on
> > > most people's machines) of retrieving an image file from a remote http
> > > host? The commands would have to be representable in the form of a bash
> > > script (suitable for running as a cron tab) and I would prefer not to
> > > have to have people loading programs onto their machines in order to
> > > achieve this.
> > >
> > > One thing that sprung to mind was the use of telnet (having the telnet
> > > command in the shell script) with commands sent to it from its .telnetrc
> > > file, redirecting the output to a file which could then have the first
> > > bits of code removed as they are not part of the image.
> > >
> > > Doing this manually, it is something along the lines of...
> > > +++++++++++++++++++++++++++
> > > ==In BASH...
> > > telnet www.xxx.yyy.zzz 80 >[some_directory]/image.png
> > >
> > > ==In Telnet...
> > > GET /image.png HTTP/1.0[return]
> > > Host:www.domain.com[return]
> > > [return]
> > >
> > > ==In [some_directory], have a program that...
> > > opens the file, finds the content-length data in the header returned by
> > > the server.
> > >
> > > Then, (possibly another program or line(s) of script) starts from the end
> > > and counts that number of bytes towards the beginning then truncates the
> > > file, leaving just the image in the file (ie, chops the first bit off,
> > > leaving the rest).
> > >
> > > ==Then, in [some directory] rename the file with the date and time (say,
> > > in the form of 20040703000335.jpg)
> > > +++++++++++++++++++++++++++
> > > and that is all.
> > >
> > > Is there a bit of perl that is something along the lines of lwp-retrieve
> > > ... that will do the first part of this?
> > >
> > > I think that using telnet is a bit long winded although I don't want to
> > > have people loading libraries if they have a nice tight system and don't
> > > want loads of other stuff there that might just compromise their
> > > security.
> > >
> > > The idea is that people with web counters can retrieve the image for a
> > > certain time of the week, every week without having to do it manually -
> > > the only manual part is inspecting the images to collect the counter
> > > values.
> > >
> > > Any ideas anyone???
> >
> > It would be far easier to use wget (non interactive network downloader),
> > should be standard on all Linux distros eg wget
> > www.xxx.yyy.zzz/[some_directory]/image.png
> >
> > Cheers
> > Ash
>
> I'll second that.  Alternatively, you could tell the users to open up their
> browsers and look at the frigging thing themselves :-)
>
> Mike.
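
Incidentally, the header-stripping step I described above could probably be
done with standard tools along these lines (untested, paths are purely
illustrative, and as it turns out it isn't needed):

# Keep only the last Content-Length bytes of a raw response saved by telnet,
# which should leave just the image.
RAW=/some_directory/raw_response
OUT=/some_directory/image.png
LEN=$(grep -ai '^content-length:' "$RAW" | tr -d '\r' | awk '{print $2}')
tail -c "$LEN" "$RAW" > "$OUT"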

Using wget actually works better than I had anticipated - a little bit of
luck with the way that the files are stored.

As I've said before, the idea is that a web counter is queried at certain
times of the week and the number stored for later inspection.

It just so happens that when wget retrieves the image, FastCounter presents
it to the client with the counter value as part of the file name. The file
is also saved with the usual date/time stamp, so that part of the info is
there as well. In the end, both bits of post-processing - putting the
date/time into the file name and having the user inspect the image to read
off the number - are made redundant, since that information is already held
in the file's timestamp and in its name respectively. Also, because each
file name is different by definition, nothing needs to be renamed.
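
So, pulling the readings together later is just a matter of listing the
directory. As a rough illustration (paths as above), something like this
prints the retrieval time next to each file name:

for f in /home/paul/wget/*; do
  printf '%s  %s\n' "$(date -r "$f" +%Y%m%d%H%M%S)" "$(basename "$f")"
done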

With nothing else that needs doing, the (now) single line of code can go in
a crontab.

At the prompt, I used...
>wget --proxy=off http://[counter_url] -P /home/paul/wget -a /home/paul/wget/fc_log

and all of the info can be seen from Konqueror (without opening any files)
or from a shell using
>ls -l

It can also be seen from the log file. --proxy=off is used because I
normally go through Squid and the request won't work otherwise.
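
To run it unattended, an entry along these lines can go in the crontab (the
schedule here is only an example - one fetch every Sunday just after
midnight):

5 0 * * 0  wget --proxy=off http://[counter_url] -P /home/paul/wget -a /home/paul/wget/fc_log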

The site that my FastCounter is on (the one I'm trying this out on) gets
between 20,000 and 100,000 hits a week depending on the time of year -
between 150 and 500 visits per day (minima, as proxies mess everything up) -
so adding two hits per week (or even one per day) isn't going to boost the
figures substantially.

This was with FastCounter. Does anybody know if it works differently with
any others (ie, the count value not being in the file name)?
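
If one of them does come back under a fixed name, the date/time could be put
into the file name at download time instead (as per the original plan) -
something like this, untested, with the extension depending on what the
counter actually serves:

wget --proxy=off http://[counter_url] -O /home/paul/wget/$(date +%Y%m%d%H%M%S).png -a /home/paul/wget/fc_log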

Regards,


Paul Grosse



