[Nottingham] Web page scraping

Richard Morris richard at tannery.co.uk
Tue Jul 31 15:46:06 BST 2007


> -----Original Message-----
> From: nottingham-bounces at mailman.lug.org.uk
> [mailto:nottingham-bounces at mailman.lug.org.uk] On Behalf Of Martin
> Sent: 31 July 2007 15:38
> To: nottingham at mailman.lug.org.uk
> Subject: Re: [Nottingham] Web page scraping
> 
> Martin Garton wrote:
> > On Tue, 2007-07-31 at 15:14 +0100, Martin wrote:
> >> "Web page scraping":
> >>
> >> Anyone recommend any software for extracting info/tables from html
> >> and web pages?
> >
> > wget, awk, sed, grep.
> 
> Yes, and all glued together with scripting...
> 
> 
> This is the sort of thing that there just must be something along the
> lines of you download a page, highlight what you want, and then the
> item fields just magically pop out as pretty csv...
> 
Martin,

If it is a table of data that you want to capture, you could try selecting
and copying it in Firefox and then pasting it into OpenOffice Calc. It
usually makes a reasonably good go of copying the table.
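
If you would rather script it, something along these lines might do as a
starting point. It is only a rough sketch using the Python standard library
(html.parser and csv); the names TableExtractor, html_table_to_csv and SAMPLE
are made up for illustration, and it assumes simple, well-formed tables with
no nesting, rowspan or colspan.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None and self._row is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Hypothetical sample page; in practice the HTML would come from a file
# saved with wget or from the browser.
SAMPLE = (
    "<table>"
    "<tr><th>Name</th><th>Qty</th></tr>"
    "<tr><td>Widget, large</td><td>3</td></tr>"
    "</table>"
)

if __name__ == "__main__":
    print(html_table_to_csv(SAMPLE), end="")
```

The csv module takes care of quoting cells that contain commas, which is the
fiddly part to get right with sed and awk alone.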

Regards

Richard






