[Gllug] Screen-scraping tools (for HTML)?
Richard Jones
rich at annexia.org
Tue Nov 23 22:11:55 UTC 2004
Our current project involves "automating" Google AdWords. I'm doing
this by screen-scraping the HTML (LWP + HTML::TreeBuilder + lots of
OCaml glue code). Google's HTML is hideous - so hideous in fact that
HTML::TreeBuilder misparses a lot of it, resulting in nasty
workarounds all over the place.
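
[Illustration only, not the poster's code, which uses Perl's LWP and
HTML::TreeBuilder with OCaml glue.] One way to sidestep tree misparses on
sloppy markup is to skip building a tree altogether and pull out the wanted
pieces from a stream of tag events. The sketch below assumes the data of
interest sits in <td> cells and uses only Python's standard html.parser
module. The class name, the function name, and the sample markup are
hypothetical; real pages would likely need further rules.

```python
from html.parser import HTMLParser


class CellExtractor(HTMLParser):
    """Collect the text of every <td> cell, tolerating unclosed tags."""

    def __init__(self):
        super().__init__()
        self.cells = []        # finished cell texts, in document order
        self._current = None   # text pieces of the cell being read, or None

    def _flush(self):
        # Finish the cell in progress, if there is one.
        if self._current is not None:
            self.cells.append("".join(self._current).strip())
            self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._flush()      # a new <td> implicitly closes an unclosed one
            self._current = []
        elif tag == "tr":
            self._flush()      # so does the start of a new row

    def handle_endtag(self, tag):
        if tag in ("td", "tr", "table"):
            self._flush()

    def handle_data(self, data):
        # Text is kept only while inside a cell; inline tags such as <b>
        # are ignored, so their text still ends up in the cell.
        if self._current is not None:
            self._current.append(data)


def extract_cells(html):
    """Return the text of each <td> in html as a list of strings."""
    parser = CellExtractor()
    parser.feed(html)
    parser.close()     # process any text still buffered by the parser
    parser._flush()    # the last cell may never have been closed
    return parser.cells


# Hypothetical, deliberately malformed sample: missing </td>, </b> and </tr>.
sample = ("<table><tr><td>Campaign A<td><b>12 clicks</td></tr>"
          "<tr><td>Campaign B</td><td>7 clicks</table>")
print(extract_cells(sample))
```

Because nothing here depends on tags being balanced, a missing close tag
costs nothing; the trade-off is that nesting information is lost, so this
suits flat data such as table cells better than deeply structured pages.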
I'm thinking there must be an easier way ... Does anyone know of any
tools to help with automating / screen-scraping pages?
Rich.
--
Richard Jones. http://www.annexia.org/ http://www.j-london.com/
>>> http://www.team-notepad.com/ - collaboration tools for teams <<<
Merjis Ltd. http://www.merjis.com/ - improving website return on investment
http://subjectlink.com/ - Lesson plans and source material for teachers