[Liverpool] Software patent data
Aidan McGuire
amcguire at bluefountain.com
Mon Apr 3 09:56:10 BST 2006
If it's useful, I will offer to do the web site.
Aidan
On 2 Apr 2006, at 11:27, Julian Todd wrote:
> tony burrows wrote:
>> After the last meeting I started playing with the idea of getting
>> data from the patent office. Gave me an excuse to learn Python as
>> well.
>>
>> Currently I can take a downloaded page and grab the relevant
>> data into an xml file. I know how to use python to stuff it into
>> MySQL, but I have hit a couple of problems.
>>
>> First, I'm not sure how to navigate around pages automatically so
>> that I can grab stuff without having to do it all manually through
>> a browser.
>
> All you need is urllib.urlopen(), read(), urlparse.urljoin() and
> some regexp knowledge to get whatever you want from the internet,
> spider around it, and capture the data.
>
> http://docs.python.org/lib/module-urllib.html
>
>
> That's how I've done it for the whole of publicwhip. Arrange a
> date with me if you want to know how to get started. The technical
> term for what you are trying to do is making the data accessible.
> So, downloading all the data, adding a proper search engine, and
> reposting it in a useable form is not violating the copyright, it's
> making it accessible for people who can't handle their interface.
> Or so goes the argument. It hasn't been tested in court, but the
> moral defense is: if the patent office is willing to take on these
> improved capabilities which people need, then you will take your
> website down. It should be as legal as caching the webpages for
> quicker access.
>
> Julian T.
>
>
>
> _______________________________________________
> Liverpool mailing list
> Liverpool at mailman.lug.org.uk
> https://mailman.lug.org.uk/mailman/listinfo/liverpool
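
The approach Julian describes in the quoted message (urlopen(), read(), urljoin() and a regexp) might look roughly like the sketch below. It is only an illustration, not the code behind publicwhip: it uses the Python 3 module names (urllib.request and urllib.parse, where the message refers to the Python 2 urllib and urlparse), and the names fetch, extract_links, spider and max_pages are made up for this example. The href pattern is deliberately crude and assumes quoted attribute values.

```python
import re
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

# Crude href matcher (assumes quoted attribute values). A regexp is what the
# message suggests; a real HTML parser would be more robust.
HREF_RE = re.compile(r'href\s*=\s*["\']([^"\'#]+)', re.IGNORECASE)


def fetch(url):
    """Download a page and return its text (urlopen() followed by read())."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_links(html, base_url):
    """Return an absolute URL for every href found in the page text."""
    return [urljoin(base_url, href) for href in HREF_RE.findall(html)]


def spider(start_url, max_pages=10, fetch=fetch):
    """Breadth-first crawl that stays on the host of start_url.

    Returns a dict mapping each visited URL to its page text. The fetch
    parameter is hypothetical glue so the crawl can be tried offline.
    """
    host = urlparse(start_url).netloc
    queue, seen, pages = [start_url], {start_url}, {}
    while queue and len(pages) < max_pages:
        url = queue.pop(0)
        pages[url] = fetch(url)
        for link in extract_links(pages[url], url):
            # Follow only unseen links on the same host.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Calling spider() with a start URL would return the text of up to max_pages pages, which could then go through the existing page-to-XML step. Passing a different fetch function (for example one that reads saved files) lets the link-following logic be tried without touching the live site.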