[Nottingham] agent idents for web strippers and email harvesters
leigh at rylands-internet-solutions.co.uk
Sun Oct 19 22:48:59 BST 2003
Listers,
In a moment of lucidity, a rather simple but effective ploy occurred to me for clawing
back some (although probably not all) of the ground lost to the low-lifes who strip web
sites and use email harvesting software to collect addresses for spamming.
I have had this running on a number of sites for the past two days (I only had the
basic idea on Friday morning), and it has already turned away an email harvester from
one of the sites.
While this system is building up an extensive database of bona fide browser types,
what I really need is the HTTP_USER_AGENT strings for as many web strippers
and email harvesters as I can find.
I currently have:
WebStripper/2.58
autoemailspider
EmailWolf 1.00
Mozilla/4.0 (compatible; Advanced Email Extractor v2.xx)
Mozilla/4.0 (compatible; MSIE 5.0; Windows NT; DigExt; DTS Agent
... but could do with more.
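For anyone wanting a feel for the kind of check involved, here is a minimal sketch in Python. It is not the PHP system running on the sites. The name is_blocked_agent, the list name, and the case-insensitive substring matching are all assumptions made for illustration; the markers themselves are taken from the identifiers listed above.

```python
# Hypothetical marker list: lower-cased fragments of the agent strings
# quoted in this message. Substring matching is an assumption of this sketch.
BLOCKED_AGENT_MARKERS = [
    "webstripper",
    "autoemailspider",
    "emailwolf",
    "advanced email extractor",
    "dts agent",
]


def is_blocked_agent(user_agent):
    """Return True if the User-Agent string contains a known marker."""
    if not user_agent:
        # This sketch lets a missing or empty header through.
        return False
    ua = user_agent.lower()
    return any(marker in ua for marker in BLOCKED_AGENT_MARKERS)
```

A page handler would call is_blocked_agent() with the value of the request's User-Agent header and refuse to serve the page when it returns True. Bear in mind that a harvester can send any string it likes, so a check of this sort only catches agents that identify themselves.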
If you have a site that has been visited by one of these agents then please let me
have the HTTP_USER_AGENT identifier.
If anyone is interested in having their sites protected with this system then I will
probably be in a position to add other sites to the network in a few weeks.
At the moment only PHP pages can be protected, although I will extend this to ASP and
ColdFusion shortly.
I am currently refining the details of the service (e.g. automatic redirection for
WAP phones, WebTV, etc.) and building in some load balancing and redundancy: a chain
of servers that can handle the queries should the single server running the system at
the moment go down or become overloaded.
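The chain-of-servers idea can be sketched roughly as below, again in Python and purely as an illustration. The function name query_with_failover is hypothetical, and each server is modelled as a plain callable that either returns an answer or raises an exception; how the real servers would be queried is not specified here.

```python
def query_with_failover(servers, user_agent):
    """Ask each server in turn and return the first answer that comes back.

    `servers` is an ordered list of callables (an assumption of this sketch);
    a callable that raises is treated as a server that is down or overloaded.
    """
    last_error = None
    for server in servers:
        try:
            return server(user_agent)
        except Exception as exc:
            # Remember the failure and move on to the next server in the chain.
            last_error = exc
    raise RuntimeError("no server in the chain answered") from last_error
```

Trying the servers in a fixed order gives redundancy only; spreading the load would need the order to be rotated or shuffled between requests.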
Regards
Leigh Silvester