[Sussex] Apache throttling/fair use
Mark Harrison (Groups)
mph at ascentium.co.uk
Sun Jan 23 22:09:10 UTC 2005
Thomas Adam wrote:
> http://www.leekillough.com/robots.html
>
>Is quite good. And shows you how to negate an erroneous or faked
>userAgent string...
>
>-- Thomas Adam
>
>
>
Seems a good page, if dated.
The biggest problem these days is that so many users, particularly
hostile types, have dynamic IP addresses or use proxy servers. If you
simply "block hosts that ask for too much", you block that host for
that session, and then you also block the innocent, legitimate user who
"dials in" five minutes after the offender has disconnected and is
handed the same address. (I put "dials in" in quotes so we don't forget
how brain-damaged, say, BT BrokenWorld is in allocating IP addresses to
broadband users.)
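
To make the trade-off concrete, here is a minimal sketch of "block hosts
that ask for too much" where the block expires after a fixed time, so
whoever inherits the address next is only locked out briefly. This is
not how Apache or any particular module does it; the class name, the
limits and the 192.0.2.1 address are all made up for illustration.

```python
import time
from collections import defaultdict, deque


class ExpiringThrottle:
    """Hypothetical per-IP throttle: sliding window plus a time-limited block."""

    def __init__(self, max_requests=100, window=60.0, block_for=300.0):
        self.max_requests = max_requests  # requests allowed per window (assumed value)
        self.window = window              # window length in seconds (assumed value)
        self.block_for = block_for        # how long a block lasts (assumed value)
        self._hits = defaultdict(deque)   # ip -> timestamps of recent requests
        self._blocked_until = {}          # ip -> time at which the block lifts

    def allow(self, ip, now=None):
        """Return True if the request from `ip` should be served."""
        now = time.monotonic() if now is None else now

        until = self._blocked_until.get(ip)
        if until is not None:
            if now < until:
                return False
            # Block has expired: the address may belong to someone else by now.
            del self._blocked_until[ip]

        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        hits.append(now)

        if len(hits) > self.max_requests:
            self._blocked_until[ip] = now + self.block_for
            hits.clear()
            return False
        return True


if __name__ == "__main__":
    throttle = ExpiringThrottle(max_requests=3, window=10, block_for=300)
    for t in (0, 1, 2, 3, 100, 303):
        print(t, throttle.allow("192.0.2.1", now=t))
```

A short block_for limits how long the next person on that address
suffers, at the cost of letting a persistent offender back in sooner.
Behind a shared proxy, everyone still shares one counter, so this does
not solve that half of the problem.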
On my sites, I've found that probes for known exploits are far more
common than robots. There again, I -want- my site to be fully
indexed by every search engine in the English-speaking world, so my
robots.txt is basically a "come on in, read it all, boys" :-)
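
For anyone wondering what a "come on in" robots.txt looks like: the
usual form is a wildcard User-agent with an empty Disallow line. The
sketch below is a guess at the general shape, not a copy of my actual
file, and checks it with Python's standard urllib.robotparser. The bot
name and example.org URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Assumed permissive robots.txt: every agent, nothing disallowed.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def can_crawl(robots_txt, agent, url):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(can_crawl(ALLOW_ALL, "AnyBot", "http://example.org/any/page.html"))
```

Note that robots.txt is only a request to well-behaved crawlers; it does
nothing about the exploit probes.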
M.