[Sussex] Apache throttling/fair use

Mark Harrison (Groups) mph at ascentium.co.uk
Sun Jan 23 22:09:10 UTC 2005


Thomas Adam wrote:

> http://www.leekillough.com/robots.html
>
>Is quite good.  And shows you how to negate an erroneous or faked
>userAgent string...
>
>-- Thomas Adam
>
>  
>
Seems a good page, if dated.

The biggest problem these days is that so many users, particularly 
hostile types, have dynamic IP addresses or use proxy servers. If you 
simply "block hosts that ask for too much", you block that host for 
that session, and then you also block the innocent, legitimate user who 
"dials in" on the same address five minutes after the offender has 
disconnected. (I put "dials in" in quotes so we don't forget how 
brain-damaged, say, BT BrokenWorld is in allocating IP addresses to 
broadband users.)
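
One way to soften that problem is to make every block expire quickly, so a recycled address is not punished for long. The following is only a rough Python sketch of the idea, not anything Apache provides: the class name, the request budget, the window and the block length are all made-up values for illustration, and a short block reduces the collateral damage rather than removing it.

```python
import time


class ExpiringThrottle:
    """Hypothetical per-IP throttle whose blocks lapse after a short time."""

    def __init__(self, max_requests=100, window=60.0, block_for=120.0,
                 clock=time.monotonic):
        self.max_requests = max_requests  # assumed budget per window
        self.window = window              # seconds of history to count
        self.block_for = block_for        # seconds a block lasts (assumption)
        self.clock = clock                # injectable so it can be tested
        self.hits = {}                    # ip -> recent request timestamps
        self.blocked = {}                 # ip -> time the block expires

    def allow(self, ip):
        now = self.clock()
        expires = self.blocked.get(ip)
        if expires is not None:
            if now < expires:
                return False
            # The block has lapsed: forget the address entirely, since it
            # may by now belong to a different, innocent user.
            del self.blocked[ip]
            self.hits.pop(ip, None)
        # Keep only the requests that fall inside the counting window.
        recent = [t for t in self.hits.get(ip, []) if now - t < self.window]
        recent.append(now)
        self.hits[ip] = recent
        if len(recent) > self.max_requests:
            self.blocked[ip] = now + self.block_for
            return False
        return True
```

Calling allow("192.0.2.1") once per request returns False while that address is over budget or still blocked, and True again once the block has timed out.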

On my sites, I've found that probes for known exploits are a far more 
common thing than robots. There again, I -want- my site to be fully 
indexed by every search engine in the English-speaking world, so my 
robots.txt is basically a "come on in, read it all boys" :-)
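
For anyone wondering what a "come on in" robots.txt looks like: the sketch below is not my actual file, just the conventional allow-everything form (an empty Disallow line), checked with Python's standard urllib.robotparser. The is_allowed helper and the example paths are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Conventional "index everything" robots.txt: an empty Disallow
# means nothing is off limits to any user agent.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, path):
    """Hypothetical helper: would this robots.txt let user_agent fetch path?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)
```

With that, is_allowed(ALLOW_ALL, "Googlebot", "/any/page.html") comes back true, whereas the same check against a file containing "Disallow: /" does not.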

M.



