[Nottingham] Spam Filtering

Thu May 27 03:07:20 BST 2004

Martin <martin at ml1.co.uk> writes:
> Michael Leuty wrote:
> > On Wed, 2004-05-26 at 09:10, Duncan John Fyfe wrote:
> > >What is appropriate for business use and home use ?
> [...]
> > As an alternative, would it be practical for me to set up some software
> > (is it "fetchmail"?) to collect the mail from my ISP, run it through
> > TMDA on my machine, and then collect the authenticated mail using
> > Evolution?
>
> Fetchmail -> Postfix is trivially easy.

At home, using popsneaker as a preconnect option to fetchmail, I used
to delete approx 90% of my spam, based purely on regular expression
matching of headers. Until recently, I had made only two mistaken
deletions in about 6000. The rules needed updating several times a week
and were ruthless, for example:

deny "^(Message-ID|From): .*yahoo\.com\.hk"
deny "^Subject: .*cheap"

I would struggle to write regular expressions that match most of the
current spam. The spammers are getting better at disguising it.
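
As a rough illustration of header-only matching of this kind, here is a
minimal Python sketch. It is not popsneaker itself: the names
DENY_PATTERNS and should_delete are made up for the example, the two
patterns are the deny lines quoted above, and matching is assumed to be
case-sensitive and applied to one unfolded header line at a time.

```python
import re

# Hypothetical rule list, copied from the two deny lines quoted above.
DENY_PATTERNS = [
    r"^(Message-ID|From): .*yahoo\.com\.hk",
    r"^Subject: .*cheap",
]
DENY_RULES = [re.compile(p) for p in DENY_PATTERNS]


def should_delete(header_lines):
    """Return True if any header line matches any deny rule.

    Assumption: each element is a single, already unfolded header line,
    and matching is case-sensitive.
    """
    return any(
        rule.search(line)
        for line in header_lines
        for rule in DENY_RULES
    )
```

With rules this blunt, any Subject containing "cheap" is deleted unread,
which is why such a list is only suitable where the odd lost message is
acceptable.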

Fortunately, my ISP (freeserve/wanadoo) have started to label spam
before delivering it. They are currently recognising about 99% of it
and improving almost weekly. I've not had a single false positive
from them in about 5000 spams.

So now, the main re that popsneaker needs in its filter is simply

deny "^Subject: \*\*\* SPAM \*\*\*"

This method also avoids downloading the bodies of spam, which saves a
lot of time on a slow modem link.
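
To show how the body download can be avoided, here is a minimal sketch
in Python using the standard poplib module. It is an assumption about
how such a filter could work, not popsneaker's actual code: it relies on
the server supporting the optional POP3 TOP command, and the names
SPAM_LABEL, delete_labelled_spam and run are invented for the example.

```python
import poplib
import re

# The ISP's label, as in the deny rule above (bytes, since poplib returns bytes).
SPAM_LABEL = re.compile(rb"^Subject: \*\*\* SPAM \*\*\*")


def delete_labelled_spam(conn):
    """Mark messages carrying the spam label for deletion.

    conn is a poplib.POP3-style object that is already logged in.
    Returns the list of message numbers marked for deletion.
    """
    count, _size = conn.stat()
    deleted = []
    for num in range(1, count + 1):
        # TOP <num> 0 asks for the headers and zero lines of the body.
        _resp, lines, _octets = conn.top(num, 0)
        if any(SPAM_LABEL.search(line) for line in lines):
            conn.dele(num)
            deleted.append(num)
    return deleted


def run(host, user, password):
    """Hypothetical wrapper: connect, filter, and commit the deletions."""
    conn = poplib.POP3(host)
    conn.user(user)
    conn.pass_(password)
    try:
        return delete_labelled_spam(conn)
    finally:
        conn.quit()  # deletions take effect when the session ends
```

Only the headers of each message cross the link. The remaining mail can
then be collected in the usual way, for example by fetchmail.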

I wouldn't recommend this setup (without freeserve/wanadoo's help)
for anyone who cannot afford to lose a message - it would struggle to match
sufficient spam now, even when applied ruthlessly.

It would be interesting to hear what freeserve/wanadoo are using.
They must process something like a factor of 10^7 more messages than I
do in a day.

Ted.
-- 
Ted <ted at nowtsfree.freeserve.co.uk>
   http://www.nowtsfree.freeserve.co.uk