[Sderby] Procmail & SpamAssassin

Wed Jun 25 11:36:01 2003

On Tue, 24 Jun 2003 19:10:33 +0100, you wrote:

>Mike,
>
>Which MTA are you using?  I assume either Postfix or Sendmail?  Have you
>perhaps looked at Exim4?  It has, as part of the MTA, a component called
>Exiscan, which works a treat on my mailservers.  I have used SpamAssassin,
>but find Exiscan works great for my needs.
>
Eeek.  ExiScan is a great bit of kit, but unless you *need* to do an
SMTP reject based on whether a message is thought to be spam, it is
*hideously* overkill.  It has its place (which is why it exists), but by
far the easiest way is to call spamassassin from your .procmailrc.  This
has the advantage that you can use the new bayesian filtering and correct
its wrong guesses: as you haven't rejected the mail, you get to build
up a spam corpus, which after a while makes spamassassin really
accurate.
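
For anyone who hasn't set this up before, a minimal sketch of calling
spamassassin from a .procmailrc might look like the following.  The
256000-byte size cut-off, the lock file name and the "caughtspam"
mailbox name are illustrative assumptions, so adjust them to taste.

--- begin procmailrc snippet

# Pipe messages under roughly 250KB through spamassassin as a
# filter (f) and wait for it to finish (w).  The lock file name is
# an assumption; it only stops several copies running at once.
:0fw: spamassassin.lock
* < 256000
| spamassassin

# Anything spamassassin tagged as spam is filed in its own mailbox
# (hypothetical name) rather than rejected, so wrong guesses can
# still be found and corrected later.
:0:
* ^X-Spam-Status: Yes
caughtspam

--- end procmailrc snippet.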

I've got some excellent procmailrc rules (below) for splitting up
mailing lists into mailboxes too.  These rules are fairly process
bound (each delivery runs echo and sed), but they are completely
generic: you just need these rules to manage all your mailing lists,
now and in the future.  They beat having to add a new procmailrc entry
for each list you subscribe to.

They will get procmail to place the messages into mailboxes named by
the root of the mailing list, i.e. the part of the list address before
the "@" (South Derby LUG posts go into a mailbox called "sderby").
These mailboxes are created in the "lists" subdirectory of wherever
your mail directory is.

--- begin procmailrc snippet

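# In every rule, \/ starts the text captured into $MATCH, and sed
# turns any "/" in it into "_" so it stays a single file name.

# X-BeenThere (added by Mailman, among others)
:0:
* ^X-BeenThere: \/[^@]+
lists/`echo $MATCH | sed -e 's/[\/]/_/g'`

# "Delivered-To: mailing list ..." (as written by ezmlm)
:0:
* ^Delivered-To: mailing list \/[^@]+
lists/`echo $MATCH | sed -e 's/[\/]/_/g'`

# X-Mailing-List with the address in angle brackets
:0:
* ^X-Mailing-List: <\/[^@]+
lists/`echo $MATCH | sed -e 's/[\/]/_/g'`

# X-Loop, used by some list managers to mark their own traffic
:0:
* ^X-Loop: \/[^@]+
lists/`echo $MATCH | sed -e 's/[\/]/_/g'`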

--- end procmailrc snippet.

I've also got some mutt config lines that go with it, available on
request so that mutt auto-handles all lists for you.  But as no-one
except me seems to use a non-GUI mailer, I won't waste any more
bandwidth.  Mail me privately for details...

If you do go down the route of spamassassin - which I'd thoroughly
recommend; in my experience it's a stunningly good piece of software,
even without the bayesian filtering that got added a while ago - then
don't use spamassassin in standalone mode, but use the spamd facility
instead.  The reason for this is that spamassassin is written in perl,
and if you call it for every single piece of mail, it can very quickly
bog down the system because it spends most of its time loading and
unloading perl.  I had a flood of 100 messages come from a stalled
mailserver for a list, and it completely stuffed my Athlon 1700+
server, pinning it to 100% processor usage and exhausting all 900MB of
RAM.  Just a friendly warning :)
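
As a rough sketch of the spamd route, assuming spamd is already
running on the same machine (for example started with "spamd -d"),
the procmail side only needs to hand each message to the small spamc
client.  The 256000-byte size cut-off is an illustrative assumption.

--- begin procmailrc snippet

# Filter (f) each message under roughly 250KB through spamc and wait
# (w) for the result.  spamc is a lightweight client that passes the
# message to the long-running spamd, so perl is not started afresh
# for every mail.
:0fw
* < 256000
| spamc

--- end procmailrc snippet.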

Hope this helps.

Dave.