[Nelug] Spam Filtering
Martin Ward
Martin.Ward at durham.ac.uk
Tue Apr 13 22:17:43 UTC 2004
I have just implemented a Perl module and some supporting scripts
for spam filtering. The module is called Mail::SpamFilter
and is available from my web pages:
http://www.cse.dmu.ac.uk/~mward/martin/software
http://www.dur.ac.uk/martin.ward/martin/software
The module presents a uniform interface for passing a message through each
filter and determining which filters consider the message to be spam.
The spamcheck script passes a copy of the given message to each filter and
counts how many filters consider it to be spam. It adds an X-Spam-Votes: header
with the total.
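To illustrate the voting idea, here is a minimal Python sketch. It is not the Mail::SpamFilter code and does not use its interface: the two toy filters and the names FILTERS, count_votes and add_votes_header are made up for the example, and the real filters are external programs. Removing any existing X-Spam-Votes: header before adding the new one is also my assumption, not something spamcheck is known to do.

```python
from email import message_from_string

# Hypothetical stand-in filters. Each takes the raw message text and
# returns True if it considers the message to be spam.
def mentions_viagra(raw):
    return "viagra" in raw.lower()

def shouting_subject(raw):
    subject = message_from_string(raw)["Subject"] or ""
    return subject.isupper()

FILTERS = {"viagra": mentions_viagra, "shouting": shouting_subject}

def count_votes(raw, filters):
    """Pass the message to every filter and count the positive results."""
    return sum(1 for is_spam in filters.values() if is_spam(raw))

def add_votes_header(raw, filters):
    """Return the message with an X-Spam-Votes: header holding the total."""
    msg = message_from_string(raw)
    # Assumption: drop any header already present so the sender cannot
    # forge a vote count.
    del msg["X-Spam-Votes"]
    msg["X-Spam-Votes"] = str(count_votes(raw, filters))
    return msg.as_string()
```

Calling add_votes_header(raw_message, FILTERS) returns the message text with the vote total recorded in its headers, which is the only thing the procmail rules need to look at.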
I currently delete everything with three or more votes and quarantine
everything with one or two votes using these procmail rules:
:0fw: spamcheck.lock
| spamcheck
# Record the votes in the procmail log file:
:0
* ^X-Spam-Votes: \/.*$
{ LOG="Spam-Votes: ${MATCH}" }
# Junk anything that 3 or more scanners give a positive result on.
:0
* ^X-Spam-Votes: [3456789]
/dev/null
# Quarantine anything else which any scanner considers to be spam:
:0
* ^X-Spam-Votes: [12]
SPAM
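The decision these rules make can be sketched in Python as follows. This is only an illustration of the thresholds, not part of the module; the function name route and its return values are made up for the example. One difference worth noting: the procmail conditions look only at the first digit of the count, which is fine with four filters, whereas the sketch compares the whole number.

```python
import re

# Matches the vote count in a block of message headers.
VOTES_RE = re.compile(r"^X-Spam-Votes: (\d+)", re.IGNORECASE | re.MULTILINE)

def route(headers, delete_at=3):
    """Decide what to do with a message from its X-Spam-Votes: header."""
    match = VOTES_RE.search(headers)
    votes = int(match.group(1)) if match else 0
    if votes >= delete_at:
        return "delete"        # corresponds to the /dev/null rule
    if votes >= 1:
        return "quarantine"    # corresponds to the SPAM folder rule
    return "deliver"           # no filter voted, normal delivery
```

For example, route("X-Spam-Votes: 2\n") gives "quarantine", and a message with no votes header at all is delivered normally.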
The isspam and notspam scripts can be used to train your filters. Any spam
message which is missed by any filter can be passed to isspam while false
positives should be passed to notspam.
The spam filters the module currently knows about are:
* SpamAssassin
* The CRM114 Discriminator
* Nuclear Elephant: DSPAM
* WPBL - Weighted Private Block List
(See the web page for URLs for each of these.)
Let me know what you think of it!
--
Martin
Martin.Ward at durham.ac.uk http://www.cse.dmu.ac.uk/~mward/ Erdos number: 4
G.K.Chesterton web site: http://www.cse.dmu.ac.uk/~mward/gkc/