[Gllug] How do I do this?
Martin A. Brooks
martin at hinterlands.org
Mon Jan 17 18:16:42 UTC 2005
Nix wrote:
>On Mon, 17 Jan 2005, Martin A. Brooks suggested tentatively:
>
>
>>sa-learn isn't the quickest running program ever, especially on large
>>numbers of spam messages. If you run it often, it's probably worth
>>adding a lock file check to avoid multiple runs on the same messages.
>>
>>
>
>Er, it takes very little time for sa-learn to recognise that it's seen a
>message again; it doesn't try to learn the same messages over and over.
>
>
You misunderstand.
The script works by going through every message in the designated IMAP
folder, retrieving it and then feeding the message to sa-learn with "no
sync" set. Once all of the messages have been learned, it then deletes
all the messages, one by one, and finally syncs the Bayesian db.
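For anyone who hasn't seen the script, the flow described above can be
sketched roughly as follows. This is not the script itself, only an
illustration in Python: the function name learn_folder, the injectable
"run" parameter and the choice of --spam are my assumptions; --no-sync
and --sync are real sa-learn options, and sa-learn reads a message from
stdin when no file is given.

```python
import imaplib  # conn below is expected to be an imaplib.IMAP4 / IMAP4_SSL
import subprocess


def learn_folder(conn, folder, run=subprocess.run):
    """Learn every message in `folder`, delete them, then sync once.

    `conn` is an already logged-in IMAP connection. `run` is injectable
    (hypothetical convenience) so the flow can be exercised without
    sa-learn installed.
    """
    conn.select(folder)
    _status, data = conn.search(None, "ALL")
    ids = data[0].split()

    # Pass 1: fetch each message and feed it to sa-learn without syncing.
    for msg_id in ids:
        _status, parts = conn.fetch(msg_id, "(RFC822)")
        raw = parts[0][1]  # raw RFC 822 message bytes
        run(["sa-learn", "--spam", "--no-sync"], input=raw, check=True)

    # Pass 2: only after everything is learned, delete one by one.
    for msg_id in ids:
        conn.store(msg_id, "+FLAGS", "\\Deleted")
    conn.expunge()

    # Finally, sync the Bayesian db once for the whole batch.
    run(["sa-learn", "--sync"], check=True)
    return len(ids)
```

Note that nothing is deleted until the whole first pass has finished,
which is why a long first pass leaves the messages sitting in the folder
for the next run to find.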
If this process takes longer than the interval between runs of the
script then it's possible for the same message to be sent back to
sa-learn by an overlapping run. It won't be learned twice, of course,
but sa-learn still has to do the work to figure that out, which takes
some time, though perhaps not a lot, as you say.
My suggestion of a lock file is purely to avoid this scenario: the
script should do nothing and exit if the lock file is present.
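A minimal sketch of such a check, again in Python and purely
illustrative: the lock path and the function names are made up for the
example. It relies on O_CREAT | O_EXCL, which makes file creation fail
atomically if the file already exists, so two runs cannot both believe
they hold the lock.

```python
import os


def acquire_lock(path):
    """Atomically create the lock file; return False if it already exists."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    # Record our PID so a human can see which run holds the lock.
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))
    return True


def release_lock(path):
    """Remove the lock file so the next run can proceed."""
    os.remove(path)


def main(lock_path="/tmp/sa-learn-imap.lock"):  # hypothetical path
    if not acquire_lock(lock_path):
        # A previous run is still going: do nothing and exit.
        return 0
    try:
        pass  # the learn / delete / sync work would go here
    finally:
        release_lock(lock_path)
    return 0
```

One caveat with this simple form: if a run is killed before it can
remove the file, the stale lock blocks every later run until someone
deletes it by hand.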
I tripped over this problem when I dropped about 6000 spam messages into
the "to learn" folder on a very low-powered box with the script set to
run at 15-minute intervals.
Mart.
--
Gllug mailing list - Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug