[Gllug] How do I do this?

Martin A. Brooks martin at hinterlands.org
Mon Jan 17 18:16:42 UTC 2005


Nix wrote:

>On Mon, 17 Jan 2005, Martin A. Brooks suggested tentatively:
>
>>sa-learn isn't the quickest running program ever, especially on large
>>numbers of spam messages. If you run it often, it's probably worth
>>adding a lock file check to avoid multiple runs on the same messages.
>
>Er, it takes very little time for sa-learn to recognise that it's seen a
>message again; it doesn't try to learn the same messages over and over.

You misunderstand.

The script works by going through every message in the designated IMAP
folder, retrieving it and then feeding the message to sa-learn with "no
sync" set. Once all of the messages have been learned, it then deletes
all the messages, one by one, and finally syncs the Bayesian database.
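
For illustration, the process above might look something like this in
Python. This is only a sketch, not the actual script: the function name
learn_and_purge, the folder name "to-learn" and the choice of --spam are
assumptions, and it assumes sa-learn reads a message on standard input
when no file is given. The conn argument is expected to be an
already-authenticated imaplib connection.

```python
import subprocess


def learn_and_purge(conn, folder="to-learn", run=subprocess.run):
    """Feed every message in `folder` to sa-learn, delete them, then sync.

    `conn` is an authenticated imaplib.IMAP4 (or IMAP4_SSL) object.
    `run` is injectable so the function can be exercised without sa-learn.
    """
    conn.select(folder)
    _, data = conn.search(None, "ALL")
    ids = data[0].split()

    # Pass 1: learn each message without syncing the Bayes journal.
    for msg_id in ids:
        _, parts = conn.fetch(msg_id, "(RFC822)")
        raw_message = parts[0][1]
        run(["sa-learn", "--spam", "--no-sync"], input=raw_message, check=True)

    # Pass 2: delete the messages one by one, then expunge.
    for msg_id in ids:
        conn.store(msg_id, "+FLAGS", "\\Deleted")
    conn.expunge()

    # Finally, sync the database once for the whole batch.
    if ids:
        run(["sa-learn", "--sync"], check=True)
    return len(ids)
```

Note that nothing is deleted until every message has been learned, so
a second run that starts during pass 1 sees the whole folder again.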

If this process takes longer than the interval between runs of the
script, the next run starts before the messages have been deleted, so
the same message can be sent back to sa-learn.  It won't be learned a
second time, of course, but sa-learn still has to do the work to figure
that out, which takes some time, perhaps not a lot as you say.

My suggestion of a lock file is purely to avoid this scenario: the
script should really do nothing and exit if the lock file is present.
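
A minimal sketch of such a lock file check in Python follows. The lock
path and the names acquire_lock, release_lock and run_once are made up
for the example. It relies on O_CREAT | O_EXCL so that creating the
file and checking for it happen as one atomic step.

```python
import os

LOCK_PATH = "/tmp/sa-learn-imap.lock"  # hypothetical location


def acquire_lock(path=LOCK_PATH):
    """Create the lock file atomically; return False if it already exists."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as fh:
        fh.write(str(os.getpid()))  # record the owner for debugging
    return True


def release_lock(path=LOCK_PATH):
    """Remove the lock file so the next scheduled run can proceed."""
    os.remove(path)


def run_once(job, path=LOCK_PATH):
    """Run job() unless another run holds the lock; report whether it ran."""
    if not acquire_lock(path):
        return False  # a previous run is still going: do nothing
    try:
        job()
    finally:
        release_lock(path)
    return True
```

One caveat with this approach: if a run is killed before it can remove
the file, the lock is left behind and has to be cleared by hand.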

I tripped over this problem when I dropped about 6000 spam messages into
the "to learn" folder on a very low-powered box with the script set to
run at 15-minute intervals.

Mart.
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug