[Gllug] A few words on the topic of stock spam

Martin A. Brooks martin at hinterlands.org
Tue Jul 10 22:51:48 UTC 2007


Nix wrote:
> On 10 Jul 2007, Martin A. Brooks stated:
>
>> Nix wrote:
>>
>>> FWIW, FuzzyOCR with a pipeline that turns the jpegs into images and then
>>> OCRs them as usual does a reasonable job on this (if you ignore the
>>> hokey horrible method FuzzyOCR uses to identify spammy words: I really
>>> must get this stuff fed through Bayes like everything else).
>>>       
>> Perhaps you misread a little. We're already past that stage; PDF spam is
>> all the rage.
>>     
>
> I can't parse this at all. We're already past *what* stage? I never
> mentioned any sort of stage.
>   

We're past the stage where FuzzyOCR is an effective way of picking out
this stuff: the spammers have already moved on to PDF attachments.

-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



