[Gllug] A few words on the topic of stock spam
Martin A. Brooks
martin at hinterlands.org
Tue Jul 10 22:51:48 UTC 2007
Nix wrote:
> On 10 Jul 2007, Martin A. Brooks stated:
>
>
>> Nix wrote:
>>
>>> FWIW, FuzzyOCR with a pipeline that turns the jpegs into images and then
>>> OCRs them as usual does a reasonable job on this (if you ignore the
>>> hokey horrible method FuzzyOCR uses to identify spammy words: I really
>>> must get this stuff fed through Bayes like everything else).
>>>
>> Perhaps you misread a little. We're already past that stage; PDF spam is
>> all the rage.
>>
>
> I can't parse this at all. We're already past *what* stage? I never
> mentioned any sort of stage.
>
We're past the stage where FuzzyOCR is an effective method of picking
out this stuff. The stock spam now arrives as PDFs.
--
Gllug mailing list - Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug