[Sussex] SPAM Filtering Revisited

Mon Aug 21 13:39:53 UTC 2006

On Mon, Aug 21, 2006 at 01:32:32PM +0100, Steven Dobson wrote:
> Andy
> 
> On Mon, 2006-08-21 at 11:03 +0000, Andy Smith wrote:
> > On Mon, Aug 21, 2006 at 08:01:30AM +0100, Steven Dobson wrote:
> > > Given that CBV is not as onerous as a DSN storm aren't you better off
> > > accepting the CBV load?  CBV requires no human intervention.  Isn't
> > > a completely automatic approach the best way to stop spam?
> > 
> > Why do I have to accept either?
> 
> Because if you're not part of the solution you're part of the problem. A
> prime example would be an open relay.  It may not generate spam but it
> surely does distribute it!

Since I am not the one running the open relay and I am not the one
sending you the forged emails I'm not sure where the parallel lies.
I can justify a lot of actions under the banner of "well if you
don't want to be part of the solution..!"  Whose solution?

> > There is no requirement for me to receive DSNs from this.
> 
> I think so - isn't that the situation that you talk about below?

If you receive an email purportedly from sdfknj4u at strugglers.net and
it turns out to be spam as determined by your systems during the
SMTP conversation with whatever compromised machine sends it to you,
your machine can reply with a 5xx response and no one gets a DSN.

If you accept it and then later decide you aren't going to deliver
it then my systems will receive a DSN from you.

If you engineer things correctly I do not need to receive DSNs in
the majority of cases.
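
To make the difference between the two paths concrete, here is a rough
sketch in Python.  The function names and the three boolean checks are
invented for illustration; they are not taken from any real MTA.

```python
def reply_during_smtp(rcpt_exists, sender_domain_in_dns, is_spam):
    """Decide the reply while the sending host is still connected.

    Hypothetical checks; a real MTA would run them at RCPT or DATA time.
    """
    if not rcpt_exists or not sender_domain_in_dns or is_spam:
        return 550  # permanent failure: nothing accepted, so no DSN from us
    return 250      # accepted: we are now responsible for the message


def dsn_sent_by_receiver(smtp_reply, delivered_later):
    """True if the receiving site itself ends up sending a DSN."""
    accepted = smtp_reply == 250
    return accepted and not delivered_later


# Spam spotted during SMTP: rejected, and the forged sender hears nothing from us.
reply = reply_during_smtp(rcpt_exists=True, sender_domain_in_dns=True, is_spam=True)
print(reply, dsn_sent_by_receiver(reply, delivered_later=False))

# Spam accepted and only dropped later: a DSN goes to the forged sender.
print(dsn_sent_by_receiver(250, delivered_later=False))
```

The point of the sketch is only that the DSN appears in the second
path, never in the first.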

> Please consider this.  If my MTA isn't querying your MTA to do a CBV then
> one of three things can happen:
> 
>   1). The spam has an undeliverable local part to my domain and is
>       rejected.  As per the RFC the MTA upstream from me generates a
>       DSN that it couldn't deliver the message to me.

Not if it happens during SMTP which is better for all concerned.
Note that if you send a DSN to an address that does not exist at
my site then your mailer-daemon receives a double bounce - it's in
your interest to sort this out!

This is the best result out of the three, when done properly.

>   2). The spam has a valid local part and is delivered to a user.
>       The user has two options:
> 
>       a). Ignore the spam - The Netiquette way.

Yes, the second best outcome.

>       b). Decide that enough is enough and reply saying "Please
>           do not spam me again" and for good measure copy the postmaster
>           at your domain so you can take action against your
>           user abusing your systems.

People who naively reply to spam this way need to be educated.

> > I'm not proposing any change to the RFC.  A DSN need only be generated
> > once an email is accepted.  There is no need to accept an email that
> > is destined for a user that does not exist, that comes from a site
> > that does not exist in DNS, that you have already identified as
> > spam, or for any other reason if you don't want to.  You just issue
> > a 5xx response at RCPT or DATA phase of the SMTP conversation and
> > the connection is dropped without anyone getting a DSN.
> 
> This is basic server verification without callback is it not?

Yes, plus any of the ways you can determine that a mail is not one
that you wish to accept.

> In exim
> that's the "verify = sender" in acl_check_rcpt.  I'm doing that.  But if
> I generate a 5xx response doesn't the upstream MTA generate a DSN for you
> because it was unable to deliver the message to me?

Yes.  But that is not your problem.  You did not generate it.
Fixing a mail server to do most of its rejections during SMTP pushes
the responsibility back towards the actual entities doing the
sending.

Most spam is sent directly by compromised hosts with no real
SMTP engine.  They get an error and just give up.  The authors of
such malware have no reason to implement anything that wastes their
stolen bandwidth in sending DSNs.

You're right that if a compromised machine was sending through an
ISP smarthost then the smarthost would be the one generating the
DSN.  It may be worth using CBV then, but if spammers move to doing
this then why won't they also move to using real addresses (which
bypasses CBV)?
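
Put as a small Python sketch (the names and the two client categories
are my own shorthand, not anything a real MTA exposes), what happens
after a 5xx rejection depends on what kind of client was talking to
you:

```python
def dsn_generator_after_5xx(client_kind):
    """Who, if anyone, sends a DSN once the receiver has answered 5xx.

    client_kind is a hypothetical label: "compromised_host" for malware
    with no real SMTP engine, "smarthost" for a real MTA relaying for it.
    """
    if client_kind == "compromised_host":
        return None         # gets the error and just gives up
    if client_kind == "smarthost":
        return "smarthost"  # a real MTA bounces to the envelope sender
    raise ValueError("unknown client kind: %r" % client_kind)


def cbv_rejects(sender_address_exists):
    """CBV only catches mail whose claimed sender does not exist."""
    return not sender_address_exists


print(dsn_generator_after_5xx("compromised_host"))
print(dsn_generator_after_5xx("smarthost"))
print(cbv_rejects(sender_address_exists=True))
```

The last line is the worry: once the forged address is a real one, the
callback succeeds and CBV no longer rejects anything.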

> > There is a limit to how far you can go, though.  For example,
> > andy at lug.org.uk forwards to andy at strugglers.net which are on
> > different machines.  If mail-in-01.lug.org.uk accepts some spam for
> > andy at lug.org.uk but mail.strugglers.net decides to not accept it
> > then I am going to cause mail-in-01.lug.org.uk to generate a DSN.
> 
> Without CBV my MTA has no way of knowing if the e-mail claiming to come
> from your domain is from a valid user.  If the user isn't valid (the
> most likely if spammers are just selecting local parts at random) then
> CBV could cost you nothing extra if I accept the message from upstream
> and deliver it to /dev/null on the assumption that it is spam.  I'm happy
> to pay the cost of eating the spam if you're prepared to pay the CBV
> costs.

If it were a case of paying for your servers to connect to mine and
check that the mail is really from one of my users then fine.  The
problem is that the cost of CBV is that many servers worldwide
will connect to mine to validate emails claiming to come from users I
don't have.  When you get a spam, it wasn't just sent to you.  It was
sent to millions of people.  If random local parts at strugglers.net are
used in the spam run and if use of CBV was widespread then I would
be looking at potentially millions of SMTP connections in a short
space of time.
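
As a back-of-envelope illustration in Python: the figures below are
made up purely to show the shape of the arithmetic, and the function
name is my own.

```python
def cbv_callbacks(spam_recipients, percent_sites_using_cbv):
    """Estimate callback connections hitting the forged domain.

    Assumes, for simplicity, one callback per spam message received by
    a site that uses CBV, and that every message in the run forges the
    same domain.  Both assumptions are illustrative only.
    """
    return spam_recipients * percent_sites_using_cbv // 100


# Hypothetical spam run: 5 million recipients, 20% of them behind CBV.
print(cbv_callbacks(5_000_000, 20))
```

Even if only a modest share of receiving sites did callbacks, the
forged domain would bear a load proportional to the size of the whole
spam run, not to its own mail volume.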

While CBV would help the receiving sites decide that there is no
point accepting the spam, there is no excuse for doing this if it
hammers me into the ground.

That is why I say, fix the bounces first, as much as possible, and
then start to think about limited uses of CBV.

> Problems occur when the spam's sender address is valid:

If widespread use of CBV does not kill some sites and get itself an
even worse name then its long term outcome will be to force spammers
to use real addresses.

I don't know what the answer is.  I have hopes for domain keys...

Andy