[Gllug] Lists of bad IPs?
Richard Jones
rich at annexia.org
Mon Sep 15 09:34:37 UTC 2008
On Mon, Sep 15, 2008 at 12:04:23AM +0100, Alistair Mann wrote:
> Richard Jones wrote:
> > So ... What is the current state of databases of "bad" IPs? I'm aware
> > of DenyHosts but they seem to concentrate on port 22, and in any case
> > I'm suspicious of their policies for adding IPs and you can't just
> > download the list.
>
> Which would seem to make sense -- Alice would know instantly were one
> of her zombies on the list, and instruct it to forgo attacks. When the
> zombie came off the list, the attacks could then resume. This would give
> her zombie an effective operational window the size and period of the IP
> list update cycle. By not making the list downloadable, the zombie
> cannot help but constantly reveal its presence, and thus give it no
> operational period other than its first.
It still means the zombie (or rather the IP) is globally unusable for
the whole period, thus incurring a significant extra cost on the
spammer. And an algorithm which isn't completely naive will remember
times when an IP was blocked, and exponentially increase the block
time each additional time it is blacklisted. I don't see how
downloading the blacklist makes this any easier for a spammer.
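
To make the escalation concrete, here is a minimal Python sketch of a
blocklist that doubles the block time each time an IP is relisted. The
names (Blocklist, BASE_BLOCK, MAX_BLOCK, MAX_DOUBLINGS) and the
one-hour and thirty-day figures are assumptions picked for
illustration; they are not taken from DenyHosts or any real list.

```python
from datetime import datetime, timedelta

BASE_BLOCK = timedelta(hours=1)   # assumed length of a first block
MAX_BLOCK = timedelta(days=30)    # assumed upper bound on any block
MAX_DOUBLINGS = 10                # stops the exponent growing without limit


class Blocklist:
    """Remembers how often each IP was listed and escalates the block time."""

    def __init__(self):
        self.times_listed = {}    # ip -> number of previous listings
        self.blocked_until = {}   # ip -> datetime when the block expires

    def block(self, ip, now):
        """List `ip` at time `now`; return the block duration applied."""
        count = self.times_listed.get(ip, 0)
        doublings = min(count, MAX_DOUBLINGS)
        duration = min(BASE_BLOCK * (2 ** doublings), MAX_BLOCK)
        self.times_listed[ip] = count + 1
        self.blocked_until[ip] = now + duration
        return duration

    def is_blocked(self, ip, now):
        """True while the most recent block on `ip` has not yet expired."""
        until = self.blocked_until.get(ip)
        return until is not None and now < until
```

Because the listing count survives the expiry of a block, a zombie that
waits out one block and then resumes is hit with a longer one next
time, whether or not its owner can see the list.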
Anyway ...
My initial attempt at modelling this wasn't successful. I assumed
that you would log both the reported IP address and the reporter's IP
address and use this to build a reputation model of IP addresses. So
if 2.* are "good" IPs and 9.* are bad IPs you'd get something like
this:
  date   reporter IP   reported IP   weight
  -----------------------------------------
  13/9   2.1.1.1       9.1.1.1           -1
  13/9   2.1.1.1       9.1.1.2           -1
  13/9   2.1.1.1       9.1.1.3           -1
  13/9   2.1.1.1       9.1.1.4           -1
  14/9   9.1.1.1       2.1.1.1        -1000
  14/9   2.1.1.1       9.1.1.1           -1
  14/9   2.1.1.1       9.1.1.4           -1
  14/9   2.1.1.1       9.1.1.4           -1
  14/9   2.1.1.1       2.1.1.5           +1
Initially I used the PageRank algorithm, modified with weights and a
seed of trusted IPs, to try to build a reputation model, but I cannot
get meaningful numbers out of the model for any of the datasets I've
experimented with.
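
For anyone who wants to play with the idea, below is a rough Python
sketch of one possible shape for such a model. It is not the code I
ran. It makes two assumptions of its own: positive reports are treated
as weighted links in a PageRank walk that restarts at the trusted
seeds, and negative reports are applied afterwards as a single step of
distrust scaled by the reporter's trust. The function name
trust_scores and its parameters are invented for the example.

```python
def trust_scores(reports, seeds, damping=0.85, iterations=50):
    """reports: iterable of (reporter_ip, reported_ip, weight) tuples.
    seeds: set of IPs trusted a priori. Returns {ip: score}."""
    reports = list(reports)
    nodes = sorted({ip for r, t, _ in reports for ip in (r, t)} | set(seeds))
    # The random walk restarts only at seed IPs (personalised PageRank).
    restart = {ip: (1.0 / len(seeds) if ip in seeds else 0.0) for ip in nodes}

    endorse = {ip: {} for ip in nodes}    # positive reports, summed per pair
    complain = {ip: {} for ip in nodes}   # negative reports, as magnitudes
    for reporter, reported, weight in reports:
        if weight == 0:
            continue
        bucket = endorse if weight > 0 else complain
        bucket[reporter][reported] = bucket[reporter].get(reported, 0.0) + abs(weight)

    trust = dict(restart)
    for _ in range(iterations):
        nxt = {ip: (1.0 - damping) * restart[ip] for ip in nodes}
        for src in nodes:
            total = sum(endorse[src].values())
            if total == 0:
                # No endorsements: hand this node's mass back to the seeds.
                for ip in nodes:
                    nxt[ip] += damping * trust[src] * restart[ip]
            else:
                for dst, w in endorse[src].items():
                    nxt[dst] += damping * trust[src] * w / total
        trust = nxt

    # One step of distrust: a complaint counts only as much as its reporter.
    score = dict(trust)
    for src in nodes:
        for dst, w in complain[src].items():
            score[dst] -= trust[src] * w
    return score
```

With the sample log above and 2.1.1.1 as the only seed, the 9.* hosts
never gain any trust, so the -1000 report from 9.1.1.1 carries no
weight, while the reports from 2.1.1.1 push the 9.* hosts negative.
That is the intended behaviour on a toy log; it says nothing about how
the scheme behaves on real data.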
So I'm researching whether there are any more suitable models.
There's not a huge amount of literature about this -- about 5 good
papers that are available online.
I'll put something on my blog about this if it comes to anything. For
now I'm about to ban the word "cialis" from my site, which seems like
it will be effective.
Rich.
--
Richard Jones
Red Hat
--
Gllug mailing list - Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug