[Gllug] Recommend an ADSL modem?

Nix nix at esperi.org.uk
Tue Feb 10 01:06:09 UTC 2004


On Wed, 4 Feb 2004, dylan at dylan.me.uk uttered the following:
> On Wednesday 04 February 2004 20:51 pm, Nix wrote:
>> Strange. Maybe there's a way to change the 1 minute lease time, but
>> I don't know what it is.
> 
> I think I was being too optimistic - the lease does drop, just not as 
> often as it did before (about 18 to 48 hours as opposed to 2 hours.)

Oh well :(

> I've found a simple rcnetwork restart often does the trick, so now I'm 
> going to get round to a cron job to if-not-ping-then-rcnetwork-restart

That's roughly what my script does, only it's, er, more excessive. :)

>> (What does packet snooping tell you about your leases?)
> 
> How can I get that info?

I used Ethereal and filtered for DHCP packets: it's easier than picking
the DHCP packets apart by hand. :)

>> I have a re-bring-the-line-up perl script if you're interested (that
>> reboots the router to bring the line up again, 'cos mine seems
>> incapable of bringing its PPP connection up successfully more than
>> once per boot).
> 
> Yes, it's another stupid thing about the device - when I change ISP I'll 
> be changing router, but until then I can't afford it.

There's no need: in half-bridging mode with a suitable script it works
fine. I'm not planning on switching.

(Who needs intelligence in a simple PPPoA <-> Ethernet device anyway? :) )

>                                                       Your script would 
> be great, but I might need a nudge about getting it working...

Here y'are, one bloody hideous script: I had to write it as Zetnet drops
the line every twelve hours, and `always on as long as you're by the
machine to reboot the router every half day' != `always on'. :)

The code is fairly well explained in the comments: the customizable
stuff is in the constants section at the top. Things you will want to
change are DEVICE, GATEWAY (your upstream gateway), IP_ADDRESS (your IP
address; I assume static, you'll need more hacking if it isn't),
DOWNED_IP_ADDRESS (the IP allocated by the router if we try to connect
before the line is fully up), the hostname in REBOOT_PAGE and MAIN_PAGE,
the GATEWAY_HOST, and probably the DHCP_PID_FILE and/or DHCP_STATE_FILE.

Plus flipping DEBUG on stops it daemonizing and spits some info out
about WTF it's doing.

Finally, the password in get_basic_credentials needs changing.

It assumes you have Alexey Kuznetsov's `ip' tool, which makes
`ifconfig' and `route' look like the utter crap they are.

It does a double-exponential-backoff. When the line goes down (such that
several pings to the gateway fail), it reboots the router to bring the
line up again. If we didn't allow long enough for the line to come up,
it backs off the time it waits for the line to come up; if the line
refuses to come up at all, it also backs off the attempts to bring it
up. (It never stops trying, but eventually if the line stays down it
tries only every few hours.)
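
To make the two backoffs concrete, here is a rough standalone sketch of
just the arithmetic, using the same multipliers and limits as the
constants in the script below. The helper name scale_clamped and the
made-up sequence of reboot outcomes are assumptions for illustration;
the real script spreads this logic across its states rather than doing
it in one loop.

```perl
use strict;
use warnings;

# Multiply $value by $factor, keeping the result within [$min, $max].
# (Hypothetical helper; the script below open-codes these comparisons.)
sub scale_clamped
  {
    my ($value, $factor, $min, $max) = @_;
    my $scaled = $value * $factor;
    return $min if $scaled < $min;
    return $max if $scaled > $max;
    return $scaled;
  }

# Made-up run: three failed reboots in a row, then a success.
my ($reboot_time, $down_time) = (60, 60);
my @history;
for my $line_came_up (0, 0, 0, 1)
  {
    if ($line_came_up)
      {
        # Success: trim the reboot wait a little, reset the retry interval.
        $reboot_time = scale_clamped ($reboot_time, 0.9, 40, 300);
        $down_time = 60;
      }
    else
      {
        # Failure: give the router longer next time, and retry less often.
        $reboot_time = scale_clamped ($reboot_time, 1.5, 40, 300);
        $down_time = scale_clamped ($down_time, 1.3, 60, 10800);
      }
    push @history, [$reboot_time, $down_time];
    printf "reboot wait %.1fs, retry interval %.1fs\n",
      $reboot_time, $down_time;
  }
```

The asymmetry is deliberate: a failure lengthens the reboot wait by half,
while a success shortens it by only a tenth, so the wait settles just
above whatever the router actually needs.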


It probably makes sense to use something like the Wondershaper to
prioritize pings quite high: we don't want this to drop the line simply
because it is saturated. (I do that, and to my knowledge the only time
the line's ever dropped because of problems at my end was when the
entire virtual machine this ran in ran out of memory and rebooted.)


One improvement I want to make is autodetection of the DOWNED_IP_ADDRESS
(dammit, if the DHCP client assigns an insane IP address, we should spot
it in the `ip addr' output and zap it ourselves; that output is
easily perl-parseable).
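
A minimal sketch of what that autodetection might look like, assuming we
are handed the text of `ip addr show dev DEVICE' and simply treat every
IPv4 address other than our own as one to zap. The function name
unexpected_addresses and the sample output are made up for illustration
(the sample is hand-written in the general shape `ip addr' prints, not
captured from a real machine), and none of this is in the script below.

```perl
use strict;
use warnings;

# Return every IPv4 address found in `ip addr' output that is not the
# address we expect to hold. inet6 lines are ignored because the
# pattern requires whitespace straight after `inet'.
sub unexpected_addresses
  {
    my ($ip_addr_output, $expected) = @_;
    my @found = $ip_addr_output =~ /^\s*inet\s+(\d+\.\d+\.\d+\.\d+)/mg;
    return grep { $_ ne $expected } @found;
  }

# Hand-written sample, not real output.
my $sample = <<'EOF';
3: adsl: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:11:22:33:44:55 brd ff:ff:ff:ff:ff:ff
    inet 194.247.41.52/32 scope global adsl
    inet 192.168.14.161/24 brd 192.168.14.255 scope global adsl
EOF

print "would delete: $_\n"
  for unexpected_addresses ($sample, '194.247.41.52');
```

In the script proper, each address returned would be fed to the same
`ip addr del' call that currently uses DOWNED_IP_ADDRESS, which would
make that constant unnecessary.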


#!/usr/bin/perl -w
#
# keep-firewall-up --- Check for line drops and do the horrible dancing
#                      necessary to keep the firewall up when such drops
#                      happen.
#

use strict;
use warnings;
use IPC::Run qw (run);
use Net::Ping;
use POSIX 'setsid';
use LWP::UserAgent;
use constant
  {
    # Asleep: every SLEEP_TIME, wake up and ping the GATEWAY on the DEVICE.
    # If that fails FAIL_COUNT times (repinging every FAIL_RETRY seconds),
    # go to the next state.
    STATE_ASLEEP => 0,
    # Reboot the gateway (by pulling down the REBOOT_PAGE from the router)
    # and wait for a time bounded below by REBOOT_TIME_MIN and above by
    # REBOOT_TIME_MAX; a successful connection lowers the reboot time and an
    # unsuccessful one raises it. (First, though, we kill dhclient to
    # ensure that we don't get a lease negotiation while the PPP link is
    # down and the router is unsure of our ADSL-side IP address.)
    # After that, we tear down the IP_ADDRESS and restart dhclient
    # (pointing it at the DEVICE). After that, we check to see if the
    # DEVICE is actually connected by pulling down the MAIN_PAGE from
    # the ROUTER and grepping it for the LINE_UP_INDICATION. If the
    # link was down after all, we zap the erroneous IP address the router
    # assigned us (the DOWNED_IP_ADDRESS), and go to STATE_DOWN.
    # Otherwise, everything should be fine; we return to STATE_ASLEEP.
    STATE_WAIT_REBOOT => 1,
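    # If the router reboots and the line is still dead, we go into STATE_DOWN;
    # we wait for the DOWN_TIME, which starts at one minute, grows by the
    # DOWN_TIME_MULTIPLIER after each wait and maxes out at three hours,
    # and switch to STATE_WAIT_REBOOT. (The reboot time has already been
    # raised by the REBOOT_TIME_UP_MULTIPLIER on the way here.)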
    STATE_DOWN => 2,
    # If anything we can't recover from fails (e.g. the router refuses to
    # reboot) we go to STATE_DEAD, in which we simply ping the router until
    # a human brings us up again.
    STATE_DEAD => 3,

    SLEEP_TIME => 60,
    REBOOT_TIME_MIN => 40,
    REBOOT_TIME_MAX => 300,
    REBOOT_TIME_UP_MULTIPLIER => sub { return $_[0] * 1.5; },
    REBOOT_TIME_DOWN_MULTIPLIER => sub { return $_[0] * 0.9; },
    REBOOT_TIME_INIT => 60,
    DOWN_TIME_INIT => 60,
    DOWN_TIME_MAX => 10800,
    DOWN_TIME_MULTIPLIER => sub { return $_[0] * 1.3; },
    FAIL_COUNT => 2,
    FAIL_RETRY => 5,
    DEVICE => 'adsl',
    GATEWAY => '217.32.14.129',
    IP_ADDRESS => '194.247.41.52',
    DOWNED_IP_ADDRESS => '192.168.14.161',
    REBOOT_PAGE => 'http://gateway:15022/doc/doreboot.htm?force_reboot=1',
    MAIN_PAGE => 'http://gateway:15022/doc/home.htm',
    GATEWAY_HOST => 'gateway:15022',
    LINE_UP_INDICATION => 'IpAddress_Value-1',
    DHCP_PID_FILE => '/var/run/dhclient.pid',
    DHCP_STATE_FILE => '/var/state/dhcp/dhclient.leases',
    DEBUG => 0
  };

# Print a debugging message when DEBUG.
sub debug
  {
    my ($message) = @_;

    print STDERR $message . "\n" if DEBUG;
  }

# Become a daemon.
sub daemonize
  {
    chdir '/' or die "Can't chdir to /: $!";
    open STDIN, '/dev/null' or die "Can't read /dev/null: $!";
    open STDOUT, '>/dev/null' or die "Can't write to /dev/null: $!";
    defined(my $pid = fork) or die "Can't fork: $!";
    exit if $pid;
    setsid or die "Can't start a new session: $!";
    open STDERR, '>&STDOUT' or die "Can't dup stdout: $!";
  }

# An LWP::UserAgent specialization that always uses specific
# credentials.
{
  package FixedAgent;
  @FixedAgent::ISA = qw(LWP::UserAgent);
  use vars qw($adminuser $adminpwd);

  sub new
    {
      my $self = LWP::UserAgent::new(@_);
      $self->agent("autorebooter 1.1");
      $self;
    }

  sub get_basic_credentials
    {
      return ('admin', 'blahblah');
    }
}

daemonize() unless DEBUG;

my $state = STATE_ASLEEP;
my $just_connected = 0;
my $fail_count = 0;
my $pinger = Net::Ping->new ('icmp', 2, 0, DEVICE);
my $sleep_time = SLEEP_TIME;
my $reboot_time = REBOOT_TIME_INIT;
my $down_time = DOWN_TIME_INIT;
my $ua = new FixedAgent;

# Keep checking for ever.
PING: for (;;)
  {
    # State machine.

    # Asleep, waiting for something to go wrong.
    if ($state == STATE_ASLEEP)
      {
        # Every minute, try to ping.
        # If it fails, remember that and retry in a short time; go to
        # STATE_WAIT_REBOOT eventually.
        # If we've only just connected and can't ping through, go to STATE_DOWN.

        debug ('STATE_ASLEEP: Testing liveness');

        if (!$pinger->ping (GATEWAY))
          {
            if ($fail_count > FAIL_COUNT)
              {
                $sleep_time = SLEEP_TIME;
                $fail_count = 0;

                if (!$just_connected)
                  {
                    debug ('STATE_ASLEEP: Line down; rebooting');
                    $state = STATE_WAIT_REBOOT;
                    next PING;
                  }
                else
                  {
                    debug ('STATE_ASLEEP: Line still down; stalling');
                    $state = STATE_DOWN;
                    next PING;
                  }
              }
            else
              {
                debug ('STATE_ASLEEP: Line may be down; retrying');
                $sleep_time = FAIL_RETRY;
                $fail_count++;
              }
          }
        else
          {
            $sleep_time = SLEEP_TIME;
            $down_time = DOWN_TIME_INIT;
            $just_connected = 0;
            $fail_count = 0;
            debug ('STATE_ASLEEP: Still alive, sleeping');
          }
        sleep ($sleep_time);
      }
    # We've lost contact with the GATEWAY. Reboot the gateway
    # and pull esperi's routes up and down correctly.
    elsif ($state == STATE_WAIT_REBOOT)
      {
        # First, kill dhclient if it seems to be running. Before the router
        # gets its line up again it will be advertising the wrong IP address,
        # and we don't want dhclient to pick that IP address up.

        debug ('STATE_WAIT_REBOOT: Killing dhclient and unlinking state file');
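        # The parentheses matter: `-f DHCP_PID_FILE' would test a filehandle
        # of that name, not the file the constant names.
        if (-f DHCP_PID_FILE())
          {
            if (open my $dhcp_pid, '<', DHCP_PID_FILE)
              {
                my $pid = <$dhcp_pid>;
                close $dhcp_pid;
                chomp $pid if defined $pid;
                kill 15, $pid if defined $pid && $pid =~ /^\d+$/;
              }
            unlink DHCP_STATE_FILE;
          }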

        # dhclient is dead and lease negotiation stopped. Tell the router to
        # reboot. (If this times out, go to STATE_DEAD pending the writing of a
        # Robotics::Arm::Control::FlipSwitch module.)

        debug ('STATE_WAIT_REBOOT: Rebooting router');
        my $response = $ua->get (REBOOT_PAGE, Host => GATEWAY_HOST);
        if ($response->is_error())
          {
            $state = STATE_DEAD;
            next PING;
          }

        # Wait for the reboot and line reconnection to finish.

        debug ("STATE_WAIT_REBOOT: Waiting for reboot for $reboot_time seconds");
        sleep ($reboot_time);

        # Tear down our IP address and start dhclient (which will set it up
        # again, yuck.)
        # This *really* doesn't work if we have dynamic IP, but we don't.
        # There's a race window here when we have no default route; this should
        # I guess be fixed but is vastly dominated by the roughly 1.5-minute
        # window when the link is down because the line's been dropped!

        debug ('STATE_WAIT_REBOOT: Deleting old IP address and default route');
        run (['ip','addr','del',IP_ADDRESS,'dev',DEVICE]);
        debug ('STATE_WAIT_REBOOT: Running dhclient');
        run (['dhclient','-q',DEVICE]);

        # If the line wasn't really up, we'll have been given a downright stupid
        # IP address (DOWNED_IP_ADDRESS) and a route to match.
        # In that case, kill that IP address and go to STATE_DOWN to try again.

        # Checking to see that we are up is horrible.
        # We do it by (ugh, yuck, bleah) grepping the router's main page for the
        # LINE_UP_INDICATION, which is by default a label attached to the first
        # line of our WAN address table (and is not present if the line is down,
        # touch wood).
        # This is, admittedly, disgusting, but this router is too primitive to
        # tell us its state in a sensible fashion. (And if it were smarter it
        # would probably use SOAP, yuck. I think I prefer this method to
        # SOAP. ;) )

        # Note that while it would be conceptually neater to check the line up
        # state *before* the router has assigned us a garbage IP address, we
        # can't do that, because before the router's assigned an IP address to a
        # DHCP client in half-bridge mode, it is unresponsive to all but DHCP
        # negotiations. (What a crappy design.)

        # If the line is in fact down we raise the reboot_time and go to
        # STATE_DOWN; otherwise, we drop the reboot_time.

        debug ('STATE_WAIT_REBOOT: Verifying lineupdom');
        $response = $ua->get (MAIN_PAGE);

        my ($reboot_fun, $provisional_reboot_time);

        if (($response->is_error()) ||
            ($response->content !~ LINE_UP_INDICATION))
          {
            debug ('STATE_WAIT_REBOOT: Line is in STATE_DOWN: deleting routes');

            run (['ip','addr','del',DOWNED_IP_ADDRESS,'dev',DEVICE]);

            $reboot_fun = REBOOT_TIME_UP_MULTIPLIER;
            $provisional_reboot_time = &$reboot_fun ($reboot_time);
            if ($provisional_reboot_time <= REBOOT_TIME_MAX)
              {
                $reboot_time = $provisional_reboot_time;
              }
            else
              {
                $reboot_time = REBOOT_TIME_MAX;
              }

            $state = STATE_DOWN;
            next PING;
          }

        $reboot_fun = REBOOT_TIME_DOWN_MULTIPLIER;
        $provisional_reboot_time = &$reboot_fun ($reboot_time);
        if ($provisional_reboot_time >= REBOOT_TIME_MIN)
          {
            $reboot_time = $provisional_reboot_time;
          }
        else
          {
            $reboot_time = REBOOT_TIME_MIN;
          }

        # All done; go to sleep.

        $just_connected = 1;
        $state = STATE_ASLEEP;
      }
    # Well, we reconnected and the line is still dead.
    # Wait for the down_time and raise it by the DOWN_TIME_MULTIPLIER iff
    # that wouldn't put it above the DOWN_TIME_MAX. Then go back to
    # STATE_WAIT_REBOOT.
    elsif ($state == STATE_DOWN)
      {
        debug ("STATE_DOWN: Sleeping for $down_time seconds");
        sleep ($down_time);

        my $down_time_fun = DOWN_TIME_MULTIPLIER;
        my $provisional_down_time = &$down_time_fun ($down_time);
        if ($provisional_down_time <= DOWN_TIME_MAX)
          {
            $down_time = $provisional_down_time;
          }
        else
          {
            $down_time = DOWN_TIME_MAX;
          }

        debug ('STATE_DOWN: Trying to reboot once more');

        $state = STATE_WAIT_REBOOT;
      }
    # Something is wrong: ping until the line comes up (indicating that
    # a human has fixed it) and then go back to normal.
    elsif ($state == STATE_DEAD)
      {
        debug ('STATE_DEAD: Pinging');
        sleep ($sleep_time);

        if ($pinger->ping (GATEWAY))
          {
            debug ('STATE_DEAD: Line up once more. Waking up');
            $state = STATE_ASLEEP;
          }
      }
  }



-- 
`note to the crown prosecution service: Machine guns dont have a
 'stun' setting.' --- mjw
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



