[Gllug] Simple Bash Script

Philip Hands phil at hands.com
Tue Jun 3 10:04:38 UTC 2008


On Tue, Jun 03, 2008 at 08:49:45AM +0100, John Edwards wrote:
> On Mon, Jun 02, 2008 at 10:41:16AM +0100, william pink wrote:
> > Hello All,
> > 
> > I have a script that checks for system load then sends a email if it is over
> > a certain threshold ( Its a cut down version from the one on Nixcraft )
> > Here is the script
> > 
> > 
> > #!/bin/bash
> > 
> > # Script to check system load and remove from cluster if higher than 6.0
> > 
> > TEMPFILE="$(mktemp)"
> > NOTIFY="6.0"
> > HOST=`hostname`

I'd suggest replacing all ``s with $() for readability (and sanity if
you ever have to nest them):

  HOST=$(hostname)

Although, given that it only gets used once, and not every time through
the script, I'd generally not bother, and instead replace the $HOST with
$(hostname) in the mail command

> > RCLUSTER=`/usr/local/bin/in-cluster no`
> > #load at 1 minute
> > TRUE="1"
> > LOAD=`uptime | sed -e "s/.*load average: \(.*\...\), \(.*\...\),
> > \(.*\...\)/\1/" -e "s/ //g"`

That's mostly redundant:

  LOAD=$(uptime | sed -e "s/.*load average: \([0-9.]*\),.*/\1/")

> > 
> > #input in tempfile
> > echo "Load average to high removed from cluster, Load currently at $LOAD
> > allowed limit $NOTIFY." > $TEMPFILE

I agree with the anti-TMPFILE comments, see below -- also, that should
be "too"

> > 
> > # Compare results to benchmark
> > RESULT=$(echo "$LOAD > $NOTIFY" | bc)

If it's OK to assume bash, then we can use bash's built in ((maths))
ability.  Of course, bash only deals with integers, but that can either
be dealt with by using the integer portion of the load:

  INTLOAD=$(uptime | sed -e "s/.*load average: \([0-9]*\)\..*/\1/")

or use the 100/ths as units:

  LOADx100=$(uptime | sed -e "s/.*load average: \([0-9.]*\),.*/\1/; s/\.//")

> > 
> > # IF so do your thing
> > 
> > if [ "$RESULT == $TRUE" ]; then

then instead of that, use:

  if (($INTLOAD > $INTNOTIFY)); then

or

  if (($LOADx100 > $NOTIFY * 100)); then

or, if you're wanting to preserve the detail, then you could compare
just the integer portion of the LOAD thus:

  
  LOAD=$(uptime | sed -e "s/.*load average: \([0-9.]*\),.*/\1/")
  INTLOAD=$(echo $LOAD | sed -e 's/\..*//')
  ...
  if (($INTLOAD > $INTNOTIFY)); then
  ...
     mail ... $LOAD

or similarly, strip out the decimal point and use the LOADx100 approach
(which of course is vulnerable to uptime's number of decimal places
changing at some point)


> >         $RCLUSTER

Is that supposed to be a delayed invocation of:

  /usr/local/bin/in-cluster no

If so, I don't think it is -- that looks to me like the
/usr/local/bin/in-cluster no will have run at the time of assignment,
and this will be trying to execute the STDOUT as though it was a
command which seems unlikely to be what you were after.

I'd have thought you should just replace the $RCLUSTER with

  /usr/local/bin/in-cluster no

assuming that that's the command that's meant to take a node out of the
cluster.

> >         mail -s "$HOST has been removed from the cluster" "$EMAIL" <
> > $TEMPFILE
> > fi
> > 
> > rm -f $TEMPFILE

If you ever do need to create tempfiles like this (which in this case
you don't but ...), I'd suggest setting up a trap before you create the
file:

  trap "rm -f $TEMPFILE" EXIT

so that if the script fails somewhere in the middle, you won't end up
with an infinite number of tempfiles.

> 
> Hi
> 
> I see that others have helped you with the if statements and other
> things, but I would like to ask about if the TEMPFILE is really
> neccesary.
> 
> You should be able to send an email by echo'ing the contents to
> mail (or mutt), eg something like:
> 
> echo "Load average to high removed from cluster
> Load currently at $LOAD
> allowed limit $NOTIFY" | mail -s "$HOST has been removed from the cluster" "$EMAIL"
> 
> 
> Getting rid of tempfiles is a little personnal mission of mine after
> seeing thousands of them generated on some systems. Not only are they
> messy if the script ends halfways through, but can also be a security
> risk if the name is not suitably random or the contents are not
> sanitised.

Quite right, although for multi-line stuff like this, I tend to think
that Here Documents are less fragile, so would suggest this:

mail -s "$HOST has been removed from the cluster" "$EMAIL" <<!EOF!
Load average too high removed from cluster
Load currently at $LOAD
allowed limit $NOTIFY
!EOF!


The string after << can be anything, and indicates the string that
terminates the section that will be fed to the command as STDIN

I settled on using ``!EOF!'' for that years ago, and have never found
the need to vary that, as it's not something that crops up in normal
use.

If you're into indentation for readbility, you can prefix the terminator
with a -, and then the input will have leading TABs removed:

mail -s "$HOST has been removed from the cluster" "$EMAIL" <<-!EOF!
		Load average too high removed from cluster
		Load currently at $LOAD
		allowed limit $NOTIFY
	!EOF!

> Apart from that the other thing that might be cleaned up is the maths
> evaluations. Rather than defining $TRUE and using bc to generate
> $RESULT, I wonder if it's possible to use something like:
> 
> if [ $LOAD -gt $NOTIFY ]

That won't work with the decimal numbers in the original script, but
would be fine with either the integer or x100 approaches above, and is
probably more portable than the ((...)) approach.

Cheers, Phil.
-- 
|)|  Philip Hands [+44 (0)20 8530 9560]    http://www.hands.com/
|-|  HANDS.COM Ltd.                    http://www.uk.debian.org/
|(|  10 Onslow Gardens, South Woodford, London  E18 1NE  ENGLAND
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 196 bytes
Desc: Digital signature
URL: <http://mailman.lug.org.uk/pipermail/gllug/attachments/20080603/7f300ecc/attachment.pgp>
-------------- next part --------------
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug


More information about the GLLUG mailing list