[Gllug] Awk Sed

Wulf Forrester-Barker wulf.f-b at uhl.nhs.uk
Fri Dec 7 10:08:56 UTC 2001


Harry <postituk at yahoo.com> was asking:

> If I wanted to change the same pattern in multiple files
> held in a directory do I use sed or awk?  I hate going
> into each file and using vi. I want the text to be modified
> in the file and not redirected to another place.

Over the past few months, I've been working on getting all of the
clinical guidelines we use at this hospital into a common XML format.
Once or twice I've updated the format (for example, a few weeks ago, I
added a review period as a compulsory attribute to one of the elements,
meaning that we can automatically show all guidelines due for review),
and that has necessitated performing simple changes on a large number of
files.

For that purpose I've developed the following system.

1. Copy the guidelines from the NT box where they are developed to the
Linux machine where I do my magic...

2. Edit the file called 'sedscript' in the parent directory of the one
the files were copied into (no particular reason for the name, but it's
the one you'll see below). This contains the sed commands I want to run
- the last one was just:

s|<dates>|<dates reviewperiod="12">|

but could have been a whole series of transformations.
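As a rough sketch of what a longer sedscript might look like, the
following applies three commands to a one-line sample. Only the first
command is the one shown above; the <author>/<owner> rename and the
sample line are invented purely for illustration.

```shell
#!/bin/bash
# Sketch: a sedscript with more than one command, tried on sample input.
# The <author>/<owner> rename and the sample line are hypothetical.

workdir=$(mktemp -d)

# One command per line; sed applies them in order to every input line.
cat > "$workdir/sedscript" <<'EOF'
s|<dates>|<dates reviewperiod="12">|
s|<author>|<owner>|g
s|</author>|</owner>|g
EOF

sample='<dates><author>AB</author></dates>'
result=$(printf '%s\n' "$sample" | sed -f "$workdir/sedscript")
echo "$result"

rm -r "$workdir"
```

Using | as the delimiter means the / in closing tags needs no escaping.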

3. Run the command 'change' ... which activates the following shell
script:

#!/bin/bash

# Write a transformed copy of each file, then show what changed.
for i in *.xml
do
  sed -f ../sedscript "$i" > "$i.tmp"
  diff "$i" "$i.tmp"
done

This applies the commands in sedscript to each file, writes the result
to a temporary file alongside it and runs diff on the pair so that I can
make sure the alterations were what I intended (i.e. I've built in a
safety net).
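With a few hundred files the diff output can scroll past quickly. A
possible variant of the preview step, sketched below, also prints a
one-line marker for each file that would change. The function name
preview_changes, the CHANGED: marker and the two demo files are
assumptions for the example, not part of the scripts described here.

```shell
#!/bin/bash
# Sketch: a preview step that also reports which files would change.
# preview_changes is a hypothetical name; its argument is the sed script.

preview_changes() {
  local script=$1 f
  for f in *.xml
  do
    [ -e "$f" ] || continue          # no .xml files: the glob stays literal
    sed -f "$script" "$f" > "$f.tmp"
    if ! diff -u "$f" "$f.tmp"       # diff exits non-zero when the files differ
    then
      echo "CHANGED: $f"
    fi
  done
}

# Demo in a scratch directory with two made-up files.
demo=$(mktemp -d)
printf '<dates>\n' > "$demo/a.xml"
printf '<title>\n' > "$demo/b.xml"
printf 's|<dates>|<dates reviewperiod="12">|\n' > "$demo/sedscript"
report=$(cd "$demo" && preview_changes ./sedscript)
echo "$report"
rm -r "$demo"
```

Piping the output through grep '^CHANGED:' gives a quick list of the
affected files without the diff detail.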

4. If I'm confident that the changes were correct, I'll then run
another script which I call 'commit':

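#!/bin/bash

# Remove the originals, then strip the .tmp suffix from the checked copies.
# (This is the util-linux 'rename FROM TO FILES' syntax.)
rm *.xml
rename .xml.tmp .xml *.tmp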

I can then copy the updated files back to their home location. It's NOT
an 'edit in file, one step only' solution BUT it does mean that I can
check the effect of my script before making the change permanent. There
is a storage overhead of temporarily requiring double the space, but at
the moment all 259 guidelines only fill about 1.3MB, so that's not an
issue on the box I'm using.
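Some distributions ship a Perl-based rename that takes different
arguments, so the commit step may need adapting. One possible
alternative, sketched here, moves each temporary file over its original
with mv and skips any temporary file that is missing or empty, so an
original is never deleted before its replacement is in place. The
function name commit_changes and the demo file are assumptions for the
example.

```shell
#!/bin/bash
# Sketch: a commit step that replaces each original only if its checked
# copy exists and is non-empty. commit_changes is a hypothetical name.

commit_changes() {
  local t
  for t in *.xml.tmp
  do
    [ -s "$t" ] || continue     # skip missing or empty temporary files
    mv -- "$t" "${t%.tmp}"      # foo.xml.tmp replaces foo.xml
  done
}

# Demo in a scratch directory with one made-up guideline.
demo=$(mktemp -d)
printf '<dates>\n' > "$demo/a.xml"
printf '<dates reviewperiod="12">\n' > "$demo/a.xml.tmp"
(cd "$demo" && commit_changes)
committed=$(cat "$demo/a.xml")
leftover=$(ls "$demo")
rm -r "$demo"
echo "$committed"
```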

The two resources I've found most useful in getting to grips with sed
are:

The sed FAQ (http://www.dbnet.ece.ntua.gr/~george/sed/sedfaq.html) 

sed & awk (2nd edition), Dale Dougherty & Arnold Robbins, O'Reilly
(1997), ISBN: 1-56592-225-5

I'm sure that perl is able to do all of that with bells on, but I've
found sed and awk to be handy tools, and the experience will doubtless
help when I finally get round to embarking on perl ;-)

Wulf



wulf.f-b at uhl.nhs.uk 



-- 
Gllug mailing list  -  Gllug at linux.co.uk
http://list.ftech.net/mailman/listinfo/gllug



