[Nottingham] AWK script

stripes stripes at roppa.demon.co.uk
Fri Mar 25 09:35:30 UTC 2011

Hi Cam,

> Ah I see, that's down to the format I suppose. I think if I was doing
> it this way (with awk) and passing in the name anyway, I would
> probably not bother printing the name found. In fact you could have
> two scripts, one that prints a list of names only and another that
> prints all the titles for one name...

I have two separate scripts; I was just trying to combine them into a
single script, basically to learn more about AWK.

To extract names I have this

awk 'BEGIN{RS="";FS="\n"}
{for(i=1;i<=NF;i++){if($i ~ /Author-Name:/) print $i}}' wpaper.rdf |
sed 's/Author-Name: //g' | sort -u >

This finds the Author-Name: lines in each record, strips the
"Author-Name: " label and its trailing space, sorts the names into
alphabetical order, removes duplicates and dumps it all in a text file.
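
As a rough sketch, the sed step could probably be folded into awk
itself with sub(). The sample records below are made up for
illustration (they are not taken from wpaper.rdf) and only assume one
"Key: value" pair per line with a blank line between records:

```shell
# Made-up sample records: one "Key: value" per line, blank line between records.
sample=$(mktemp)
printf '%s\n' \
  'Author-Name: Harry R Clarke' \
  'Title: First paper' \
  '' \
  'Author-Name: Ann Other' \
  'Author-Name: Harry R Clarke' \
  'Title: Second paper' > "$sample"

# RS="" makes each blank-line-separated block one record and FS="\n"
# makes each line one field. sub() strips the label, so no sed is needed.
names=$(awk 'BEGIN { RS = ""; FS = "\n" }
{
    for (i = 1; i <= NF; i++)
        if ($i ~ /^Author-Name: /) {
            sub(/^Author-Name: /, "", $i)
            print $i
        }
}' "$sample" | sort -u)

printf '%s\n' "$names"
rm -f "$sample"
```

Setting RS and FS in a BEGIN block means they are already in effect
when the first record is read.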

To find the titles I have this

awk 'BEGIN{RS="";FS="\n"}
$0 ~ name {for(i=1;i<=NF;i++){if($i ~ /Title:/){print $i}}}' \
name=Harry.R.Clarke wpaper.rdf | sed 's/Title: //g' > Harry.R.Clarke.txt

This finds the records that mention Harry.R.Clarke, extracts the Title:
line from each one, strips the "Title: " label and writes the result
to a file.
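
One thing to watch: name=Harry.R.Clarke is used as a regular
expression, so each dot matches any character and the pattern can match
anywhere in the record. A sketch that compares the Author-Name value as
a plain string instead, again on made-up sample data (the author
"Harry R Clarkeson" is invented to show the difference):

```shell
# Made-up sample records; "Harry R Clarkeson" would match the regex
# Harry.R.Clarke but is a different author.
sample=$(mktemp)
printf '%s\n' \
  'Author-Name: Harry R Clarke' \
  'Title: First paper' \
  '' \
  'Author-Name: Harry R Clarkeson' \
  'Title: Not his paper' \
  '' \
  'Author-Name: Ann Other' \
  'Author-Name: Harry R Clarke' \
  'Title: Second paper' > "$sample"

# -v passes the name in as a plain string; == compares whole lines,
# so only exact Author-Name matches select the record.
titles=$(awk -v name='Harry R Clarke' 'BEGIN { RS = ""; FS = "\n" }
{
    wanted = 0
    for (i = 1; i <= NF; i++)
        if ($i == ("Author-Name: " name)) wanted = 1
    if (wanted)
        for (i = 1; i <= NF; i++)
            if ($i ~ /^Title: /) print substr($i, 8)   # skip the 7-char "Title: "
}' "$sample")

printf '%s\n' "$titles"
rm -f "$sample"
```

Using a separate flag (wanted) rather than nesting the two loops avoids
reusing i for both of them.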

I wonder if I can use the sorted-names.txt as input to the second
script to extract every title for every author.
That will give me something to puzzle about for the weekend.
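
A first stab at it, as a sketch only: awk can read the names file
first and the records second in one run. The two files below are
made-up stand-ins for sorted-names.txt and wpaper.rdf, and the array
names known[] and who[] are just ones I picked for illustration:

```shell
work=$(mktemp -d)
# Made-up stand-ins for sorted-names.txt and wpaper.rdf.
printf '%s\n' 'Ann Other' 'Harry R Clarke' > "$work/sorted-names.txt"
printf '%s\n' \
  'Author-Name: Harry R Clarke' \
  'Title: First paper' \
  '' \
  'Author-Name: Ann Other' \
  'Author-Name: Harry R Clarke' \
  'Title: Second paper' > "$work/wpaper.rdf"

# First file: one name per line (default RS), remembered in known[].
# The RS= and FS='\n' operands switch to one-record-per-block mode
# just before the second file is opened.
by_author=$(awk '
NR == FNR { known[$0] = 1; next }
{
    n = 0
    for (i = 1; i <= NF; i++)
        if ($i ~ /^Author-Name: /) {
            a = substr($i, 14)              # skip the 13-char "Author-Name: "
            if (a in known) who[++n] = a
        }
    for (i = 1; i <= NF; i++)
        if ($i ~ /^Title: /)
            for (j = 1; j <= n; j++)
                print who[j] "\t" substr($i, 8)
}' "$work/sorted-names.txt" RS= FS='\n' "$work/wpaper.rdf" | LC_ALL=C sort)

printf '%s\n' "$by_author"
rm -rf "$work"
```

This prints one "name<TAB>title" line per author and title pair, sorted
by name, so a paper with two listed authors shows up under both.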

