[Nottingham] AWK script
stripes
stripes at roppa.demon.co.uk
Fri Mar 25 09:35:30 UTC 2011
Hi Cam,
> Ah I see, that's down to the format I suppose. I think if I was doing
> it this way (with awk) and passing in the name anyway, I would
> probably not bother printing the name found. In fact you could have
> two scripts, one that prints a list of names only and another that
> prints all the titles for one name...
I have two separate scripts; I was just trying to combine them into a
single script, basically to learn more about AWK.
To extract the names I have this:
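# RS="" puts awk in paragraph mode (one record per blank-line-separated block);
# set it in BEGIN so it also applies to the first record. Use i<=NF to reach the last line.
awk 'BEGIN { RS = ""; FS = "\n" }
     { for (i = 1; i <= NF; i++) if ($i ~ /Author-Name:/) print $i }' wpaper.rdf |
  sed 's/Author-Name: //g' | sort -u > sorted-names.txt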
This finds the Author-Name: lines in each record, strips the
"Author-Name: " prefix (including its trailing space), sorts the names
into alphabetical order, removes duplicates and dumps it all in a text
file.
To find the titles I have this:
# name is used as a regex, so each "." matches any single character.
awk 'BEGIN { RS = ""; FS = "\n" }
     $0 ~ name { for (i = 1; i <= NF; i++) if ($i ~ /Title:/) print $i }' \
    name=Harry.R.Clarke wpaper.rdf |
  sed 's/Title: //g' > Harry.R.Clarke.txt
This finds every record that mentions Harry.R.Clarke, extracts the
Title: line from that record, strips the "Title: " prefix and writes
the result to a file.
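Since the aim was a single script, here is a rough sketch of how the two
might be folded together. It is only a sketch: rdf_query is a made-up
function name, and it assumes the file really is blank-line-separated
records with one "Key: value" pair per line. The sed steps are replaced
by awk's own sub(), which is a swap on my part rather than anything the
original scripts did.

```shell
# Hypothetical helper (name invented for illustration).
# Usage: rdf_query FILE        -> sorted, de-duplicated author names
#        rdf_query FILE NAME   -> titles of records mentioning NAME (a regex)
rdf_query() {
    awk -v name="$2" '
        BEGIN { RS = ""; FS = "\n" }

        # No name given: print every author name without its prefix.
        name == "" {
            for (i = 1; i <= NF; i++)
                if ($i ~ /Author-Name:/) {
                    line = $i
                    sub(/Author-Name: /, "", line)
                    print line | "sort -u"   # awk pipes its output through sort
                }
            next
        }

        # Name given: print the titles of records that mention it.
        $0 ~ name {
            for (i = 1; i <= NF; i++)
                if ($i ~ /Title:/) {
                    line = $i
                    sub(/Title: /, "", line)
                    print line
                }
        }
    ' "$1"
}
```

With that defined, "rdf_query wpaper.rdf > sorted-names.txt" would stand
in for the first script and "rdf_query wpaper.rdf Harry.R.Clarke" for
the second.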
I wonder if I can use the sorted-names.txt as input to the second
script to extract every title for every author.
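One possible starting point, as a sketch only: read the names file in
BEGIN with getline, then collect the titles per author in a single pass
over the records. The function name titles_by_author and the indented
output layout are my own inventions, and it matches names exactly (as
they appear in sorted-names.txt) rather than as a regex like the second
script does.

```shell
# Hypothetical helper (name invented for illustration).
# Usage: titles_by_author sorted-names.txt wpaper.rdf
titles_by_author() {
    awk -v namesfile="$1" '
        BEGIN {
            # Read the names one per line (default RS) before switching
            # to paragraph mode for the main input.
            while ((getline line < namesfile) > 0)
                if (line != "") { order[++n] = line; wanted[line] = 1 }
            close(namesfile)
            RS = ""; FS = "\n"
        }
        {
            # Gather the titles of this record...
            titles = ""
            for (i = 1; i <= NF; i++)
                if ($i ~ /Title:/) {
                    t = $i; sub(/Title: /, "", t)
                    titles = titles "  " t "\n"
                }
            # ...and credit them to every listed author of the record.
            for (i = 1; i <= NF; i++)
                if ($i ~ /Author-Name:/) {
                    a = $i; sub(/Author-Name: /, "", a)
                    if (a in wanted) found[a] = found[a] titles
                }
        }
        END {
            # Print each author in names-file order, followed by their titles.
            for (k = 1; k <= n; k++)
                printf "%s\n%s", order[k], found[order[k]]
        }
    ' "$2"
}
```

A plain shell loop over sorted-names.txt that runs the second script
once per name would also work; the single pass just avoids re-reading
wpaper.rdf for every author.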
That will give me something to puzzle about for the weekend.
Stripes.