[Sussex] Updated Grep, Sed and RegExp links from August moot Re-Updated

Fay Zee sussex at eglug.org.uk
Fri Sep 16 00:18:02 UTC 2011


Thanks to Paul for getting back to me with an observation he made
concerning the file formats on offer from the Internet Archive.

I had luckily avoided this issue myself, but I have updated the
instructions (marked in red) on my analysis page so others won't fall
into the same pit.

http://www.eglug.org.uk/bash_and_regexp_example_analysis.html

[Update 16/09/2011] Please note: Format options include "PDF", "PDF
with text" and "Full Text". I chose "PDF", then selected all the text
and copied and pasted it into a text file. This gives a different
result to the same operation performed on "PDF with text". Please be
aware that the latter option pastes portions of sentences out of
order. The other point to note is that choosing "Full Text" makes it
harder to craft a command to identify and remove page numbers.
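
To illustrate the page-number point, here is a minimal sketch of the kind
of command I mean. It assumes each page number sits alone on its own line
(digits only, with optional surrounding whitespace), which may not hold
for every format option. The function name and the file names are
placeholders of my own, not taken from the analysis page.

```shell
#!/bin/sh
# Hypothetical helper: delete lines that contain nothing but a number.
# Assumption: such lines are page numbers. Numbers inside a sentence
# are left alone because the pattern is anchored at both ends.
strip_page_numbers() {
    sed -E '/^[[:space:]]*[0-9]+[[:space:]]*$/d'
}

# Preview the candidate lines first (book.txt is a placeholder name):
#   grep -nE '^[[:space:]]*[0-9]+[[:space:]]*$' book.txt
# Then write a cleaned copy:
#   strip_page_numbers < book.txt > book_clean.txt
```

Checking the grep preview before deleting anything is worthwhile, since a
genuine line that happens to be a bare number would also match.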

Best Regards,
Fay
East Grinstead Linux User Group
www.eglug.org.uk

