[GLLUG] Grep question

Fred Youhanaie fly at anydata.co.uk
Thu Oct 27 14:33:47 UTC 2022


Hi John

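In your case it's the hyphen that's special: inside a bracket expression it is used as a range indicator, e.g. [a-z] means [abcdef...z], unless it's at the end of the bracket expression. So in your first example, [.-— ] contains a range running from "." to "—" rather than three separate punctuation characters.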

To quote from the grep man page

"...to include a literal - place it last."

That said, I've always put the "-" at the start, which also makes it literal and is what you have done in your second example.
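
A small sketch of the difference, using made-up sample lines and plain ASCII characters rather than your OCR text. I've assumed LC_ALL=C so that the range follows byte order; in other locales the ordering of a range can differ.

```shell
# Sample lines (invented for illustration): one ends in a hyphen,
# one in a full stop, one in a letter.
sample=$(printf '%s\n' 're-' 'end.' 'item')

# Hyphen in the middle: ".-z" is a range from "." to "z".
# In byte order "-" sorts just before ".", so it is NOT in the range.
range_hits=$(printf '%s\n' "$sample" | LC_ALL=C grep '[.-z]$')

# Hyphen first: "-", "." and "z" are three literal characters.
first_hits=$(printf '%s\n' "$sample" | LC_ALL=C grep '[-.z]$')

# Hyphen last: the same three literals, as the man page suggests.
last_hits=$(printf '%s\n' "$sample" | LC_ALL=C grep '[.z-]$')

printf 'range:\n%s\nfirst:\n%s\nlast:\n%s\n' \
    "$range_hits" "$first_hits" "$last_hits"
```

With the range, the line ending in "-" is missed and the line ending in a letter is picked up instead; with the hyphen first or last, only the lines ending in "-" or "." are reported.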

Cheers,
f.


On 27/10/2022 15:02, John Levin via GLLUG wrote:
> Dear list,
> 
> In cleaning up mountains of OCR'd text, I've found Grep doing something I don't understand.
> 
> The aim is to locate lines ending with certain punctuation marks (-—.) and spaces. But depending on the order of those punctuation marks, I get different results. With the full stop listed first, I get two results, one of which doesn't fit the criteria; with the stop third I get 5 lines correctly matching the criteria (and I presume, all the lines that do match).
> 
> johnl at Hasek:~/github/statutes$ grep ' [.-— ]\{3,\}$' W*/mon*.txt
> and every of them are and is hereby obliged to accept, re- ...
> 1. s. </.~|
> 
> johnl at Hasek:~/github/statutes$ grep ' [-—. ]\{3,\}$' W*/mon*.txt
> and every of them are and is hereby obliged to accept, re- ...
> 'fliqitors autj licences, — Aorb’s -----
> IPfipficians, -----
> II — -- — -.... - -- - - - —
> shall be ~ - - —
> 
> Is there something about the order of characters in regex square brackets? Does the stop have a special meaning when given first?
> 
> Thanks in advance,
> 
> John
> 


