[GLLUG] Grep question
Fred Youhanaie
fly at anydata.co.uk
Thu Oct 27 14:33:47 UTC 2022
Hi John
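In your case it's the hyphen that's special: inside a bracket expression it is used as the range indicator, e.g. [a-z] means [abcdef...z], unless it's at the end of the bracket expression. So in your first example, [.-— ] contains the range of characters from "." to "—" rather than three separate punctuation marks, and the hyphen itself is not in the set.
To quote from the grep man page: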
"...to include a literal - place it last."
That said, I've always put the "-" at the start, which also makes it literal, and which is what you have done in your second example.
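
Here is a minimal sketch of the difference, using made-up sample lines rather than anything from your files. It sticks to ASCII characters and sets LC_ALL=C so that the range is evaluated by byte value; in other locales the exact contents of a range can differ.

```shell
# Hypothetical sample lines, not taken from the original OCR text.
sample='ends with dashes ---
ends with dots ...
ends with symbols </~'

# Hyphen in the middle: ".-~" is a range running from "." to "~".
# The hyphen itself (which sorts before ".") is NOT part of the set.
range_matches=$(printf '%s\n' "$sample" | LC_ALL=C grep ' [.-~ ]\{3,\}$')

# Hyphen first: "-" is taken literally, so the set is just "-", "." and space.
literal_matches=$(printf '%s\n' "$sample" | LC_ALL=C grep ' [-. ]\{3,\}$')

printf 'hyphen in the middle:\n%s\n\nhyphen first:\n%s\n' \
    "$range_matches" "$literal_matches"
```

With the hyphen in the middle, the dots and symbols lines should match but the dashes line should not; with the hyphen first, the dashes and dots lines should match and the symbols line should not.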
Cheers,
f.
On 27/10/2022 15:02, John Levin via GLLUG wrote:
> Dear list,
>
> In cleaning up mountains of OCR'd text, I've found grep doing something I don't understand.
>
> The aim is to locate lines ending with certain punctuation marks (-—.) and spaces. But depending on the order of those punctuation marks, I get different results. With the full stop listed first, I get two results, one of which doesn't fit the criteria; with the stop third, I get 5 lines correctly matching the criteria (and, I presume, all the lines that do match).
>
> johnl at Hasek:~/github/statutes$ grep ' [.-— ]\{3,\}$' W*/mon*.txt
> and every of them are and is hereby obliged to accept, re- ...
> 1. s. </.~|
>
> johnl at Hasek:~/github/statutes$ grep ' [-—. ]\{3,\}$' W*/mon*.txt
> and every of them are and is hereby obliged to accept, re- ...
> 'fliqitors autj licences, — Aorb’s -----
> IPfipficians, -----
> II — -- — -.... - -- - - - —
> shall be ~ - - —
>
> Is there something about the order of characters in regex square brackets? Does the stop have a special meaning when given first?
>
> Thanks in advance,
>
> John
>