[SWLUG] regexp and sed
Dave Cridland [Home]
dave at cridland.net
Tue Dec 24 14:40:07 UTC 2002
On Tue, 24 Dec 2002 09:41:56 +0000
bascule <asura at theexcession.co.uk> wrote:
> .*\(\..*\)
> could be described as:
> a sequence of any characters followed by one period followed by a sequence of
> any characters - some of which may be periods,
> it looks to me like each filename can be split in three different ways and
> still match this description, so - at last the question - where in the
I think that means "A sequence of between zero and infinity characters, followed by a grouping of a period followed by a sequence of between zero and infinity characters."
Which is not the same.
> regexp, or in sed, is the logic that determines that the part:
> \(\..*\)
> only matches the last period and what follows and not any of the other periods
> and what follows?
The initial .* is greedy, and will "eat" as much of the string as possible, leaving the minimal amount for the group, which should then match only the extension and the preceding period.
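To see the greediness in action, here is a quick sketch. The file name is made up purely for illustration, and the second expression is only there for contrast; it is not something from the original question.

```shell
# Hypothetical file name, used only for illustration.
name='archive.tar.gz'

# Greedy .* takes everything up to the LAST period, so the group is ".gz".
last=$(printf '%s\n' "$name" | sed -e 's/.*\(\..*\)/\1/')

# For contrast: [^.]* cannot cross a period, so the group starts at the FIRST one.
first=$(printf '%s\n' "$name" | sed -e 's/[^.]*\(\..*\)/\1/')

echo "$last"    # prints: .gz
echo "$first"   # prints: .tar.gz
```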
FWIW, I would do:
for DWDI in *; do
    DWDMT=`stat -c '%y' -- "$DWDI"`
    # Pull out the modtime in whatever format stat likes.
    DWDFMT=`date --date "$DWDMT" +'%Y-%m-%d_%H.%M.%S'`
    # Turn it into our format, using date to do the hard work.
    DWDEXT=`echo "$DWDI" | sed -e 's/^.*\.\([^.]*\)$/\1/'`
    # Extract extension
    # - use entire filename if there's no extension.
    mv -- "$DWDI" "$DWDFMT.$DWDEXT"
    # Actually do the rename.
    # (Variables are quoted so names containing spaces survive.)
done
Not because it's any better, but because it strikes me that I might understand it if I looked at it after 6 months.
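The stat-then-date step can be tried on its own. This is a rough sketch assuming GNU date; the timestamp is a made-up sample in the shape GNU stat -c '%y' prints, and TZ=UTC is set only so the result is the same on every machine (the loop itself uses local time).

```shell
# Made-up sample of what GNU stat -c '%y' prints for a file.
DWDMT='2002-12-24 14:40:07.000000000 +0000'

# Let date reformat it; TZ=UTC is an assumption to keep the output stable.
DWDFMT=$(TZ=UTC date --date "$DWDMT" +'%Y-%m-%d_%H.%M.%S')

echo "$DWDFMT"   # prints: 2002-12-24_14.40.07
```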
The sed there is effectively looking for a string which ends with a period followed by some stuff which isn't periods. We then swap the whole string for the stuff which isn't periods, which is hopefully the extension. A file called, say, "viruses" contains no period, so the pattern doesn't match, sed passes the name through unchanged, and the file ends up being treated as if it had an extension of "viruses".
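A small sketch of that behaviour, if you want to poke at it. Both get_ext and has_ext are hypothetical helper names invented for this example; has_ext is just one possible guard for the no-extension case and is not part of the loop above.

```shell
# get_ext: hypothetical wrapper around the sed expression from the loop.
get_ext() {
    printf '%s\n' "$1" | sed -e 's/^.*\.\([^.]*\)$/\1/'
}

# has_ext: hypothetical guard, true only if the name contains a period.
has_ext() {
    case "$1" in
        *.*) return 0 ;;
        *)   return 1 ;;
    esac
}

get_ext 'photo.jpg'        # prints: jpg
get_ext 'archive.tar.gz'   # prints: gz
get_ext 'viruses'          # prints: viruses (no match, so sed changes nothing)

if has_ext 'viruses'; then echo "has extension"; else echo "no extension"; fi
```

The last line prints "no extension", which is the case the loop would need to test for if you didn't want the name repeated as the extension.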
Dave.