[Gllug] Stripping whitespace in bash script

Sat Aug 12 02:13:06 UTC 2006

Alex Sayle writes:

>sed is your friend.

A statement with which I wholeheartedly agree. Eventually, I'll
even get around to putting the slides from my recent sed talk up
on the web...

However, for something this simple, you don't need to spawn a new
process. Just strip the trailing spaces off directly in the shell:

	leto:~% word="some trailing spaces     "
	leto:~% echo ":$word:"
	:some trailing spaces     :
	leto:~% echo ":${word/%  */}:"
	:some trailing spaces:

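However, this trick is cruder than it looks. The pattern is two
spaces followed by anything, anchored to the end of the string,
so it falls down in two cases:

1. When the string ends in only a single space: nothing matches,
   and the space stays.
2. When the string contains multiple consecutive spaces that aren't
   at the end of the string: everything from the first such run
   onwards is removed.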
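
To see both failure modes for yourself, here's a small sketch. The
sample strings are made up for illustration, and the substitution is
written with a single slash (${var/%pattern/}), which is the
end-anchored form in current bash:

```shell
# Sketch only: reproduce the two failure cases of the
# "two spaces followed by anything" substitution.
one="one trailing space "        # ends in a single space
mid="gap  in the middle   "      # double space before the end

printf ':%s:\n' "${one/%  */}"   # prints :one trailing space :
printf ':%s:\n' "${mid/%  */}"   # prints :gap:
```

In the first case the pattern never matches, so the string comes back
untouched. In the second it matches too early and takes most of the
string with it.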

To get around these limitations, we need to be a bit more clever.
Fortunately, bash (3.0 and later) supports regular expressions via
the =~ operator of [[ ]], and leaves the text that matched in the
BASH_REMATCH array variable. Keep the regexp in a variable and
expand it unquoted, since bash 3.2 and later match a quoted
right-hand side as a literal string:

	leto:~% word="some trailing    spaces     "
	leto:~% echo ":${word/%  */}:"
	:some trailing:
	leto:~% re='.*[^ ]'
	leto:~% [[ "$word" =~ $re ]]
	leto:~% echo ":$BASH_REMATCH:"
	:some trailing    spaces:

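Admittedly, it's starting to look a bit like Perl line noise at
this point, but it's actually fairly straightforward: the regexp
greedily matches everything up to and including the last non-space
character, which leaves the trailing spaces out of the match.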
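
If you need this more than once, the regexp match can be wrapped in
a function. This is only a sketch: the name rtrim is made up for the
example, and the regexp is kept in a variable so that it's treated
as a regexp however a given bash version handles quoting on the
right-hand side of =~:

```shell
# Hypothetical helper; the name rtrim is not from the thread.
# Prints its first argument with trailing spaces removed.
rtrim() {
    local re='.*[^ ]'    # greedy: everything up to the last non-space
    if [[ $1 =~ $re ]]; then
        printf '%s' "${BASH_REMATCH[0]}"
    fi                   # no match (empty or all spaces): print nothing
}

word="some trailing    spaces     "
echo ":$(rtrim "$word"):"    # prints :some trailing    spaces:
```

An empty or all-space argument doesn't match the regexp at all, so
the function prints nothing in that case.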

>$echo "word   " | sed -e 's/ *$//g' | sed -e 's/\(.*\)/[\1]/'
>[word]

Even if you wanted to use sed to do this, there's no need to invoke
it twice. Just pass two expressions to a single instance of sed:

	leto:~% echo "word   " | sed -e 's/ *$//g' -e 's/\(.*\)/[\1]/'
	[word]

But the two expressions are unnecessary anyway. You could have done
it with just one:

	leto:~% echo "word   " | sed 's/\(.*[^ ]\) *$/[\1]/'
	[word]
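
For a slightly fuller picture, here's a sketch that runs the
single-expression form over several lines at once. The function name
bracket and the sample input are made up for illustration:

```shell
# Sketch: strip trailing spaces and wrap each line in brackets
# with a single sed expression.
bracket() {
    sed 's/\(.*[^ ]\) *$/[\1]/'
}

printf '%s\n' 'word   ' 'two words ' 'none' | bracket
# prints:
# [word]
# [two words]
# [none]
```

One thing to be aware of: the expression needs at least one non-space
character to match, so an empty or all-space line passes through
unchanged, without any brackets.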

Tet