[Nottingham] Slow bash-ings for file line search

Martin martin at ml1.co.uk
Tue Apr 5 22:35:02 BST 2005


Martin Garton wrote:
> 
> What's wrong with sed or awk for this? (or even perl)
> 
> Are you doing something more complicated than e.g. sed can do once the
> lines are identified?

Yes, just slightly.

This started as a very quick hack to add anchor tags to some MS-generated 
HTML mess, where each "name=" anchor has to be inserted a few lines 
_before_ the change of day that is only noticed later in the text.


Thus far:

#!/bin/bash
# Insert <a name="..."> anchors into MS-generated HTML, one per change of day.
# The HTML file to process is given as $1.

exec 9<"$1"                              # fd 9: the HTML, read line by line

s1='<p class="BookNewsEventNames">'      # marks an event-name line
s2='<p class="MsoNormal">Date:'          # marks the Date: line further down
id=0                                     # running anchor number
ff='/tmp/mlfifo'

mkfifo "$ff"

# Pre-extract all the Date: lines into a fifo, so each event-name line can
# be paired with its day as the file is walked through.
grep "$s2" "$1" >"$ff" &

exec 8<"$ff"                             # fd 8: the stream of Date: lines

{
while read ll
do
        # Event-name line found: see whether a new day starts here.
        ( echo "$ll" | grep "$s1" >/dev/null ) && {

                if [ "$id" -ge "0" ]
                then
                        # Pull the matching Date: line from the fifo.
                        read dd <&8

                        # Ensure only up to 'day' is seen!
                        dd="${dd%day*}"

                        # If the day differs from the previous one, emit an anchor.
                        [ "X$dd" == "X$dp" ] || {

                                dp="$dd"
                                (( ++id ))
                                echo -n "<a name=\"$id\"> "
                        }
                else
                        (( ++id ))
                fi
        }

        # Pass the original line through unchanged.
        echo "$ll"
done
} <&9

rm "$ff"


It works, but oh so slowly for the thousands of lines of html output by 
MS (:-((

I'm guessing that the slowest bit is the process start-up for:
( echo "$ll" | grep "$s1" >/dev/null )
which forks a subshell and a grep for every single line of input.
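
One thing I haven't tried yet is bash's built-in pattern matching, which 
does the same substring test without forking anything. A rough sketch of 
the idea (assuming $s1 only ever needs to match as a plain fixed 
substring, not as a regex):

#!/bin/bash
# Sketch only: per-line substring test using [[ ]], no external processes.
s1='<p class="BookNewsEventNames">'

while read ll
do
        if [[ "$ll" == *"$s1"* ]]
        then
                echo "matched: $ll"
        fi
done <"$1"

Whether that is really where all the time goes, I haven't measured.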


Ideas?
(Other than converting to C)

Cheers,
Martin

-- 
----------------
Martin Lomas
martin at ml1.co.uk
----------------


