[Gllug] sed script for removing lone new line characters
Dan Stevens (IAmAI)
dan.stevens.iamai at gmail.com
Fri Mar 3 15:32:08 UTC 2006
Thanks for your response. However, I think you may have misunderstood
my request. By 'lone new line character', I did not mean a lone blank
line...

...like that. Think of it in terms of a sequence of characters. For
example, with the example file you described, if we replaced the
literal new lines with the new line escape sequence, we would get:
Delete this line:\n\nDon't delete these lines:\n\n\nBecause there are > 1\n
Now, if you look at the actual number of 'new line characters' (\n),
you will see that there are two after 'Delete this line:', thus it is
not alone.
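One way to see the new line characters explicitly is to dump the text
one character at a time with od -c, which prints each new line as \n.
A minimal sketch: the function names (sample, show_newlines,
count_newlines) are made up for illustration, and the sample text is
rebuilt with printf rather than read from the real file.

```shell
#!/bin/sh
# Hypothetical helper: rebuild the sample text from the earlier message.
sample() {
    printf "Delete this line:\n\nDon't delete these lines:\n\n\nBecause there are > 1\n"
}

# Hypothetical helper: dump stdin character by character, so every
# new line character appears as a literal \n in the output.
show_newlines() {
    od -c
}

# Hypothetical helper: count the new line characters on stdin by
# deleting every other character and counting what is left.
count_newlines() {
    tr -cd '\n' | wc -c
}

sample | show_newlines
sample | count_newlines
```

In the od output, a new line that sits between two other characters is
a lone one, while a run of two or more \n in a row marks a paragraph
break.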
Perhaps I should give my own example of what I have now, and how I
would like it to look after processing.
$ cat before
I would like this line
and this line, on one line.

But I would like a blank line between this paragraph and the previous.

$ cat after
I would like this line and this line, on one line.

But I would like a blank line between this paragraph and the previous.
This reminds me that the lone new line character would also need to be
replaced with a space.
The reason I would like something like this is that I have a number of
large text files where paragraphs are separated by blank lines, but
the paragraphs themselves are spread over multiple lines (for word
wrapping purposes), without any blank lines in between (otherwise that
would denote a new paragraph). Basically, I want each individual
paragraph to be on a line of its own.
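The transformation described above could be sketched roughly as
follows. This is one possible approach under stated assumptions, not a
tested solution for the real files: it assumes GNU sed (for \n inside
a bracket expression, and for N printing the pattern space when there
is no next line), and it reads the whole input into pattern space
before substituting, so it may be slow or memory-hungry on very large
files. The function name join_paragraphs is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: replace each lone new line character with a
# space, leaving runs of two or more new lines untouched.
# Assumes GNU sed.
#
#   :a, N, $!ba  - keep appending input lines until the last line,
#                  so the whole input ends up in pattern space
#   :b, s///, tb - replace one new line that has a non-newline
#                  character on both sides, then repeat until no
#                  such new line is left
join_paragraphs() {
    sed -e ':a' -e 'N' -e '$!ba' \
        -e ':b' -e 's/\([^\n]\)\n\([^\n]\)/\1 \2/' -e 'tb'
}

# Example run on the "before" text from this message.
printf 'I would like this line\nand this line, on one line.\n\nBut I would like a blank line between this paragraph and the previous.\n' |
    join_paragraphs
```

The substitution is done in a loop rather than with the g flag because
g does not revisit characters it has already matched, so a line that is
only one character long would otherwise keep one of its new lines.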
Thanks again.
On 03/03/06, Steve Nelson <sanelson at gmail.com> wrote:
> On 3/3/06, Dan Stevens (IAmAI) <dan.stevens.iamai at gmail.com> wrote:
> > Could anyone advise me on a sed script for removing lone new line
> > characters, but not consecutive new lines of two or more?
>
> sanelson at smyslov:~$ cat test
> Delete this line:
>
> Don't delete these lines:
>
>
> Because there are > 1
>
> sanelson at smyslov:~$ sed '/^$/{N;/\n$/!D;}' < test
> Delete this line:
> Don't delete these lines:
>
>
> Because there are > 1
>
> How it works:
>
> First, we're interested in blank lines, so search for empty lines:
>
> --> /^$/
>
> Ok, once we match a blank line - i.e. we have "nothing" in pattern space -
> we enter {}. All commands enclosed in {} will be executed one by one.
>
> Now we need to add the following line to pattern space - we do that
> with N. N will add the following line, inserting a literal \n. This
> produces line1\nline2.
>
> --> N
>
> Now, because we are matching on a blank line, which I will call
> (nothing), after we append the following line, we have two
> possibilities:
>
> 1) Pattern space contains: (nothing)\nsometext
> 2) Pattern space contains: (nothing)\n(nothing)
>
> The second case is what happens if we have more than one consecutive blank line.
>
> Ok, so now let's check for this second case:
>
> --> /\n$/
>
> and also negate it - we are saying "If we do *not* match
> (nothing)\n(nothing), take action."
>
> Now, if we do not match (nothing)\n(nothing) this means that the
> following line contained text. This means that the line we matched
> was on its own.
>
> If this is the case delete it:
>
> --> D
>
> S.
> --
> Gllug mailing list - Gllug at gllug.org.uk
> http://lists.gllug.org.uk/mailman/listinfo/gllug
>
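For reference, the quoted one-liner can also be written as a commented
multi-line script. This is only a sketch, assuming GNU sed (which
accepts indented commands and # comments inside a script); the function
name strip_lone_blank_lines is made up for illustration. Note that it
deletes lone blank lines, as in the quoted message, rather than joining
wrapped lines within a paragraph.

```shell
#!/bin/sh
# Hypothetical wrapper around the quoted script: /^$/{N;/\n$/!D;}
# Assumes GNU sed.
strip_lone_blank_lines() {
    sed '
        # Act only when the current line is empty.
        /^$/ {
            # Append the next input line, separated by a new line.
            N
            # If pattern space does not end in a new line, the
            # appended line had text, so the blank line was on its
            # own: delete up to the first new line and restart the
            # cycle with the text that remains.
            /\n$/!D
        }
    '
}

# Example run on the test text from the quoted message.
printf "Delete this line:\n\nDon't delete these lines:\n\n\nBecause there are > 1\n" |
    strip_lone_blank_lines
```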