[Glastonbury] html wc

Alan Pope alan.pope at gmail.com
Sat Jul 2 00:39:40 BST 2005


On 02/07/05, Henry Bennett <henry at hbennett.com> wrote:
> Ian,
> 
> I know this will likely piss you off at not being a linux thing but MS Word

*Microsoft* *Word*!? (Plus, of course, a massive proprietary operating
system to run it under.) How many meg of RAM and disk space to count
the words in a file?

Deary dear, that'll never do. :)

How about dumping the html through lynx (the text mode browser) and
counting the results.

alan at wopr:~ $  lynx --dump --nolist http://popey.com/blog/ | wc
    138     520    3813

Yup, that appears to work. The nolist parameter prevents it listing
URLs at the bottom of the output. The man page for wc (word count)
tells us that those numbers are "the number of newlines, words, and
bytes in files", so that page comes to 520 words.
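
If only the word figure is wanted, something along these lines might
do. It is a rough sketch: the function names count_words and html_words
are made up for illustration and are not part of lynx or wc. The -w
option asks wc for just the word count.

```shell
# count_words: read text on stdin and print only the word count.
# tr strips the padding that some wc implementations put before the number.
count_words() {
    wc -w | tr -d '[:blank:]'
}

# html_words (hypothetical helper): render a URL to plain text with lynx,
# leave out the link list, and count the words in what is left.
html_words() {
    lynx -dump -nolist "$1" | count_words
}

# Quick check on local text, no network or lynx needed.
printf 'one two three\nfour five\n' | count_words
```

With lynx installed, html_words http://popey.com/blog/ should print a
single number in place of the three columns.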

That help? Word indeed.. tsk! ;)

Cheers,
Al.


