[Glastonbury] html wc
Andrew M.A. Cater
amacater at galactic.demon.co.uk
Sat Jul 2 12:55:34 BST 2005
On Sat, Jul 02, 2005 at 12:42:59PM +0100, Ian Dickinson wrote:
> Hi Al,
> > How about dumping the html through lynx (the text mode browser) and
> > counting the results.
> That's a good suggestion!
>
> > alan at wopr:~ $ lynx --dump --nolist http://popey.com/blog/ | wc
> > 138 520 3813
> >
> > Yup, that appears to work.
> Concur.
>
> > That help? Word indeed.. tsk! ;)
> Yes, thanks. That handles the html nicely.
>
> Any other suggestions out there? What I'm actually trying to do is wc
> a docbook document. So I can process the docbook to html, then use
> Al's method to wc that. That gives me a working solution, but I'm
> always interested to know if there are other tricks I'm missing.
>
> Thanks,
> Ian
>
The O'Reilly regular expressions book has lots of this sort of thing. I'm
fairly sure the author has an advanced word count/word match example, but
you a.) need to have the book and b.) need to be prepared to type in the
Perl. For anyone doing any sort of pattern matching this book is a must, IMHO.
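
If you'd rather skip the html step, a cruder pattern-matching trick is to
strip the tags straight out of the docbook source and wc what is left. This
is only a rough sketch, not something from the book: docbook_wc is a made-up
name, and it assumes no '>' turns up inside attribute values or comments. It
also counts entities such as &amp; as ordinary text.

```shell
# Hypothetical helper: rough word count for a DocBook/XML/HTML file.
# Assumes no '>' inside attribute values or comments; entities are not decoded.
docbook_wc() {
    # Join all lines first so tags that span a line break are still matched,
    # then replace every <...> tag with a space so neighbouring words
    # do not run together, and let wc count what remains.
    tr '\n' ' ' < "$1" | sed -e 's/<[^>]*>/ /g' | wc -w
}
```

Run it as "docbook_wc mydoc.xml" (the file name is just an example). The
lynx route is probably the safer one for anything with entities or odd
markup, since lynx actually parses the document.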
Andy
> _______________________________________________
> Glastonbury Linux User Group mailing list
> Glastonbury at mailman.lug.org.uk
> http://mailman.lug.org.uk/mailman/listinfo/glastonbury
>
> User group website: http://www.lugog.org.uk/