[Gllug] Find non-7-bit characters in files

Richard Jones rich at annexia.org
Thu Jun 16 17:02:44 UTC 2005


Here's a small Thursday afternoon puzzler for everyone.

I have a large number of files (HTML files in fact, not that it
matters).  A clueless^Wevil web monkey^Wdesigner has hidden stray
bytes in them that are in the range 0x80 - 0xff, so the files aren't
valid UTF-8.

I want to find those characters.  Preferably quickly from the command
line.
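
To make the goal concrete, here is a rough sketch of the kind of scan
I mean, in Python rather than the one-liner I'm hoping for.  The
function name find_high_bytes and the path:line:column output format
are just assumptions for illustration, and it flags every byte >= 0x80
without checking whether it belongs to a valid UTF-8 sequence.

```python
import sys


def find_high_bytes(data):
    """Yield (line_number, column, byte_value) for each byte >= 0x80.

    `data` is the raw content of a file as bytes; lines are split on
    b"\n" and both line and column numbers start at 1.
    """
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte >= 0x80:
                yield lineno, col, byte


if __name__ == "__main__":
    # Read each file in binary mode so that no decoding is attempted.
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            for lineno, col, byte in find_high_bytes(f.read()):
                print(f"{path}:{lineno}:{col}: 0x{byte:02x}")
```

Saved under a hypothetical name such as findhigh.py, it would be run
as "python3 findhigh.py *.html".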

I tried various combinations of egrep with the [:print:] character
class, but to no avail.

Help!

Rich.

-- 
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



