[Gllug] Find non-7-bit characters in files

Rich Walker rw at shadow.org.uk
Thu Jun 16 17:30:45 UTC 2005


Richard Jones <rich at annexia.org> writes:

> Here's a small Thursday afternoon puzzler for everyone.
>
> I have a large number of files (HTML files in fact, not that it
> matters).  A clueless^Wevil web monkey^Wdesigner has hidden bytes in
> them that are in the range 0x80 - 0xff, so the files aren't valid
> UTF-8.
>
> I want to find those characters.  Preferably quickly from the command
> line.

LC_ALL=C grep -E "[$(printf '\200')-$(printf '\377')]" *
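
If the files are spread over a directory tree, the same idea can be wrapped
up roughly as below. This is only a sketch: the function name
find_high_bytes and the '*.html' filter are made up for illustration, and it
assumes a POSIX shell plus a grep that treats every byte as a valid
character in the C locale (recent GNU grep does; older versions may just
report "Binary file matches").

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative): print file:line:text for
# every line under a directory that contains a byte in the range 0x80-0xff.
find_high_bytes() {
    dir=${1:-.}
    # Build the bracket expression from the two raw bytes 0x80 and 0xff.
    pattern="[$(printf '\200')-$(printf '\377')]"
    # LC_ALL=C makes grep compare single bytes rather than multibyte
    # characters; /dev/null forces the file name prefix even when find
    # hands grep only one file.
    find "$dir" -type f -name '*.html' \
        -exec env LC_ALL=C grep -n "$pattern" /dev/null {} +
}
```

Running find_high_bytes /path/to/site then prints one file:line:text entry
per offending line, so the bad bytes can be tracked down file by file.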

cheers, Rich.

-- 
rich walker         |  Shadow Robot Company | rw at shadow.org.uk
technical director     251 Liverpool Road   |
need a Hand?           London  N1 1LX       | +UK 20 7700 2487
www.shadow.org.uk/products/newhand.shtml
-- 
Gllug mailing list  -  Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug



