[Gllug] unicode and cross site scripting vulnerabilities

Simon Stewart sms at lateral.net
Tue Feb 26 13:59:07 UTC 2002


On Tue, Feb 26, 2002 at 01:07:21PM +0000, Sean Burlington wrote:
> Hi All,
>    I need to make some dynamic web sites more suitable for
> internationalisation ...
> 
> but I also want to make sure that they are safe from cross site scripting 
> vulnerabilities ...
> 
> one way I sometimes make data safe is to replace or delete all chars except 
> say a-zA-Z0-9 
> 
> this means that I can be really sure that no awkward chars like quotes or <> 
> will hang around to break things.
> 
> As I understand it unicode complicates this situation in two ways...
> 
> 1) chars like 'the Chinese character for water' should be allowed
> 2) there are several different ways of specifying (say) the quote char
> 
> So. How do I get around this ?
> 
> Do I have to find out all the ways of representing any unsafe chars, and 
> replace/encode these?

If it helps, Perl 5.6 supports Unicode (or, more precisely, UTF-8),
although the support isn't terribly complete. Stick a "use utf8" at
the head of your program, and things will start to work more according
to plan. Having said that, \w etc. should all work as expected on wide
characters without any changes needing to be made[1].
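
To make the \w point concrete, here is a minimal sketch in Python 3
rather than Perl (my choice of language for illustration, not something
from the original question). Python's re module treats \w as "any
Unicode word character" for text strings, so stripping everything that
matches \W keeps letters from any script while dropping quotes and
angle brackets. The names NON_WORD and strip_non_word are hypothetical,
and note that \w also lets the underscore through.

```python
import re

# \W is the complement of \w; for str patterns in Python 3 this is
# Unicode-aware by default, so CJK letters count as word characters.
NON_WORD = re.compile(r"\W")


def strip_non_word(text):
    """Remove every character that is not a Unicode word character."""
    return NON_WORD.sub("", text)


if __name__ == "__main__":
    # The Chinese character for water survives; <, >, / and quotes do not.
    print(strip_non_word('<b>水</b> "quoted"'))
```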

I would expect Ruby to have some pretty impressive Unicode
capabilities too, given its origins. I've not played with the language
much at all, but I know that you can start the interpreter expecting a
UTF-8 source file (with "ruby -Ku", I believe).

Finally, having rooted through JDK 1.4's new regex classes, I can see
that they support things like \p and \P, much as Perl does, and Java's
strings are stored as Unicode in any case.
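
The idea behind \p and \P is to whitelist by Unicode property instead
of by a-zA-Z0-9. A rough sketch of that approach, again in Python 3 for
illustration only: the standard unicodedata module gives each
character's general category, so you can keep letters (L*), combining
marks (M*) and numbers (N*) and drop the rest. Normalising to NFKC
first is my own assumption about how to deal with the "several
different ways of specifying the quote char" problem, since it folds
compatibility forms such as fullwidth punctuation into their plain
equivalents. ALLOWED_MAJOR_CLASSES and keep_letters_and_digits are
made-up names.

```python
import unicodedata

# Assumption: letters, combining marks and numbers are the only
# character classes we want to let through.
ALLOWED_MAJOR_CLASSES = ("L", "M", "N")


def keep_letters_and_digits(text):
    """Drop every character whose general category is not L*, M* or N*."""
    # NFKC folds compatibility forms (e.g. fullwidth '<' and digits)
    # into their ordinary equivalents before classification.
    normalised = unicodedata.normalize("NFKC", text)
    return "".join(
        ch for ch in normalised
        if unicodedata.category(ch)[0] in ALLOWED_MAJOR_CLASSES
    )


if __name__ == "__main__":
    # Angle brackets, slash, space and quotes are all removed.
    print(keep_letters_and_digits('<b>水</b> "quoted"'))
```

This is the same "delete everything except known-safe characters"
approach as before, just with a wider notion of safe. Which categories
belong in the whitelist depends on the data you expect.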

Cheers,

Simon

[1] Cue Dean telling me I'm wrong.

-- 
Misspellers of the world untie!

-- 
Gllug mailing list  -  Gllug at linux.co.uk
http://list.ftech.net/mailman/listinfo/gllug



