[sclug] Character encoding

Sean Furey sean-lists-sclug at furey.me.uk
Thu Dec 7 14:54:47 UTC 2006


On Thu, Dec 07, 2006 at 02:44:27PM +0000, David Given wrote:
> When I have a web page containing a form, and the user types in some text and
> the form gets submitted, what encoding is the text in?
> 
> The answer, as far as I can work out, is 'whatever the user's web browser was
> set to'. And it *doesn't* tell you what that is. Which means, given that the
> user can change the encoding at any point, it's impossible to tell what
> character set the submitted text is in.
> 
> Am I wrong, or is this a horrific hole in the HTTP spec?

It'd be nice to answer saying that you're wrong because of <foo>, but
you're right, it's a bit of a mess.

http://ppewww.physics.gla.ac.uk/~flavell/charset/form-i18n.html has lots
about it, but it all boils down to "send UTF-8 out, expect UTF-8 back;
if it breaks, it breaks".  This covers most browsers, and there's simply
no useful solution for the ones it doesn't cover.
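
For what it's worth, a rough sketch of the "expect UTF-8 back" half in
Python might look like the following. It assumes the page was served as
UTF-8 and that the form arrives as an ordinary
application/x-www-form-urlencoded body; parse_form_body is just a name
made up for illustration, not part of any framework.

```python
from urllib.parse import parse_qs


def parse_form_body(body: bytes) -> dict:
    """Parse a urlencoded form body on the assumption that it is UTF-8.

    Hypothetical helper for illustration only.
    """
    # Browsers percent-encode non-ASCII bytes, so the body itself
    # should be plain ASCII; the escaped bytes are assumed to be UTF-8.
    text = body.decode("ascii")
    try:
        # Strict decoding: raises UnicodeDecodeError on invalid UTF-8.
        return parse_qs(text, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        # "If it breaks, it breaks": invalid bytes become U+FFFD
        # rather than causing the whole request to be rejected.
        return parse_qs(text, encoding="utf-8", errors="replace")


if __name__ == "__main__":
    print(parse_form_body(b"name=caf%C3%A9"))  # UTF-8 submission
    print(parse_form_body(b"name=caf%E9"))     # Latin-1 submission
```

The fallback only stops the request from blowing up; it can't recover
what the user actually typed, which is the "no useful solution" part.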

Sean

