[Sussex] Moot Attendance List

Karl E. Jorgensen karl at jorgensen.org.uk
Wed Jun 27 12:29:29 UTC 2007


On Tue, Jun 26, 2007 at 09:25:13PM +0100, Steve 'Dobbo' Dobson wrote:
> Hi Karl
> 
> On Tue, Jun 26, 2007 at 04:21:39PM +0100, Karl E. Jorgensen wrote:
> > Hi!
> > 
> > On Tue, Jun 26, 2007 at 03:00:02PM +0100, Sussex wrote:
> > > The following people have signed up to the Sussex Linux
> > > User Group meeting on Thursday 28 June 2007.
> > > 
> > ...
> > > Gavin Stevens
> > > Karl JÃ??rgensen
> > 
> > I think that the SLUg Auto Meeting mailer doesn't like me. It seems to 
> > suffer from a character set confusion...
> 
> I have to admit that when I designed the system I didn't consider
> names that are not ASCII-compliant.

Lots of people don't :-| And I guess that for most people it wouldn't 
actually matter either. But it usually catches me out.

> > My name *did* look right on the HTML form, but this
> > "Karl JÃ??rgensen" guy looks like my badly-translated UTF8 alter ego?
> 
> It's probably a configuration issue between the character encoding of
> the web page, the MySQL database and/or mail(1), which is used to send
> out the list.

I think you're on the right track here.

The web page http://www.sussex.lug.org.uk/moots.php is utf-8 according 
to the server response headers:
        HTTP/1.1 200 OK
        Date: Wed, 27 Jun 2007 10:56:02 GMT
        Server: Apache/2.0.53 (Fedora)
        Connection: close
        Content-Type: text/html; charset=utf-8

And (iirc) any form posted would follow the character set of the web 
page (assuming that the browser can grok it).
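
As a rough illustration (Python purely for convenience; the moot page itself 
is PHP), this is what the name would look like on the wire if the browser 
follows the page's utf-8 encoding. The helper name form_encode is made up for 
the example.

```python
from urllib.parse import quote_plus, unquote_to_bytes


def form_encode(value: str, page_charset: str = "utf-8") -> str:
    """Percent-encode a form value in the page's character set (hypothetical helper)."""
    return quote_plus(value, encoding=page_charset)


encoded = form_encode("Jørgensen")
print(encoded)                    # J%C3%B8rgensen
print(unquote_to_bytes(encoded))  # b'J\xc3\xb8rgensen'
```

With page_charset="latin-1" the same letter would arrive as the single byte 
%F8 instead of the two bytes %C3%B8.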

> > Any way to fix that? 
> 
> Probably, but I don't know off the top of my head.  But here's the
> appropriate code if you want to have a go.
> 
> The table definition is as follows:
>   CREATE TABLE `attendees` (
>       ...
>       PRIMARY KEY  (`name`)
>   ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

latin1 is MySQL's name for iso-8859-1 - standard Western European 
encoding.

> A new name is inserted into the database with the following SQL:
>   REPLACE INTO attendees (signedup, modified, created, name, attend, remote_IP)
>           VALUES (now(), now(), now(),
>                   '${_POST[name]}', 'Y', '$_SERVER[REMOTE_ADDR]')

This is where it goes wrong: the name in the HTTP POST request is utf-8 
encoded, and it goes straight into a latin1-encoded table.
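
A minimal sketch of that mismatch, in Python rather than PHP/MySQL: latin1 
maps every byte to exactly one character, so the two utf-8 bytes of 'ø' turn 
into two separate characters. The function name is hypothetical, and the 
last step (re-encoding on the way out) is only a guess at how the list ended 
up with four bytes where one letter should be.

```python
def read_as_latin1(raw: bytes) -> str:
    """Interpret bytes the way a latin1 column would: one byte, one character."""
    return raw.decode("latin-1")


posted = "Jørgensen".encode("utf-8")   # what a utf-8 form would send
stored = read_as_latin1(posted)        # what a latin1 table believes it holds
print(stored)                          # JÃ¸rgensen
print(len("Jørgensen"), len(stored))   # 9 10

# Guess: if the mislabelled text is converted to utf-8 again later on,
# the single letter ends up occupying four bytes.
print(stored.encode("utf-8"))          # b'J\xc3\x83\xc2\xb8rgensen'
```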

Compliant browsers should obey the "accept-charset" attribute of the FORM 
element:
    http://www.w3.org/TR/html4/interact/forms.html#form-data-set
which could be used for ensuring that the data are received by PHP in 
e.g. iso8859-1 (or is it ISO-8859-1?  Extra hyphens and case do matter 
sometimes...)

Unfortunately epiphany seems to be non-compliant in this respect... I 
suspect the same is the case for Firefox.

> And the list is emailed using the following:
>   ( echo "The following people have signed up to the Sussex Linux";
>     ...
>     echo "SELECT name FROM attendees WHERE  attend = 'Y' ORDER BY NAME;" |
>         mysql;
>     echo ;
>     ...
>   ) | mail -s "Moot Attendance List" sussex at mailman.lug.org.uk

A slightly different problem here: the mail (as it arrives on the list) 
doesn't specify a character encoding. I suspect that this makes most 
MUAs (at the receiving end) default to us-ascii - at least this 
explanation matches what I see...

The "mail" command listed above doesn't give it a clue about the 
character encoding for stdin, so it is understandable that it doesn't 
add a content-type header on the outgoing mail.

If sussex.lug.org.uk uses /usr/bin/mail from GNU MailUtils, then
    mail --append="content-type: text/plain; charset=utf-8"
should make it clear to receiving MUAs that it is utf8.
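
For comparison, here is a small Python sketch of a message that declares its 
charset explicitly. It is not what the moot script does (that pipes into 
mail); build_attendance_mail is a made-up name and the attendee list is just 
sample input.

```python
from email.message import EmailMessage


def build_attendance_mail(names):
    """Build a plain-text mail whose Content-Type header names utf-8 (hypothetical helper)."""
    msg = EmailMessage()
    msg["Subject"] = "Moot Attendance List"
    # set_content() adds the Content-Type header, including the charset parameter.
    msg.set_content("\n".join(names) + "\n", charset="utf-8")
    return msg


msg = build_attendance_mail(["Gavin Stevens", "Karl Jørgensen"])
print(msg["Content-Type"])
print(msg.get_content_charset())
```

A receiving MUA then has the same hint that the --append header would give it.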

This should work as far as the "attendance" mails are concerned, leaving 
the latent problem of the database thinking it's latin1 when it's really 
utf8 ...
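
As for that latent problem: text that was stored as utf-8 bytes but is handed 
back as latin1 can usually be recovered by reversing the mistake. A hedged 
Python sketch follows; repair_mislabelled is a hypothetical helper, and it 
deliberately leaves alone anything that doesn't look like mislabelled utf-8.

```python
def repair_mislabelled(text: str) -> str:
    """Undo utf-8 bytes that were read as latin1; return the input unchanged otherwise."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in latin1, or not valid utf-8 underneath: leave it be.
        return text


print(repair_mislabelled("Karl J\u00c3\u00b8rgensen"))  # Karl Jørgensen
print(repair_mislabelled("Gavin Stevens"))              # Gavin Stevens
```

A name that really is latin1 (a lone 'ø' byte) fails the utf-8 decode and so 
comes back untouched.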

> > PS: Changing my surname is not an option. Really...
> 
> But I notice that you have an ASCII-compliant form of your name, so why
> not use that?  

If you mean "Jorgensen", then yes. Sort of.

The letter 'ø' is sadly missing in the UK - although the vowel it 
represents is often used in Yorkshire: ever heard a Yorkshireman talking 
about "Errors"?  They pronounce it 'Ørrers' :-) Might be some Viking 
influence? Dunno...

> If that's acceptable I'll remove the "Karl JÃ??rgensen"
> entry from the database.

I'm happy for you to do that too.

Enjoy!
-- 
Karl E. Jorgensen
karl at jorgensen.org.uk  http://www.jorgensen.org.uk/
karl at jorgensen.com     http://karl.jorgensen.com
==== Today's fortune:
Absurdity, n.:
	A statement or belief manifestly inconsistent with one's own opinion.
		-- Ambrose Bierce, "The Devil's Dictionary"