[Sussex] Moot Attendance List

Steve 'Dobbo' Dobson steve at dobson.org
Wed Jun 27 15:06:10 UTC 2007


On Wed, Jun 27, 2007 at 12:31:27PM +0100, Karl E. Jorgensen wrote:
> On Tue, Jun 26, 2007 at 09:25:13PM +0100, Steve 'Dobbo' Dobson wrote:
> > On Tue, Jun 26, 2007 at 04:21:39PM +0100, Karl E. Jorgensen wrote:
> > > On Tue, Jun 26, 2007 at 03:00:02PM +0100, Sussex wrote:
> > > > The following people have signed up to the Sussex Linux
> > > > User Group meeting on Thursday 28 June 2007.
> > > > 
> > > ...
> > > > Gavin Stevens
> > > > Karl JÃ??rgensen
> > > 
> > > I think that the SLUG Auto Meeting mailer doesn't like me. It seems to 
> > > suffer from a character set confusion...
> > 
> > I have to admit that when I designed the system I didn't consider
> > non-ASCII names.
> 
> Lots of people don't :-| And I guess that for most people it wouldn't 
> actually matter either. But it usually catches me out.
> 
> > > My name *did* look right on the HTML form, but this
> > > "Karl JÃ??rgensen" guy looks like my badly-translated UTF8 alter ego?
> > 
> > It's probably a configuration issue between the character coding of
> > the webpage, MySQL database and/or mail(1) which is used to send out the
> > list.
> 
> I think you're on the right track here.
> 
> The web page http://www.sussex.lug.org.uk/moots.php is utf-8 according 
> to the server response headers:
>         HTTP/1.1 200 OK
>         Date: Wed, 27 Jun 2007 10:56:02 GMT
>         Server: Apache/2.0.53 (Fedora)
>         Connection: close
>         Content-Type: text/html; charset=utf-8
> 
> And (iirc) any form posted would follow the character set of the web 
> page (assuming that the browser can grok it).
> 
> > > Any way to fix that? 
> > 
> > Probably, but I don't know off the top of my head.  But here's the appropriate
> > code if you want to have a go.
> > 
> > The table definition is as follows:
> >   CREATE TABLE `attendees` (
> >       ...
> >       PRIMARY KEY  (`name`)
> >   ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
> 
> latin1 is MySQL's name for iso-8859-1 - standard Western European 
> encoding.

> > A new name is inserted into the database with the following SQL:
> >   REPLACE INTO attendees (signedup, modified, created, name, attend, remote_IP)
> >           VALUES (now(), now(), now(),
> >                   '${_POST[name]}', 'Y', '$_SERVER[REMOTE_ADDR]')
> 
> It would go wrong here: The name in the HTTP Post request would be utf-8 
> encoded - going straight into a latin1 encoded table.
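
The mismatch is easy to reproduce outside the web/MySQL stack. The
following is only a rough sketch in Python: the spelling "Jørgensen" is
my assumption of the intended name, and the variable names are made up
for illustration. It encodes the name the way a browser would post it
from a utf-8 page, then reads the same bytes as latin1.

```python
# Assumed spelling of the name; the list only ever showed a mangled form.
name = "Jørgensen"

# What a browser posts from a utf-8 page: "ø" becomes two bytes, C3 B8.
posted = name.encode("utf-8")

# What a latin1 consumer makes of those bytes: one character per byte.
misread = posted.decode("latin-1")

print(posted)    # b'J\xc3\xb8rgensen'
print(misread)   # JÃ¸rgensen

# The damage is reversible as long as the bytes themselves are untouched.
print(misread.encode("latin-1").decode("utf-8") == name)  # True
```

The "Ã" at the front of the misread form matches the start of what
turned up on the list.
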
> 
> Compliant browsers should obey the "enctype" attribute of the FORM 
> element:
>     http://www.w3.org/TR/html4/interact/forms.html#form-data-set
> which could be used for ensuring that the data are received by PHP in 
> e.g. iso8859-1 (or is it ISO-8859-1?  Extra hyphens and case do matter 
> sometimes...)
> 
> Unfortunately epiphany seems to be non-compliant in this respect... I 
> suspect the same is the case for Firefox.
> 
> > And the list is emailed using the following:
> >   ( echo "The following people have signed up to the Sussex Linux";
> >     ...
> >     echo "SELECT name FROM attendees WHERE  attend = 'Y' ORDER BY NAME;" |
> >         mysql;
> >     echo ;
> >     ...
> >   ) | mail -s "Moot Attendance List" sussex at mailman.lug.org.uk
> 
> A slightly different problem here: The mail (as it arrives on the list) 
> doesn't specify a character encoding. I suspect that this would make 
> most MUAs (at the receiving end) default to us-ascii - at least this 
> explanation matches what I see...
> 
> The "mail" command listed above doesn't give it a clue about the 
> character encoding for stdin, so it is understandable that it doesn't 
> add a content-type header on the outgoing mail.
> 
> If sussex.lug.org.uk uses /usr/bin/mail from GNU MailUtils, then
>     mail --append="content-type: text/plain; charset=utf-8"
> should make it clear to receiving MUAs that it is utf8.

That should be easy to do.
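
For anyone who wants to see what the resulting message should look like
without touching the server, here is a small Python sketch. It is not
what the mailer script does (that stays a shell pipeline into mail); it
just builds an equivalent message with the standard library. The body
text is a stand-in for the mysql output and the spelling of Karl's name
is assumed.

```python
from email.message import EmailMessage

# Stand-in for the list produced by the mysql query (assumed spelling).
body = "Gavin Stevens\nKarl Jørgensen\n"

msg = EmailMessage()
msg["Subject"] = "Moot Attendance List"

# set_content() on a str uses charset utf-8 unless told otherwise and
# adds the Content-Type header itself.
msg.set_content(body)

# Show the header that tells a receiving MUA how to decode the body.
print(msg["Content-Type"])
print(msg.get_content_charset())
```

A receiving MUA that sees a utf-8 charset in that header has no reason
to fall back to ASCII.
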

> This should work as far as the "attendance" mails are concerned, leaving 
> the latent problem of the database thinking it's latin1 when it's really 
> utf8 ...

I have no problem dropping the table and creating it afresh after this
month's moot.  If the database table is also in UTF8 then that should 
fix the problem.  As a DBA [and the person with the problem name ;-)]
can you find out what the setting is for UTF8 in MySQL 4.1.10a (which is
the version of MySQL on the lug.org.uk server)?

If I add the content-type header to the outgoing mail as well, then the
whole process will be using UTF8, which gives it a much better chance of
working.
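
Karl's point that the header alone fixes the mails while the table stays
mislabelled can be sketched like this. It is a Python illustration
only, assuming nothing between the form and the mail converts the bytes,
and again assuming the spelling of the name.

```python
name = "Jørgensen"               # assumed spelling
stored = name.encode("utf-8")    # bytes as posted, passed through as-is

# The table's view: latin1 means one character per byte, so the name
# looks one character longer than it really is.
db_view = stored.decode("latin-1")
print(len(name), len(db_view))   # 9 10

# The mail carries the same bytes. Declared as utf-8 they decode
# correctly; read as ASCII both bytes of "ø" are undecodable.
as_utf8 = stored.decode("utf-8")
as_ascii = stored.decode("ascii", errors="replace")
print(as_utf8 == name)           # True
print(as_ascii.count("\ufffd"))  # 2
```

So the mails would come out right, but anything done inside the
database (lengths, sorting, comparisons) would still be working on the
wrong characters until the table itself is UTF8.
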

> > If that's acceptable I'll remove the "Karl JÃ??rgensen"
> > entry from the database.
> 
> I'm happy for you to do that too.

Well, we are close to a proper solution here, so let's see if we can get it
right first.

Steve