[sclug] URL encoding/decoding question

Roland Turner SCLUG raz.fpyht.bet.hx at raz.cx
Thu Feb 23 15:24:34 UTC 2006


On Thu, 2006-02-23 at 14:07 +0000, Pieter Claassen wrote:
> 
> > Umm, how long ago; before or after reloading the database to change
> > encoding?
> >
> This was before reloading it so you might be right that an alter statement
> might not be enough and that data will have to be reloaded.

{{ I understand that at this exact second, stuff is broken; my next few
questions relate to the prior period. }}

OK, this doesn't make sense. If you had already switched to BLOB, then
why did you care what interpretation the database was applying? What
would be the point in switching the character encoding in the on-disk
files and in playing with session character set/encoding parameters if
you were already using BLOB anyway?

I also thought you said that you had actually reloaded. It should not
have been possible to switch from TEXT to BLOB by having an ALTER TABLE
change the type of a column without reloading the table anyway. Can we
make certain that we're talking about the same thing please:

- You are now (and have been for at least a couple of steps) storing the
data as a BLOB?

- You are now handing the data to, and receiving it from, the database
as sequences of bytes (byte[]/InputStream/OutputStream) not sequences of
characters (char[]/String/Reader/Writer)?

> The issue at the moment is that mysql is broken and I cannot recreate the
> round trip until I have fixed it.

Understood.

> I upgraded mysql to 4.1 because it was
> clear that it had better unicode support that 4.0 (which under
> ubuntu/debian does not have utf8 support compiled in).

OK, this may have been an unfortunate choice. Given that you're already
using BLOB, you _really_ don't care about the database's character
set/encoding support.

This doesn't mean that you can't make things worse by tinkering
though :-)

> Tomcat seems to use preparedstatement internally for authentication so
> there is no real option other than solving the problem.
> 
> 
> > You've misunderstood, it's not a question of not _supporting_ it,
> > prepared statement execution is typically _faster_ than than using the
> > query cache and, as a result, MySQL doesn't even refer to its query
> > cache when you are executing a prepared statement.
> >
> 
> As I mentioned above, mysql-server  4.1.12-1ubuntu3.1 with
> mysql-connector-java-[3.1.12|5.0.0-beta|5.0.0-nightly-20060223] fails to
> substitute the variable values in the preparedstatement line.

OK. Have you done this with a minimal test? In particular, have you done
it where you are _not_ tweaking any character set/encoding related
settings on the database? It is entirely possible that you've changed
the query encoding behind the driver's back and that it is therefore
mis-recognising the characters in the query and thus unable to do
anything with it.

Is there any reason for not reverting to MySQL 4.0? Is there any reason
that you're not using a mysql-connect-java-4.1.* with 4.1?

> setString(...) fails.

What error/exception are you getting when you call setString()? (If
none, then it isn't failing.)

> I suspect that it is because it is caching the
> prepare statement  and then when executing it (pulls the statement from
> the cache)

You appear to be mixing two seperate concepts.

- MySQL's query cache is a server-side mapping from literal
non-parametric query text to a result set. To use the query cache, the
entire text of the query must be transmitted to the server on each
invocation, must be character-for-character identical to one that is
already in the cache and must not have been invalidated by an update to
one or more source tables.

- MySQL also maintains prepared (compiled) statements, parametric or
otherwise, per-session. The documentation does not indicate that result
set is cached. When you execute a PreparedStatement, only a handle is
sent to the server, plus a binary presentation of any arguments required
to satisfy the parameters.

My point is that "executing it (pulls the statement from the cache)"
suggests to me that you may be confusing the query cache with the
per-session collection of prepared statements. The documentation does
not appear to use the word "cache" when describing the latter.

> tries to execute the literal SQL query.

Do you have any evidence for this? (You've not provided any error
messages.)

> The MySQL recommended
> methods to disable sql caching still results in mysql reporting that the
> cach is enabled but just 0 bytes large. Not sure what that means.

Presumably an implementation hack. If it's size is 0, it can't very well
be holding anything.

> 9 Prepare     [1] SELECT SQL_NO_CACHE data FROM content WHERE lang=?
> 9 Execute     [1] SELECT SQL_NO_CACHE data FROM content WHERE lang=?

I'm not sure what you're showing here. Is it possible that you're
assuming that because the query is being logged in parametric form that,
therefore, the parameters have not been satisfied? If so, note that this
is not possible; if there are unsatisfied parameters, execute() will
fail and tell you that this is so.

> > I don't understand why you are raising this now though; what has it to
> > do with your character encoding problem?
> 
> 
> > Where are those Ethereal dumps? These will tell us exactly what's going
> > on, rather than messing about with random tinkering. (Empirical method:
> > measure first.)
> 
> I will produce them (and have been trying to) as soon as I have a db that
> can do preparedstatement otherwise they are meaningless.
> 
> I just tested and the problem does not exist in MySQL 5.0.18.

Does this mean that we can get back to solving your problem now?

(Please, no more arbitrarily replacing the building's foundations while
we're trying to debug the alarm system :-))

- Raz



More information about the Sclug mailing list