[sclug] URL encoding/decoding question

pieter claassen pieter at claassen.co.uk
Sun Feb 19 15:58:38 UTC 2006


Hi,

I am trying to edit some database stored fields containing HTML with a
JSP page.

I store the data URL-encoded (UTF-8) in the db and decode it before
rendering. The exception is URL parameter strings that contain HTML:
those are re-encoded so that they don't break the rendering of links in
the browser.

My questions:
1. I assume it makes no difference whether I store data encoded or not
in the DB? The reason I went for encoding was in case there were some
values that would screw the SQL insertion up (like "). Encoding and
decoding a string should result in exactly the same value?
2. For some reason when I try to encode the " % " characters (space%
space), I get an encoded value of "+%25+" in the database but when I try
to decode this value, I get:

java.lang.IllegalArgumentException: URLDecoder: Incomplete trailing
escape (%) pattern

3. A big problem is the encoding of "&euro;" strings, which give me
"%26euro%3B" in the database and are then rendered by the browser inside
a textarea block as € (the euro sign). The problem is that if I encode
this symbol I get "%C3%A2%C2%82%C2%AC", which in turn re-encodes to
"%C3%83%C2%A2%C3%82%C2%82%C3%82%C2%AC".
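
One possible explanation for the exception in question 2, offered only
as a sketch: the stored value "+%25+" decodes fine once, but if the
result " % " is passed through URLDecoder a second time (for example
because some other layer has already decoded it), the lone "%" has no
two hex digits after it and the decoder rejects it. Whether a second
decode actually happens in the JSP page is an assumption; the class and
helper names below are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class DoubleDecodeSketch {
    // Same call as in the post, with the checked exception wrapped.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    static String dec(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if URLDecoder rejects the value with IllegalArgumentException.
    static boolean decodeFails(String s) {
        try {
            dec(s);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String stored = enc(" % ");       // what goes into the database
        String decoded = dec(stored);     // first decode: fine
        System.out.println("[" + stored + "]");    // [+%25+]
        System.out.println("[" + decoded + "]");   // [ % ]
        System.out.println(decodeFails(stored));   // false
        System.out.println(decodeFails(decoded));  // true: second decode fails
    }
}
```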

How do I get text in the textarea to be decoded to HTML values and then
re-encoded before insertion in the DB to the same UTF-8 value? Why does
this happen?
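
The two encoded values in question 3 can be reproduced by taking the
UTF-8 bytes of the euro sign and reading them back as ISO-8859-1 before
encoding again. The sketch below only simulates that misreading; where
it would happen in the real page (for instance, form parameters being
parsed with a default charset) is an assumption, and the class and
method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EuroMojibakeSketch {
    // Same call as in the post, with the checked exception wrapped.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Simulates a layer that receives UTF-8 bytes but treats each byte
    // as one ISO-8859-1 character.
    static String misreadAsLatin1(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String euro = "\u20AC";
        String once = misreadAsLatin1(euro);
        String twice = misreadAsLatin1(once);
        System.out.println(enc(euro));   // %E2%82%AC
        System.out.println(enc(once));   // %C3%A2%C2%82%C2%AC
        System.out.println(enc(twice));  // %C3%83%C2%A2%C3%82%C2%82%C3%82%C2%AC
    }
}
```

Each view-and-save cycle that misreads the bytes adds another layer,
which would match the value growing on every re-encode.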


The java encoding and decoding calls are:

URLEncoder.encode(data,"UTF-8");
URLDecoder.decode(data,"UTF-8");
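
As a quick check of question 1, a minimal sketch (hypothetical class
name, sample strings taken from this post) that encodes and decodes
exactly once with the two calls above and compares the result with the
input. It says nothing about what the browser or the database layer
does in between.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class RoundTripSketch {
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    static String dec(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if one encode followed by one decode returns the input.
    static boolean roundTrips(String s) {
        return dec(enc(s)).equals(s);
    }

    public static void main(String[] args) {
        String[] samples = { " % ", "say \"hi\"", "&euro;", "\u20AC" };
        for (String s : samples) {
            System.out.println(enc(s) + " -> " + roundTrips(s));
        }
    }
}
```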

The page encoding:
<%@ page contentType="text/html; charset=UTF-8" %>

The bottom line is that if I decode HTML, view it in a textarea and
re-encode it, the result is not the same as it was before.

Any comments appreciated.

Cheers,
Pieter


