List: MySQL and Perl
From: Dominic Mitchell
Date: February 25 2006 7:13pm
Subject: Re: UTF-8 support in DBD::mysql
Jan Kratochvil said:
> Hi,
>
> On Sat, 25 Feb 2006 11:33:08 +0100, Dominic Mitchell wrote:
> ...
>> It is a hack, but it's a useful one.
> ...
>> You could get into long details about the correct API for transcoding
>> automatically into the desired charset from whatever charset the
>> database has stored your data in.  But it smacks of overengineering,
>> and not making the common case simple.
>
> I was checking now that utf-8 looks really complicated enough to not to be
> fooled by random data as "false positive". I can report that my engine was
> getting MMSE (MMS Encapsulation - mobile phones binary format) data marked
> as utf-8 (and therefore failing binary decoding of bytes-oriented MMSE).

Well, UTF-8 is designed so that the longer the string, the smaller the
chance that something which is not UTF-8 gets identified as UTF-8.  The
Wikipedia article explains this better.

http://en.wikipedia.org/wiki/UTF-8#Advantages_and_disadvantages
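
To make that concrete, here is a rough sketch (in Python rather than Perl,
purely as an illustration) that estimates how often random bytes happen to
pass strict UTF-8 validation as the length grows.  The function names, trial
count and seed are my own choices for the example and have nothing to do with
DBD::mysql or the patch under discussion.

```python
import random


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def false_positive_rate(length: int, trials: int = 2000, seed: int = 0) -> float:
    """Estimate the fraction of random byte strings of the given
    length that pass UTF-8 validation (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    hits = 0
    for _ in range(trials):
        data = bytes(rng.randrange(256) for _ in range(length))
        if is_valid_utf8(data):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16):
        print(n, false_positive_rate(n))
```

Note that this only models uniformly random bytes.  Binary data that happens
to consist solely of bytes below 0x80 is always valid UTF-8, so a short or
mostly-ASCII binary blob can still be misidentified.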

Also, are you talking about data going into MySQL?  I'm not actually
concerned about that, only about retrieving it on the way out.

> Maybe I did there some other mistake or that previously attached patch
> of mine is broken (still does not look so to me).  Still the hassle
> around and unpredictable behavior on possibly random failing service for
> the clients prevented me from using it for real.

Well, this behaviour should be hidden behind a flag, so it can be turned
off if needed.  I'm not proposing that it be enabled by default.

> ...
>> > In fact I gave up and rather mark it utf-8 by hand from Perl when
>> > appropriate.
>>
>> That's exactly what I *don't* want to be doing.  I gave it UTF-8 -- it
>> should be able to give me UTF-8 back.
>
> You gave it utf-8 marker when it was really utf-8. It should give back
> utf-8 marker when it is really utf-8.

I think that's what I'm proposing.  :-)

-Dom

Thread
UTF-8 support in DBD::mysql, Dominic Mitchell, 24 Feb
  • Re: UTF-8 support in DBD::mysql, Jan Kratochvil, 24 Feb
    • Re: UTF-8 support in DBD::mysql, Dominic Mitchell, 25 Feb
      • Re: UTF-8 support in DBD::mysql, Jan Kratochvil, 25 Feb
        • Re: UTF-8 support in DBD::mysql, Dominic Mitchell, 25 Feb
  • [PATCH] Re: UTF-8 support in DBD::mysql, Dominic Mitchell, 7 Mar