List: Internals
From: Olaf van der Spek
Date: June 10 2005 8:02pm
Subject: Re: Text Encoding
On 6/10/05, Sergei Golubchik <serg@stripped> wrote:
> Hi!
> 
> On Jun 10, Hagen Höpfner wrote:
> > Sergei Golubchik schrieb:
> > >>
> > >>If I have a table with a utf8-coded char(1) attribute and insert
> > >>the value "a", it is stored as 0x61 20 20. The output of "SHOW
> > >>CHARACTER TYPES" mentions a maximal length. However, it seems that
> > >>this maximal length is used even if one byte (as here) would be
> > >>enough. I think this is based on the idea of handling possible
> > >>updates (e.g. a->?) more efficiently, but why do you call the
> > >>character length maximal if it is used all the time?
> > >
> > >Check the manual for the difference between CHAR and VARCHAR (in
> > >Column types), and static vs. dynamic row format (in MyISAM section)
> > >
> > That's not the question ;-) I know about varchar / char columns. What I
> > am wondering about is the length of an attribute value, not of a row. A
> > Latin1-coded "a" requires one byte, 0x61. If I use UTF-8 instead, it
> > requires 3 bytes, 0x61 20 20. I know that the original UTF-8 encoding
> > allows different numbers of bytes for different kinds of symbols. An "Ä",
> > for example, requires 2 bytes to be represented in UTF-8. In the
> > SHOW-CHARACTER-TYPES output, the length of ONE character in different
> > encodings is shown. In fact, UTF-8 requires (in theory) at most 3 bytes;
> > that's correct. But the "a" actually stored as UTF-8 uses 3 bytes. So why
> > do you call the length maximal 3 bytes if it is always 3 bytes?
> 
> CHAR() columns always have the same length - in bytes.
> As "a" takes only one byte it's space-padded to 3 bytes.

And then it's truncated to one character when it's retrieved again?
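
The padding discussed above can be sketched in a few lines of Python. This is only a toy model, not MySQL's storage code: the function names `pad_char_column` and `read_char_column` are made up for illustration, the 3-bytes-per-character limit is taken from the thread, and the read side assumes that trailing spaces are stripped when a CHAR value is retrieved.

```python
def pad_char_column(value: str, n_chars: int, max_bytes_per_char: int = 3) -> bytes:
    """Model a fixed-width CHAR(n_chars) slot: UTF-8 encode, then space-pad.

    Hypothetical helper; the slot width is n_chars * max_bytes_per_char bytes,
    regardless of how many bytes the value actually needs.
    """
    if len(value) > n_chars:
        raise ValueError("value is longer than the column allows")
    width = n_chars * max_bytes_per_char
    encoded = value.encode("utf-8")
    # Fill the unused bytes of the slot with 0x20 (space).
    return encoded.ljust(width, b" ")


def read_char_column(stored: bytes) -> str:
    """Model retrieval: drop the trailing pad bytes, then decode (assumption)."""
    return stored.rstrip(b" ").decode("utf-8")


if __name__ == "__main__":
    for ch in ("a", "Ä"):
        slot = pad_char_column(ch, 1)
        print(repr(ch), len(ch.encode("utf-8")), "byte(s) needed, stored as", slot.hex())
        print("  read back:", repr(read_char_column(slot)))
```

In this model the slot for char(1) is always 3 bytes wide, so "maximal" describes the encoding (a character needs between 1 and 3 bytes), while the fixed 3 bytes describe the column's storage. A value that itself ends in spaces would lose them on the read side of this sketch.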
Thread
Text Encoding, Hagen Höpfner, 9 Jun
  • Re: Text Encoding, Sergei Golubchik, 10 Jun
    • Re: Text Encoding, Hagen Höpfner, 10 Jun
      • Re: Text Encoding, Sergei Golubchik, 10 Jun
        • Re: Text Encoding, Hagen Höpfner, 10 Jun
          • Re: Text Encoding, Sergei Golubchik, 10 Jun
            • Re: Text Encoding, Olaf van der Spek, 10 Jun
              • Re: Text Encoding, Sergei Golubchik, 11 Jun