Hi!
On Jun 10, Hagen Höpfner wrote:
> Sergei Golubchik schrieb:
> >>
> >>If I have a table with a utf8-coded char(1) attribute and insert
> >>the value "a", this is stored as 0x61 20 20. The output of "SHOW
> >>CHARACTER TYPES" mentions a maximal length. However, it seems that
> >>this maximal length is used even if one byte (like here) would be
> >>enough. I think this is based on the idea of handling possible
> >>updates (e.g. a->?) more efficiently, but why do you call the
> >>character length maximal if it is used all the time?
> >
> >Check the manual for the difference between CHAR and VARCHAR (in
> >Column types), and static vs. dynamic row format (in MyISAM section)
> >
> That's not the question ;-) I know about varchar / char columns. What I
> am wondering about is the length of an attribute value, not of a row. A
> Latin1-coded "a" requires one byte, 0x61 ... if I use UTF8 instead, it
> requires 3 bytes, 0x61 20 20. I know that the original UTF-8 encoding
> allows different byte counts for different kinds of symbols. An "Ä", for
> example, requires 2 bytes to be represented in UTF-8. In the
> SHOW-CHARACTER-TYPES output, the length for ONE character in different
> encodings is shown. In fact, UTF-8 requires (in theory) at most 3 bytes,
> that's correct. But the actually stored UTF-8 "a" uses 3 bytes. So, why do
> you call the length maximal 3 bytes if it is always 3 bytes?
CHAR() columns always have the same length - in bytes.
As "a" takes only one byte, it's space-padded to 3 bytes.
Try, e.g., VARCHAR(1) (be sure to force the table into dynamic row
format), and you'll see that "a" takes only one byte (not counting the
stored length); it won't be space-padded.
If you had CHAR(2) and inserted "aa", you'd see "aa    "
(0x61 61 20 20 20 20), and not "a  a  " (0x61 20 20 61 20 20).
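To illustrate the padding rule, here is a minimal Python sketch. It is only a
model of the behaviour described above, not MySQL code: the function name
stored_char_bytes and its parameters are made up for illustration, and it
assumes utf8 with at most 3 bytes per character.

```python
def stored_char_bytes(value: str, n: int, max_bytes_per_char: int = 3) -> bytes:
    """Model how a CHAR(n) utf8 value is laid out in a fixed-length row.

    Hypothetical helper: the column reserves n * max_bytes_per_char bytes,
    the value is encoded as a whole, and the remainder is filled with
    spaces (0x20) at the end, not after each character.
    """
    if len(value) > n:
        raise ValueError("value has more characters than CHAR(n) allows")
    encoded = value.encode("utf-8")
    width = n * max_bytes_per_char  # fixed column width in bytes
    if len(encoded) > width:
        raise ValueError("encoded value exceeds the fixed column width")
    return encoded.ljust(width, b" ")  # trailing space padding


if __name__ == "__main__":
    for text, n in [("a", 1), ("Ä", 1), ("aa", 2)]:
        # bytes.hex(sep) needs Python 3.8 or later
        print(f"CHAR({n}) {text!r}: {stored_char_bytes(text, n).hex(' ')}")
```

In this model, "a" in CHAR(1) comes out as 61 20 20, "Ä" as c3 84 20, and
"aa" in CHAR(2) as 61 61 20 20 20 20.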
Regards,
Sergei
--
Sergei Golubchik <serg@stripped>
MySQL AB, Senior Software Developer
Osnabrueck, Germany
www.mysql.com