From: Kevin Lewis Date: February 18 2009 8:15pm Subject: Re: Patch for bug#42208 List-Archive: http://lists.mysql.com/falcon/556 Message-Id: <499C6C74.1050609@sun.com> MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset=windows-1252 Content-Transfer-Encoding: 7BIT Ann W. Harrison wrote: > Kevin Lewis wrote: >> >> Well, here is the MySQL (read MyISAM) implemetation; >> >> 1) BINARY fixed length fields are always padded out to the fixed >> length. Thus for a BINARY(5) column; 0x00, 0x0000, 0x000000 and >> 0x00000000 are all equal because they are all stored and sorted as >> 0x0000000000. >> >> 2) VARBINARY fields must take into account the actual length. >> Thus for a VARBINARY column; 0x00, 0x0000, 0x000000 and 0x00000000 are >> all different and sorted by length. >> > > And if you compare a binary with a varbinary? Are constants fixed > or varying? The are never equal unless the VARBINARY has exactly the number of bytes and the same bytes as the BINARY. This means that a join of a BINARY(5) column with a BINARY(6) will never find anything. :) > I realize that the MyISAM implementation is definitive by definition, > but I wish it had been better thought out. At least with strings, > 'a' = 'a', regardless of whether they're stored as char(20) or > varchar(20). Both CHAR and VARCHAR trim 0x00 and the space character for the CHARSET before comparison. So 'a' = 'a' whether CHAR(5) or VARCHAR. But now that you ask, here is the previous test using CHAR and VARCHAR. Just like with BINARY and VARBINARY, it does OK until it does a LIMIT query. Then the NULL is mixed up with the zeros (which are and should be equal to each other, but not equal to null). mysql> drop table t1; Query OK, 0 rows affected (0.09 sec) mysql> create table t1 (f1 CHAR(5), f2 VARCHAR(5), f3 char(10), -> key(f1), key(f2), key(f1,f2,f3)) engine=falcon; Query OK, 0 rows affected (0.06 sec) mysql> insert into t1 values ('B', 'B', 'B'); Query OK, 1 row affected (0.02 sec) mysql> insert into t1 values ('A', 'A', 'A'); Query OK, 1 row affected (0.00 sec) mysql> insert into t1 values (0x0000000000, 0x0000000000, 'five'); Query OK, 1 row affected (0.00 sec) mysql> insert into t1 values (0x00000000, 0x00000000, 'four'); Query OK, 1 row affected (0.00 sec) mysql> insert into t1 values (0x000000, 0x000000, 'three'); Query OK, 1 row affected (0.00 sec) mysql> insert into t1 values (0x0000, 0x0000, 'two'); Query OK, 1 row affected (0.00 sec) mysql> insert into t1 values (0x00, 0x00, 'one'); Query OK, 1 row affected (0.02 sec) mysql> insert into t1 values ('', '', 'none'); Query OK, 1 row affected (0.00 sec) mysql> insert into t1 values (null, null, 'null'); Query OK, 1 row affected (0.00 sec) mysql> select f1, hex(f1), f2, hex(f2), f3 from t1; +-------+------------+-------+------------+-------+ | f1 | hex(f1) | f2 | hex(f2) | f3 | +-------+------------+-------+------------+-------+ | B | 42 | B | 42 | B | | A | 41 | A | 41 | A | | | 0000000000 | | 0000000000 | five | | | 00000000 | | 00000000 | four | | | 000000 | | 000000 | three | | | 0000 | | 0000 | two | | | 00 | | 00 | one | | | | | | none | | NULL | NULL | NULL | NULL | null | +-------+------------+-------+------------+-------+ 9 rows in set (0.00 sec) mysql> select f1, hex(f1), f2, hex(f2), f3 from t1 order by f1; +-------+------------+-------+------------+-------+ | f1 | hex(f1) | f2 | hex(f2) | f3 | +-------+------------+-------+------------+-------+ | NULL | NULL | NULL | NULL | null | | | 0000000000 | | 0000000000 | five | | | 00000000 | | 00000000 | four | | | 000000 | | 000000 | three | | | 0000 | | 0000 | two | | | 00 | | 00 | one | | | | | | none | | A | 41 | A | 41 | A | | B | 42 | B | 42 | B | +-------+------------+-------+------------+-------+ 9 rows in set (0.00 sec) mysql> select f1, hex(f1), f2, hex(f2), f3 from t1 order by f2; +-------+------------+-------+------------+-------+ | f1 | hex(f1) | f2 | hex(f2) | f3 | +-------+------------+-------+------------+-------+ | NULL | NULL | NULL | NULL | null | | | 0000000000 | | 0000000000 | five | | | 00000000 | | 00000000 | four | | | 000000 | | 000000 | three | | | 0000 | | 0000 | two | | | 00 | | 00 | one | | | | | | none | | A | 41 | A | 41 | A | | B | 42 | B | 42 | B | +-------+------------+-------+------------+-------+ 9 rows in set (0.00 sec) mysql> select f1, hex(f1), f2, hex(f2), f3 from t1 order by f1,f2,f3 ; +-------+------------+-------+------------+-------+ | f1 | hex(f1) | f2 | hex(f2) | f3 | +-------+------------+-------+------------+-------+ | NULL | NULL | NULL | NULL | null | | | 0000000000 | | 0000000000 | five | | | 00000000 | | 00000000 | four | | | 000000 | | 000000 | three | | | 0000 | | 0000 | two | | | 00 | | 00 | one | | | | | | none | | A | 41 | A | 41 | A | | B | 42 | B | 42 | B | +-------+------------+-------+------------+-------+ 9 rows in set (0.00 sec) mysql> select f1, hex(f1), f2, hex(f2), f3 from t1 order by f1,f2,f3 limit 7 ; +-------+------------+-------+------------+-------+ | f1 | hex(f1) | f2 | hex(f2) | f3 | +-------+------------+-------+------------+-------+ | | 0000000000 | | 0000000000 | five | | | 00000000 | | 00000000 | four | | | | | | none | | NULL | NULL | NULL | NULL | null | | | 00 | | 00 | one | | | 000000 | | 000000 | three | | | 0000 | | 0000 | two | +-------+------------+-------+------------+-------+ 7 rows in set (58.41 sec) > Vlad's suggestion for a reimplementation of multi-segment > keys with suffix compression solves the problem and lets > us work like MyISAM. In spite of that, it seems like a good > idea. I think it does. But we have to continue to distinguish between a VARBINARY and a VARCHAR or CHAR, as we do now with MySQLCollation::isBinary, so that we pack the right number of zeros.