List:General Discussion« Previous MessageNext Message »
From:Rick James Date:August 27 2012 5:27pm
Subject:RE: MySQL, UTF8 and collations
View as plain text  
There is a section on German Sharp-s in http://mysql.rjweb.org/doc.php/charcoll
I agree with "dirty".
As I understand it, about 5.1.24, they said "Oops, sharp-s is collated wrong, let's fix it".  The fix broke things, but they stuck by the "correct" sorting.  Eventually they said "Oh, let's add another, compatible, collation".

They were caught between a rock and a hard place.

> -----Original Message-----
> From: Johan De Meersman [mailto:vegivamp@stripped]
> Sent: Monday, August 27, 2012 7:43 AM
> To: mysql
> Subject: MySQL, UTF8 and collations
> 
> Hey,
> 
> We're upgrading MySQLs from 5.0 to 5.5, and running into the
> predictable utf8_general_ci bu^Wwe-meant-to-do-that with german ß and
> similar characters. The server swallowed the existing datafiles (thank
> god for that) so we're up and running, but "check table" does whine
> about needing a full rebuild, which is clearly that issue.
> 
> Now, I'm aware that this should be fixed by using the utf8_unicode_ci
> collation; but somehow it isn't...
> 
> The its-not-a-bug-but-heres-the-fix collation utf8_general_mysql500_ci
> does work, but honestly I feel that that's kind of dirty.
> 
> Any ideas why utf8_unicode_ci doesn't work as expected? The new server
> version is 5.5.24-1~dotdeb.1-log.
> 
> Thx,
> Johan
> 
> 
> mysql> set names UTF8;
> Query OK, 0 rows affected (0.00 sec)
> 
> mysql> show create table search_index;
> +--------------+-------------------------------------------------------
> -----------------------------------------------------------------------
> -----------------------------------------------------------------------
> -----------------------------------------------------------------------
> -------------------------------------------------------------+
> | Table | Create Table |
> +--------------+-------------------------------------------------------
> -----------------------------------------------------------------------
> -----------------------------------------------------------------------
> -----------------------------------------------------------------------
> -------------------------------------------------------------+
> | search_index | CREATE TABLE `search_index` (
> `word` varchar(50) NOT NULL DEFAULT '',
> `sid` int(10) unsigned NOT NULL DEFAULT '0',
> `type` varchar(16) DEFAULT NULL,
> `score` float DEFAULT NULL,
> UNIQUE KEY `word_sid_type` (`word`,`sid`,`type`),
> KEY `sid_type` (`sid`,`type`),
> KEY `word` (`word`)
> ) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
> +--------------+-------------------------------------------------------
> -----------------------------------------------------------------------
> -----------------------------------------------------------------------
> -----------------------------------------------------------------------
> -------------------------------------------------------------+
> 1 row in set (0.00 sec)
> 
> mysql> show table status like "search_index";
> +--------------+--------+---------+------------+---------+-------------
> ---+-------------+-----------------+--------------+-----------+--------
> --------+---------------------+-------------+------------+-------------
> ----+----------+----------------+---------+
> | Name | Engine | Version | Row_format | Rows | Avg_row_length |
> Data_length | Max_data_length | Index_length | Data_free |
> Auto_increment | Create_time | Update_time | Check_time | Collation |
> Checksum | Create_options | Comment |
> +--------------+--------+---------+------------+---------+-------------
> ---+-------------+-----------------+--------------+-----------+--------
> --------+---------------------+-------------+------------+-------------
> ----+----------+----------------+---------+
> | search_index | InnoDB | 10 | Compact | 6893296 | 58 | 405536768 | 0 |
> 725172224 | 255852544 | NULL | 2012-08-27 14:53:20 | NULL | NULL |
> utf8_general_ci | NULL | | |
> +--------------+--------+---------+------------+---------+-------------
> ---+-------------+-----------------+--------------+-----------+--------
> --------+---------------------+-------------+------------+-------------
> ----+----------+----------------+---------+
> 1 row in set (0.07 sec)
> 
> mysql> CREATE TABLE `search_johan` ( `word` varchar(50) NOT NULL
> DEFAULT '', `count` float DEFAULT NULL, PRIMARY KEY (`word`) )
> ENGINE=InnoDB DEFAULT CHARSET=utf8 collate=utf8_unicode_ci;
> Query OK, 0 rows affected (0.00 sec)
> 
> mysql> insert into search_johan select * from search_total;
> ERROR 1062 (23000): Duplicate entry 'cœur' for key 'PRIMARY'
> 
> 
> 
> 
> --
> 
> Linux Bier Wanderung 2012, now also available in Belgium!
> August, 12 to 19, Diksmuide, Belgium - http://lbw2012.tuxera.be
Thread
MySQL, UTF8 and collationsJohan De Meersman27 Aug
  • RE: MySQL, UTF8 and collationsRick James27 Aug
    • Re: MySQL, UTF8 and collationsJohan De Meersman28 Aug
      • RE: MySQL, UTF8 and collationsRick James28 Aug
      • Re: MySQL, UTF8 and collationsShawn Green28 Aug