From: Martin Ramsch
Date: March 17 1999 10:32am
Subject: Re: DISTINCT weirdness..
List-Archive: http://lists.mysql.com/mysql/396
Message-Id: <19990317113245.A22281@forwiss.uni-passau.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8bit

Wow! Monty, again and again I'm quite impressed by how fast you're
answering ... I think the great support from you and the mailing list
is a very big advantage of MySQL that most other products lack.

On Mo, 1999-03-15 20:46:33 +0200, Michael Widenius wrote:
> Martin> So if sorting treats 'a' and 'ä' (a umlaut) as the same,
> Martin> then also 'Bar' and 'Bär' are seen as duplicates.
>
> The above isn't completely true: 'Bar' and 'Bär' are treated as
> distinct values, while 'BAR' and 'bar' isn't distinct

Sorry, this was written with the configure option
'--with-charset=german1' in mind (which is how I compiled our MySQL
server). I should have noted this (or put more emphasis on the _if_ :-).

> In MySQL 3.23 you will be able to do:
>
> SELECT DISTINCT BINARY text_column FROM TABLE;
>
> (The BINARY attribute casts the text_column to a binary column, that
> is sorted / compared according to the ASCII values for the individual
> characters)

Very good, that helps a lot in dealing more comfortably with the
"unwanted duplicates" problem!

> Martin> I'm not quite sure, if this behaviour is a bug or an intended (but --
> Martin> in my opinion -- mis-designed) feature.
[...]
> I appreciate any ideas how to do it better. The problem is to make
> everything 'hold' together. I think that the DISTINCT should compare
> the strings the same way as the normal compare operations, and this
> put some restrictions how things can be solved.

Maybe it's overkill for the typical applications MySQL is used for,
but you could take the SQL3 standard as a guideline, which defines
collating sequences and their usage (BTW, my source of information
is ).
But for the time being I think it's great if one can just add the
keyword "BINARY" to an expression to make it use the "binary" ordering.
What is then still missing is a way to put binary columns into the
current sorting order mode, e.g. new keywords "NATIONAL CHARACTER" or so.

Regards,
  Martin
--
Martin Ramsch PGP KeyID=0xE8EF4F75 FiPr=52 44 5E F3 B0 B1 38 26 E4 EC 80 58 7B 31 3A D7
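
As a rough illustration of the "unwanted duplicates" effect discussed in
this message, here is a minimal Python sketch (not MySQL code). It
removes duplicates under three different comparison rules. The helper
names distinct() and fold_german1() are made up for this example, and
the umlaut folding is only an assumption about how a german1-style sort
order might treat 'a' and 'ä'; it is not taken from the MySQL sources.

```python
def fold_german1(s):
    # Hypothetical folding: ignore case and treat umlauts like their
    # base letters (an assumed stand-in for a german1-like sort order).
    table = str.maketrans({"ä": "a", "ö": "o", "ü": "u"})
    return s.lower().translate(table)


def distinct(values, key):
    # Keep the first value seen for each comparison key, in input order.
    seen = set()
    result = []
    for value in values:
        k = key(value)
        if k not in seen:
            seen.add(k)
            result.append(value)
    return result


words = ["Bar", "Bär", "BAR", "bar"]

# Case-insensitive comparison: 'BAR' and 'bar' collapse into 'Bar'.
case_insensitive = distinct(words, str.lower)

# Assumed german1-like comparison: 'Bär' collapses into 'Bar' as well.
german_like = distinct(words, fold_german1)

# Byte-wise comparison, in the spirit of DISTINCT BINARY: nothing collapses.
binary = distinct(words, lambda s: s.encode("latin-1"))

print(case_insensitive)
print(german_like)
print(binary)
```

The point of the sketch is only that the number of "distinct" values
depends on the comparison rule in effect, which is why a per-expression
switch such as BINARY is useful.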