List:General Discussion« Previous MessageNext Message »
From:Benjamin Pflugmann Date:April 6 1999 6:02pm
Subject:Re: mysql: analyzing debug output
View as plain text  
Hi!

I saw no reply to the bug report cited below and now it is on the todo
list. I did not regard the behaviour as bug, but as reasonable and am
interested to see, how the problem should be solved precisely.

The bug report was (somewhat shortened):

------------------------------snip------------------------------
Date: 08-Mar-1999 15:29:36 
From: rufus@@uos.de
Subject: mysql: 'select distinct' bug with 'order by' 

 >Description:
         select distinct a from foo order by b;
         returns duplicate rows, if there are different pairs of (a,b)
 >How-To-Repeat:
         create table foo (a int,b int);
         insert foo values (1,1);
         insert foo values (1,2);
         insert foo values (1,3);
         insert foo values (1,3);
         insert foo values (1,3);
         select a from foo; # returns five rows, ok
         select distinct a from foo; # returns one row, ok
         select distinct a from foo order by a; # returns one row, ok
         select distinct a from foo order by b; # returns three rows!!!
 >Fix:
         no fix known.
------------------------------snip------------------------------

Now: Imagine the following INSERTs (additionally to the ones listed
above):

        insert foo values (2,3);
        insert foo values (2,3);
        insert foo values (2,2);
        insert foo values (2,1);

What should be the output?

        select distinct a from foo order by b;

The author of the bug report apparantly expected something like:
		a
		1
		2

But should it be 1,2 or 2,1? Is this defined in SQL92?  Which 'b's
have to be taken to order this? Before making it distinct (but after
ordering) it is something like (hm, has the ordering to keep the order
of equal items? AFAIK, MySQL does, but does SQL92 require this?):
		a	b
		1	1
		2	1
		1	2
		2	2
		1	3
		1	3
		1	3
		2	3
		2	3

Ok, if we make them distinct now it is easily possible to create
either 1,2 or 2,1 as result based on the algorithm you use... (it's
the same problem you have with GROUP BY, if you SELECT a value (not
SUM or so) of a column which not listed in GROUP; MySQL takes the
first or a random one on this?)

Is it specified how to do the distinct on this? If not, I would say,
it is more reasonable to not allow to do an ORDER BY on a value, which
is not in the SELECT (i.e. in the DISTINCT). This means, you would
have to do

	select distinct a,b from foo order by b;

The result would be something like
		a	b
		1	1
		2	1
		1	2
		2	2
		1	3
		2	3

If you only look at 'a' (since you don't want the b value), it is
exactly, what MySQL returns now for 'select distinct a from foo order
by b', so I would call it expected behaviour :-)

I would either keep the behaviour it has now or forbid ordering on
not-selected columns. Except if it is specified in SQL92 that
something else should happen.

Just my 2 pence :)

Bye,

	Benjamin.

PS: Are the SQL specs available on the net somewhere? I would
appreciate any hint.
  
-- 
Benjamin Pflugmann aka Philemon                       philemon@stripped
Programming, Guestbook support                voice: +49 941 94 65 939
SPiN GmbH             http://www.spin.de        fax: +49 941 94 65 938
======= web design - java chats - guestbooks - java/cgi coding =======
Thread
Re: mysql: analyzing debug outputBenjamin Pflugmann7 Apr