>>>>> "Benjamin" == Benjamin Pflugmann <philemon@stripped> writes:
Benjamin> Let me start with a little clarification on my motivation:
Benjamin> The problem from my point of view is less that I could 'miss' a NULL
Benjamin> where I test for it, but rather that the results can be inconsistent
Benjamin> depending on ordering of my expression. I would prefer to not have
Benjamin> NULL results in boolean expressions instead of having - even a small -
Benjamin> possibility for inconsitency.
Note that if you read things as 'true' and 'not true', then the
current version will work perfectly.
Benjamin> Such things are the starts for hours of debugging (Why does a program
Benjamin> sometimes crash and sometimes not, using the same result set and
Benjamin> obviously the same logical condition and such stuff).
I can't really think of any real world query that would crash because
of this. It's not normal to have a AND/OR condition that is tested
only for NULL. As most SQL databases doesn't even support this, the
query will be quite MySQL specific anyway and the current behaveour is
now cleary documented.
Benjamin> On Fri, Feb 04, 2000 at 03:34:17PM +0200, monty@stripped wrote:
Benjamin> The statement that the result of NULL AND 0 is UNKNOWN simply isn't
Benjamin> true, since it is 0.
Benjamin> Therefore I would suggest to stop parsing only after you know the
Benjamin> expression is going to be 0, but not in case of NULL.
>> (What do you mean with 'not in the case of NULL' ?; Please read forward
>> before answering this).
Benjamin> Sorry, but I don't understand you question - actually I do in some
Benjamin> kind, but I didn't not find what you refer forward to, so I assume, I
Benjamin> did not understand.
>> Note that doing this generally would give a huge performance problem
>> In almost all cases a user only cares if a condition is true or not.
Benjamin> Ok. Meanwhile I understand this objection better. It might look as
Benjamin> there are few cases for performance problems, but they might catch you
Benjamin> at unexpected places.
Benjamin> I see this as a big problem: If I construct the WHERE statement in a
Benjamin> program, I can get different results on the ordering of my WHERE
Benjamin> clause, which should not be.
Benjamin> Well, I said WHERE statement, but when I thought about results, I
Benjamin> actually meant expressions in SELECT statements. I did not foresee
Benjamin> that this would matter.
Can you give an example of a expression like this? My main point here
is that I am trying to avoid that everything gets slower just because
we are trying to avoid something that no one will ever use.
Note that even if this gets fixed, then we will instead get the problem that
depending on the order of the AND/OR clause things will get MUCH slower.
Benjamin> I agree, with WHERE clauses it seldom counts whether you have NULL or
Benjamin> 0. Or in other words: With the end result of a WHERE clause, this
Benjamin> difference does not matter. It might (as you show below) matter within
Benjamin> other expressions.
>> No, you shouldn't get this (if you don't agree, please provide a
>> contradicting example);
Benjamin> Well, it would have been the ISNULL example below.
Benjamin> But my actual example is such an expressions in a SELECT statement:
Benjamin> SELECT (some flag AND some possible-null-flag) FROM table;
Benjamin> I don't have the first mail at hand, and don't know for sure, whether
Benjamin> this optimation is done with every expression (I would expect that
Benjamin> there is only one functionality for expressions). If not, my main
Benjamin> objection is void.
Currently it's done for every expression.
Benjamin> In the above case, the performance difference can be measured, but it
Benjamin> should not be as bad as it would be for a WHERE clause.
For a normal query, this is true. For a query involving sub selects
its not true. The problem is still to add some general code to quickly
find out which AND/OR one can optimize and which ones one can't.
I have put this on my todo and will look at it at some convenient time.
For example in the case of:
SELECT 0>1 and (SELECT ....)
Assuming that the sub select can return NUUL, the SQL server has to do
a LOT of extra work and the user can't easily specify if he only
want's to know the true/not true value or also the true/not true/null
>> 0 is the same as false in MySQL and there isn't any use to continue
>> with parsing AND when you already have found FALSE as you know the
>> answer can't be true (and in a WHERE you only care about if the
>> answer is true or not).
Benjamin> You meant NULL, not 0, in the above sentence, didn't you? If not, I
Benjamin> don't understand what you want to say and how this relates to NULL
Benjamin> values. Else, I agree, of course.
No, I meant 0.
mysql> select 0>1;
| 0>1 |
| 0 |
mysql> select 0>1 and 1>2 and 2>3 and 5>5;
| 0>1 and 1>2 and 2>3 and 5>5 |
| 0 |
1 row in set (0.00 sec)
mysql> select 0>1 and 1>2 and 2>3 and 5>5 and NULL;
| 0>1 and 1>2 and 2>3 and 5>5 and NULL |
| 0 |
1 row in set (0.01 sec)
The problem is the last case; To be 100 % correct the answer should be NULL.
>> Note that the speed impact will come with ALL queries that uses AND ;
>> In many cases the SQL engine would even have to pull out unecessary rows from
>> reference tables just to be able to calculate if the value is 0 or
>> null when all the user care for is if it's true or not
>> The way to fix this is of course to add some extra logic into the
>> WHERE handling to let MySQL notice the cases where we don't actually
>> care for the exact value of the AND/OR operation, but this isn't
Benjamin> Hm. Since only only ISNULL,IFNULL and some brothers make use of such
Benjamin> three-state logic, one would have to do full evaluation on 'sub
Benjamin> expressions' of these (I mean where the result of a expression is used
Benjamin> with these) in WHERE clauses, and everywhere else (not within WHERE:
Benjamin> INSERT INTO / SELECT expr / UPDATE / variable assignment)...
Don't forget sub selects like my earlier example. One could also add
detection of context where the user only needs true/not true handling
to speed up things; The problem is that the SQL language is not rich
enough to be able to optimize all cases :(
Benjamin> Well, if you can optimize them away in WHERE clauses, the main speed
Benjamin> problem should be solved. (I know, if a variable is assigned in a
Benjamin> WHERE clause, it is a problem again).
Yes, it can be solved for most cases until we introduce sub selects
and then it goes much worse again.