Well, if nothing can ever equal null, then why isn't MySQL query parser
smart enough to reduce my queries to something more sensible? If I'm
saying this:
SELECT *
FROM sometable
WHERE somecolumn = NULL OR somecolumn = 'abc';
Why isn't it able to reduce the query to something more like this:
SELECT *
FROM sometable
WHERE somecolumn = 'abc';
Since it already should know that somecolumn = NULL will always evaluate
to FALSE (or is it NULL? ... either way, it's not "TRUE")? If I run the
first query above, the query takes about 15 seconds to run against 40
million records, but if I run the second query, it takes about .050
seconds. The test for NULL seems to cause the query to skip use of an
index because I doubt NULL values are indexed.
Am I expecting too much of the parser?
-- Dante
Martijn Tonies wrote:
> Hi,
>
>> Will anything ever be equal to NULL in a SELECT query?
>
> No, never.
>
> Null also means "unknown", if you design your tables well enough,
> there should be no NULLs -stored- (different from a resultset,
> where there can be nulls, for example in LEFT JOINs), because it's no
> use to store what you don't know. The only case when you want to store a
> null is when you do want to -know- you don't know a value.
>
> A column can have two states: null or not null. It either has
> data (a value, depending on the datatype), or no data (null),
> which is where <column> IS NULL (has no data) or
> <column> IS NOT NULL (has data) comes into play.
>
> Null is not the same as empty. An empty string, for example, is not
> equal to null (which is "unknown"), you cannot compare
> anything to what you don't know, which is why your comparison
> fails.
>
> NULL = NULL fails, so does NULL <> NULL in the strict
> sense.
>
>
>> SELECT *
>> FROM sometable
>> WHERE somecolumn = NULL;
>>
>> I have a real-life query like this:
>>
>> SELECT *
>> FROM sometable
>> WHERE somecolumn = NULL OR somecolumn = 'abc';
>>
>> The 'sometable' contains about 40 million records and in this query,
>> it appears that the where clause is doing a sequential scan of the
>> table to find a condition where 'somecolumn' = NULL. Shouldn't the
>> query parser be smart enough to rewrite the above query like this:
>
> You mean the "optimizer", perhaps it should, perhaps it shouldn't. Yet,
> your query is not really the smartest, as you should avoid writing
> <column> = NULL, as this doesn't make sense.
>
>> SELECT *
>> FROM sometable
>> WHERE FALSE OR somecolumn = 'abc';
>>
>> And therefor use the index I have on 'somecolumn'? When I manually
>> rewrite the query, I get the performance I expect but when I leave it
>> as it was, it's 100 times slower.
>>
>> What's so special about NULL?
>
> Quite a bit ;-)
>
> With regards,
>
> Martijn Tonies
> Upscene Productions
> http://www.upscene.com
>
> Download Database Workbench for Oracle, MS SQL Server, Sybase SQL
> Anywhere, MySQL, InterBase, NexusDB and Firebird!
>
> Database questions? Check the forum:
> http://www.databasedevelopmentforum.com
>
--
----------
D. Dante Lorenso