List:Internals« Previous MessageNext Message »
From:Paul Cadach Date:October 13 2000 11:10am
Subject:Re: Does someone forget record_count?
View as plain text  
Hi,

-----Original Message-----
From: Michael Widenius <monty@stripped>
To: Paul Cadach <paul@stripped>
Date: Friday, October 13, 2000 3:31 PM
Subject: Does someone forget record_count?


>>>>>> "Paul" == Paul Cadach <paul@stripped> writes:
>
>Paul> Hi,
>Paul> It's looks like someone forget to multiply s->found_records on
> record_count
>Paul> at line 1739 (or around) of sql_select.cc:
>Paul>         if (s->on_expr)
>Paul>         {
>Paul>           tmp=record_count*min(s->found_records,s->worst_seeks);       
> // Can't use read cache
>Paul>         }
>
>tmp should at the above place be the number of seeks you have to do
>find the rows for this level.
>
>s->found_rows should either be the number of rows found by an index
>which we compare with a constants or the number of rows in the table.
>
>Why do you think it's wrong?  Can you give us an example where tmp
>gets an unreasonable value ?

If it's not wrong so a bug are hidden in SELECT::test_quick_select() because its returns
invalid rows numbers.

Also, everywhere before this line tmp is multiplied to record_count when it lets some
number of rows: 
              if (table->used_keys & ((key_map) 1 << key))
              {
                /* we can use only index tree */
                uint keys_per_block= table->file->block_size/2/
                  keyinfo->key_length+1;
                tmp=record_count*(tmp+keys_per_block-1)/keys_per_block;
              }
              else
                tmp=record_count*min(tmp,s->worst_seeks);


Back to the story, the situation is next.

-------- optimized query --------
mysql> explain select s.stncode,a.created from station s left join alarms a
    -> on a.stncode=s.stncode and a.strattr='T' where (s.stncode is null or
    -> s.stncode is not null) and (a.stncode is null or a.stncode is not null)
    -> group by s.stncode;
+-------+-------+---------------+---------+---------+-----------------+------+-------------------------+
| table | type  | possible_keys | key     | key_len | ref             | rows | Extra      
            |
+-------+-------+---------------+---------+---------+-----------------+------+-------------------------+
| s     | index | PRIMARY       | PRIMARY |      17 | NULL            |    9 | where used;
Using index |
| a     | ref   | i3,i1         | i3      |      11 | const,s.stncode |   45 | where used 
            |
+-------+-------+---------------+---------+---------+-----------------+------+-------------------------+
2 rows in set (0.33 sec)
-------- optimized query --------

-------- not optimized query --------
mysql> explain select s.stncode,a.created from station s left join alarms a
    -> on a.stncode=s.stncode and a.strattr='T' group by s.stncode;
+-------+------+---------------+------+---------+------+------+-----------------+
| table | type | possible_keys | key  | key_len | ref  | rows | Extra           |
+-------+------+---------------+------+---------+------+------+-----------------+
| s     | ALL  | NULL          | NULL |    NULL | NULL |    9 | Using temporary |
| a     | ALL  | i3,i1         | NULL |    NULL | NULL |    2 |                 |
+-------+------+---------------+------+---------+------+------+-----------------+
2 rows in set (0.00 sec)
-------- not optimized query --------

Second query processes slowly while relation between tables s and a still
exist. Playing with this type of queries points to me that sometime adding
"dummy" condition (like "a.stncode is null or a.stncode is not null") to
where clause improves results, but not every time.

As you can see, in last (non-optimized query) MySQL detects invalid quick rows count. When
it tries to use key i3 (which must be used because 9 rows only (grouped into 5 diffirent
rows) will be returned by this query from about 33000 rows in total at alarms table), it 
at next place have values:
        if (best == DBL_MAX ||
            (tmp  + record_count/(double) TIME_FOR_COMPARE*s->found_records <
             best + record_count/(double) TIME_FOR_COMPARE*records))
record_count = 9 (given as argument to find_best() from query estimation on first table)
best = 18 (sets as record_count * 2, sorry, I don't remember where its gets '2' value)
tmp = 2 (sets from s->found_records)

So, previous condition will be true, and reset key usage by setting best_key to 0.

Because alarms table is too big (about 33000 rows) it will generate huge temporary table
(about 9 * 33000 records) using full join instead of fast key lookups to find only 5
records in alarms table and process about 45 (9 * 5) records.


Multiplying s->found_records to record_count allows to be tmp in the same range as
other tmps calculated when left join is not used.

>PS: The right place to post questions like the above is
>    internals@stripped ;  All MySQL developers is reading this
>    list so you will get much more response by posting to this list
>    instead of directly to me.

I tried to write to developers@stripped but not get any solution.


BTW, how to subscribe internals' list and does MySQL have some description of its
internals?



WBR,
Paul.
---
Paul Cadach, Leading Software Engineer
IT Department
East-Kazakhstan Regional Board of Telecommunications


Thread
Re: Does someone forget record_count?Paul Cadach13 Oct
  • Re: Does someone forget record_count?Michael Widenius14 Oct