From: Paul Cadach Date: October 13 2000 11:10am Subject: Re: Does someone forget record_count? List-Archive: http://lists.mysql.com/internals/30 Message-Id: <001b01c03506$2720a2a0$0900a8c0@paul.asu.odt.pvt> MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_NextPart_000_0018_01C03538.6EA7B820" ------=_NextPart_000_0018_01C03538.6EA7B820 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Hi, -----Original Message----- From: Michael Widenius To: Paul Cadach Date: Friday, October 13, 2000 3:31 PM Subject: Does someone forget record_count? >>>>>> "Paul" =3D=3D Paul Cadach writes: > >Paul> Hi, >Paul> It's looks like someone forget to multiply s->found_records on = record_count >Paul> at line 1739 (or around) of sql_select.cc: >Paul> if (s->on_expr) >Paul> { >Paul> = tmp=3Drecord_count*min(s->found_records,s->worst_seeks); // Can't = use read cache >Paul> } > >tmp should at the above place be the number of seeks you have to do >find the rows for this level. > >s->found_rows should either be the number of rows found by an index >which we compare with a constants or the number of rows in the table. > >Why do you think it's wrong? Can you give us an example where tmp >gets an unreasonable value ? If it's not wrong so a bug are hidden in SELECT::test_quick_select() = because its returns invalid rows numbers. Also, everywhere before this line tmp is multiplied to record_count when = it lets some number of rows:=20 if (table->used_keys & ((key_map) 1 << key)) { /* we can use only index tree */ uint keys_per_block=3D table->file->block_size/2/ keyinfo->key_length+1; = tmp=3Drecord_count*(tmp+keys_per_block-1)/keys_per_block; } else tmp=3Drecord_count*min(tmp,s->worst_seeks); Back to the story, the situation is next. -------- optimized query -------- mysql> explain select s.stncode,a.created from station s left join = alarms a -> on a.stncode=3Ds.stncode and a.strattr=3D'T' where (s.stncode is = null or -> s.stncode is not null) and (a.stncode is null or a.stncode is not = null) -> group by s.stncode; +-------+-------+---------------+---------+---------+-----------------+--= ----+-------------------------+ | table | type | possible_keys | key | key_len | ref | = rows | Extra | +-------+-------+---------------+---------+---------+-----------------+--= ----+-------------------------+ | s | index | PRIMARY | PRIMARY | 17 | NULL | = 9 | where used; Using index | | a | ref | i3,i1 | i3 | 11 | const,s.stncode | = 45 | where used | +-------+-------+---------------+---------+---------+-----------------+--= ----+-------------------------+ 2 rows in set (0.33 sec) -------- optimized query -------- -------- not optimized query -------- mysql> explain select s.stncode,a.created from station s left join = alarms a -> on a.stncode=3Ds.stncode and a.strattr=3D'T' group by s.stncode; +-------+------+---------------+------+---------+------+------+----------= -------+ | table | type | possible_keys | key | key_len | ref | rows | Extra = | +-------+------+---------------+------+---------+------+------+----------= -------+ | s | ALL | NULL | NULL | NULL | NULL | 9 | Using = temporary | | a | ALL | i3,i1 | NULL | NULL | NULL | 2 | = | +-------+------+---------------+------+---------+------+------+----------= -------+ 2 rows in set (0.00 sec) -------- not optimized query -------- Second query processes slowly while relation between tables s and a = still exist. Playing with this type of queries points to me that sometime = adding "dummy" condition (like "a.stncode is null or a.stncode is not null") to where clause improves results, but not every time. As you can see, in last (non-optimized query) MySQL detects invalid = quick rows count. When it tries to use key i3 (which must be used = because 9 rows only (grouped into 5 diffirent rows) will be returned by = this query from about 33000 rows in total at alarms table), it=20 at next place have values: if (best =3D=3D DBL_MAX || (tmp + record_count/(double) = TIME_FOR_COMPARE*s->found_records < best + record_count/(double) TIME_FOR_COMPARE*records)) record_count =3D 9 (given as argument to find_best() from query = estimation on first table) best =3D 18 (sets as record_count * 2, sorry, I don't remember where its = gets '2' value) tmp =3D 2 (sets from s->found_records) So, previous condition will be true, and reset key usage by setting = best_key to 0. Because alarms table is too big (about 33000 rows) it will generate huge = temporary table (about 9 * 33000 records) using full join instead of = fast key lookups to find only 5 records in alarms table and process = about 45 (9 * 5) records. Multiplying s->found_records to record_count allows to be tmp in the = same range as other tmps calculated when left join is not used. >PS: The right place to post questions like the above is > internals@stripped ; All MySQL developers is reading this > list so you will get much more response by posting to this list > instead of directly to me. I tried to write to developers@stripped but not get any solution. BTW, how to subscribe internals' list and does MySQL have some = description of its internals? WBR, Paul. --- Paul Cadach, Leading Software Engineer IT Department East-Kazakhstan Regional Board of Telecommunications ------=_NextPart_000_0018_01C03538.6EA7B820--