From: Jorgen Loland Date: February 1 2011 3:40pm Subject: Re: bzr commit into mysql-trunk branch (jorgen.loland:3527) Bug#59415 List-Archive: http://lists.mysql.com/commits/130144 Message-Id: <4D482971.8020809@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Hi Guilhem, On 01/31/2011 09:23 PM, Guilhem Bichot wrote: > Hello Jorgen, > > Jorgen Loland a écrit, Le 18.01.2011 10:07: >> #At file:///export/home/jl208045/mysql/mysql-trunk-59415/ based on >> revid:alexander.barkov@stripped >> >> 3527 Jorgen Loland 2011-01-18 >> BUG#59415: Range analysis should not be done many times for the >> same index >> The optimizer often choses to access a table through an index that >> does not provide the correct ordering for free. To remedy this, the >> function test_if_skip_sort_order() is called to see if another index >> is as good as the chosen index and at the same time is able to provide >> ordering. >> This implies that test_if_skip_sort_ordering() goes through a full >> range analysis (if range access is applicable) to check whether or not >> another range access plan should be used instead of the currently >> chosen ref/range access method. >> The problem is that if range analysis is performed and it is decided >> that it is not better than whatever we had, the range analysis will >> most likely be performed again and again with the same outcome because >> test_if_skip_sort_order() is called from multiple locations. >> This patch avoids the unnecessarily repeated range analysis >> described above by introducing key_map JOIN_TAB::quick_order_tested >> which is checked to see >> if range analysis has already been performed for a given key. > > Good that this is being fixed. It will reduce the size of certain > optimizer traces :-) And hopefully reduce time spent in the optimizer for some queries as well... :-) >> === modified file 'sql/sql_select.h' >> --- a/sql/sql_select.h 2011-01-14 08:20:08 +0000 >> +++ b/sql/sql_select.h 2011-01-18 09:07:10 +0000 >> @@ -279,6 +279,15 @@ typedef struct st_join_table : public Sq >> key_map checked_keys; /**< Keys checked */ >> key_map needed_reg; >> key_map keys; /**< all keys with can be used */ >> + /** >> + Used to avoid repeated range analysis for the same key in >> + test_if_skip_sort_order(). This would otherwise happen if the best >> + range access plan found for a key is turned down. >> + quick_order_tested is cleared every time the select condition for >> + this jointab changes since a new condition may give another plan > > jointab -> JOIN_TAB ? > >> + and cost from range analysis. >> + */ >> + key_map quick_order_tested; > > 1) do you have an idea why we repeat calls to test_if_skip_sort_order()? To some extent. It is typically first called to check if ORDER BY can be resolved by using an index. Later, it may be called for various reasons. Consider this query: CREATE TABLE t1 ( i1 int, i2 int, c char(1), KEY k1 (i1), KEY k2 (i1, i2) ); INSERT INTO t1 VALUES (0,1,'2'),(3,2,'1'); EXPLAIN SELECT DISTINCT i1 FROM t1 WHERE i1 >= '1' ORDER BY i1 DESC; In this query we reduce the number of times range analysis is called inside test_if_skip... from 4 to one: 1) Test if index can be used to resolve ORDER BY (=> range is not useful) 2) Distinct is transformed into group by: can index be used to resolve this group by? (=> range is not useful) 3) After adding conditions for a const reference to the table (=> range is not useful) [1] 4) Right before printing explain output to check if ordering was optimized away. (=> range is not useful) 5) (When EXPLAIN is removed from query above) In create_sort_index(). Unsure why we do this :-/ (=> range is not useful) [1] Here, around line 2430, in JOIN::optimize() if (!(select_options & SELECT_BIG_RESULT) && ((group_list && (!simple_group || !test_if_skip_sort_order(&join_tab[const_tables], group_list, unit->select_limit_cnt, 0, &join_tab[const_tables].table-> keys_in_use_for_group_by))) || > > 2) are we sure that the repeated calls to test_if_skip_sort_order() > operate on the same input data? if not, there is no guarantee that the > result of the Nth call is like the result of the first call, and so it's > not possible to skip the Nth call; The arguments used by test_if_skip_sort_order() to call test_quick_select() do not change between the calls (the index to analyze may in theory be different between calls to test_if_skip..., but in that case we don't skip range analysis). Btw, I haven't seen any examples where the index to analyze has changed. Range analysis bases the cost on info from the handler: head->file->stats.* (static data AFAICT), and read-cost estimates from head->file->scan_time() etc. These are also based on static data from stats. Apart from that, a bunch of constants are used (e.g. TIME_FOR_COMPARE,...). Finally, the conditions attached to the table are vital. But the code is written to handle that. All in all this is taken care of AFAICT. It doesn't come with a mathematical proof or anything, but at least I can't find weak spots. > 3) JOIN_TAB::select_cond is a public member; so it can be changed > directly without calling set_select_cond(); so if someone accidentally > does that, quick_order_tested won't be cleared and there will be a bug. > Grepping for "select_cond.*=" this doesn't seem to be the case now, but > to protect the future, maybe we should make select_cond private, and > introduce a getter for it? Good point. New patch coming up. -- Jørgen Løland | Senior Software Engineer | +47 73842138 Oracle MySQL Trondheim, Norway