From: Jorgen Loland Date: October 28 2010 12:49pm Subject: Re: bzr commit into mysql-next-mr-bugfixing branch (Olav.Sandstaa:3264) Bug#52660 List-Archive: http://lists.mysql.com/commits/122198 Message-Id: <4CC97173.8070706@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Hi Olav, Apart from Tor's request for documentation, the solution looks good. However, I think this bug can be solved by checking the range after all. The question is whether or not we want to invest time in fixing it since this is a fix for a non-prioritized SE. I have an outline for a solution: In mi_rkey(), inside the while() that checks the index condition, the newly found index record is stored in info->lastkey. Further, the end_range is stored in the ha_myisam object. You can get it from here ((ha_myisam*)info->index_cond_func_arg)->end_range->key These two keys can be fed to my_handler's ha_key_cmp(), something like this: if (((ha_myisam*)info->index_cond_func_arg)->end_range) { int cmp_res= 0; uchar *end= ((ha_myisam*)info->index_cond_func_arg)->end_range->key; cmp_res= ha_key_cmp(keyinfo->seg, key_buff, end, use_key_length, SEARCH_FIND, not_used); if (cmp_res > 0) { res= break; } The problem is that mi_rkey.c does not include ha_myisam.h, so we'll need to connect these somehow. This can, e.g., be done by having a function pointer (as done with index_cond_func) or the end_range key in the info object. On 10/19/2010 02:55 PM, Olav.Sandstaa@stripped wrote: > #At file:///home/olav/mysql/develop2/opt-bug52660/ based on revid:roy.lyseng@stripped > > 3264 Olav.Sandstaa@stripped 2010-10-19 > Fix for Bug#52660 Perf. regr. using ICP for MyISAM on range queries on an index containing TEXT > > When an index condition is pushed down to MyISAM the content of (most > of) the index entry will be read into the record buffer. This will be > used for evaluating the index condition and the range check in case of > a range scan. When reading the content of the index entry all fields > will be read with the exception of BLOB fields. This is not a problem > for evaluating the index condition itself since the server will not > push down an index condition that contains a BLOB field. But it is a > problem for the range evaluation that is performed as part of the ICP > implementation in MyISAM since this might be done using a "non-read" > BLOB value. > > A simplified version of the query given in the bug report: > > SELECT c1 FROM t3 WHERE c1>= 'c-1004=w' and c1<= 'c-1006=w' and i1> 2; > > it will be done as a range scan and following happens : > > 1. MyISAM locates the first record satisfying the lower range and this > record will be read into the record buffer. So after this the > content of the record buffer for the BLOB field will have the value: > > "c-1004=w" > > 2. When the server requests the next record in the range MyISAM will > due to ICP being used read in the index entry (but not including the > BLOB field), so when the end of range check is done it will evaluate > the upper range value (c1<= 'c-1006=w') against the value of the > first record read (ie. it will evaluate to true) and since the ICP > condition (i1> 1) evaluates to false it will skip this record and go > to the next record. This creates a loop that instead of stopping at > the upper range value will loop until it has read the complete index > (or until the ICP condition evaluates to true). > > So in the example in the bug report about 10 million index entries > will be read and evaluated instead of just 3 or 4. > > The fix for this is to let MyISAM reject pushed index conditions > whenever the index used for ICP contains a BLOB field. > > Note that this change disables ICP from being used by MyISAM in some > cases where it actually was safe to use it. The pushed index condition > will be reject in all cases where the index contains a BLOB field also > when the BLOB field is not used in any range evaluation. The reason for > this is at the time the server pushes the index condition MyISAM does > not know whether the actually data access will be a range scan with an > upper range value or not. > > The patch contains a simplified version of the original test > case. This test case only contains 100 records which is too little to > show any performance regression. The important point with the test > case is that without the fix the explain will show "Using index > condition" while with the fix it will say "Using where". -- Jørgen Løland | Senior Software Engineer | +47 73842138 Oracle MySQL Trondheim, Norway