From: Zardosht Kasheff Date: June 23 2009 6:10pm Subject: help with index_merge and clustering keys List-Archive: http://lists.mysql.com/internals/36977 Message-Id: <2f9663ba0906231110w28d03749ie6f34bfdd472d94a@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Hello, Our storage engine, TokuDB, allows multiple keys to be clustering (include all columns in the index). For context, an short explanation of the feature is here: http://blogs.tokutek.com/tokuview/introducing_multiple_clustering_indexes To get the optimizer to properly know when to use a clustering key, we reimplemented handler::read_time to return a better estimate in the case of clustering keys (similar to what InnoDB does in ha_innobase::read_time for the primary key). This helps the optimizer in most cases. The problem I have now run into is getting index_merge to work properly. Because MySQL does not know that an arbitrary key can be clustering, it will proceed to use it like a normal key, and this can lead to a sub-optimal query plan. Suppose we have the following query: select * from foo where a=1 and b=1; If we have a normal secondary key on "a" and a key on "b", then MySQL gets the values from the keys, does an intersect, and then uses handler::rnd_pos to retrieve the entire row. This is the right thing to do here. Unfortunately, at the moment, if we have clustering keys on "a" and "b", it does the same thing. This is not the ideal behavior. We want the optimizer to select either the key on "a" or "b" (suppose we select "a"), retrieve all of the rows where "a = 1", and from this result set, filter out the rows where "b != 1". This is what the optimizer would do if "a" were the primary key. Does anyone have any idea how one could get this to properly? Thanks -Zardosht