From: Sergei Golubchik Date: October 10 2012 7:07pm Subject: Re: purpose of KEY->rec_per_key List-Archive: http://lists.mysql.com/internals/38604 Message-Id: <20121010190755.GA18778@meddwl.fritz.box> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Hi, Zardosht! On Oct 10, Zardosht Kasheff wrote: > > In a the structure KEY, I see a field named rec_per_key that has the > following comment: > > /* > Array of AVG(#records with the same field value) for 1st ... Nth key part. > 0 means 'not known'. > For temporary heap tables this member is NULL. > */ > > Our storage engine does not ever fill these values. We set the values > to 0. We recently saw that not setting this for the primary key > resulted in a bad query plan. We can set it to 1 for unique keys and > the primary key. Are there other known bad side effects to not setting > this? Other than bad query plans? No. These values are used by the optimizer to prefer an index with the best selectivity. E.g, in SELECT ... WHERE a=5 AND b=6, if the optimizer has to choose between an index on 'a', with rec_per_key=2, and an index on 'b' with rec_per_key=20, it would prefer an index on 'a'. Because it would expect that a=5 will match only 2 rows, while b=6 will match 20 rows. This was very simplified explanation. Really optimizer would probably use records_in_range in that case. But if you have a join, SELECT ... t1, t2 WHERE t1.a=t2.b AND ... then optimizer would use rec_per_key. These rec_per_key values don't have to be exact, MyISAM updates them only on ANALYZE TABLE statement. But practically they change very slowly, even if the table is constantly updated, so it's ok to update them rarely. Regards, Sergei