Hi!
>>>>> "Eric" == Eric <ej@stripped> writes:
Eric> Sorry for my delayed response...have been communicating with the
Eric> customer. My situation has changed.
Eric> First let me acknowledge that creating the multi-part indexes did
Eric> speed up the queries I was running signifigantly. However, it did not
Eric> help MySQL get the join order correct. This is not a suprise to me,
Eric> as the only way MySQL could get this join right is to use metadata
Eric> based on the range distribution as you have described (and I have been
Eric> considering).
Good that things got better; I agree that having some more meta-data
would make MySQL even better for cases like yours.
Eric> The large change in my situation is that I've discovered that our
Eric> customer is indexing every column of the table individually, except
Eric> type. I believe I will suggest that they add the type column to their
Eric> index of the path column, but beyond that I don't believe there is any
Eric> further improvement I can make via indexes. There won't be a
Eric> signifigant speed improvement by combining any of their indexes into
Eric> multi-part indexes, correct?
It depends on the queries. If they are just doing one type of
queries, then there is probably not a big seed increase by combining
indexes.
Eric> So, the problem remains that even though I have every column indexed,
Eric> the estimates that MySQL does do not take into account the
Eric> distribution of values in those indexes and thus it does not optimize
Eric> the join order as I would like it to. Since I'm using the same keys
Eric> for each join, the joins end up in the order that I put them in the
Eric> SQL statement.
This shouldn't have to happen if you the WHERE part is different for
the involved tables. For example:
SELECT * from table as t1, table as t2 WHERE t1.id=t2.id and
t1.key=constant and t2.key=constant2;
In the above case MySQL can use the key distribution for 'key' to
decide which table it's better to first use in the join.
Eric> So is the correct thing to do still to modify OPTIMIZE and ANALYZE to
Eric> store distribution statistics and then modify get_quick_record_count()
Eric> to look at those for indexed (and non-indexed if it becomes necessary
Eric> / is easy) columns? Where are the OPTIMIZE and ANALYZE commands? I
Eric> would start this by simply storing it in memory after running OPTIMIZE
Eric> or ANALYZE and in the future I will add it to the MyISAM table format.
For MyISAM tables, you can find all necessary interace information in
the sql/ha_myisam.h and sql/ha_myisam.cc files.
I suggest you start by checking the code:
int ha_myisam::analyze(THD *thd, HA_CHECK_OPT* check_opt)
and the function myisam/mi_check.c
If you can get this to work by storing the keys in memory, we can help
you to store these on disk for permanent usage.
Regards,
Monty