List:General Discussion« Previous MessageNext Message »
From:Joerg Bruehe Date:January 6 2010 10:27am
Subject:Re: Join with OR-condition and Indexes
View as plain text  
Tobias, all,


Tobias Schultze wrote:
> Thanks for your answers.
> 
> I found out, that MySQL behaves very strange in this situation.
> I think this is a bug or important missing feature.

I disagree.

> 
> I would like to see how other DBMS behave in this situation, which I would
> think is a common problem - whenever you want to join one column of a table
> with several columns of another table.

Relational design theory will typically create a different schema than
you did, I don't think the "join with several columns" is such a common
task.

> 
> MySQL uses an index_merge when I query for one specific athlete:
> 
> [[...]]
> 
> But it doesn't use the index merge when joining like I originally tried,
> although logically it should be able to use it, shouldn't it?

The general problem with using an index is that it means you need to do
two logical accesses for one row, first descend the index hierarchy and
then follow the pointer into the data.
This is very efficient if you have few hits only, but it becomes
inefficient when there are many hits.

Example: In a table of all members of a typical army, it is inefficient
to use an index on a field "sex" to find all male members - the hit rate
is so high that a scan of the base table is less effort.
I know of a system where the optimizer uses a threshold of 15%: If the
expected number of hits in the index exceeds 15% of the table size, it
will not use the index but do a base table scan.

In your case, the system would need to search 4 indexes or scan 1 base
table - it doesn't surprise me that the optimizer considers the base
table scan to be more efficient.

> [[...]]
> 
> Then Michael Dykman said, MySQL is restricted to one index per table per
> query. So I thought, maybe I can help MySQL when I add a compound index for
> all players. So it could use the one index to resolve the join.
> ALTER TABLE `matches` ADD INDEX `players_idx` ( `team1_player_id` ,
> `team1_partner_id` , `team2_player_id` , `team2_partner_id` ) ;

Such a compound index cannot help you in your task:
An index can only be used if the value(s) for the leading column(s) (=
the one(s) with highest order in sorting) is/are given.
You cannot use such a compund index to search for "team1_partner_id" or
any of the "team2_*" values - or did you ever successfully use the order
a phone book is sorted in (last name, first name) to search by first
name only?

> 
> [[...]]
> 
> Changing the database schema doesn't seem to be usefull. Regarding my schema
> is already normalized and over-normalization generally decreases
> performances. I guess I would then run into other problems.

With the task you mailed about, all (up to) four (4) participants are
considered equal for the join. In light of this task, I would not call
your schema "normalized".

There may be other tasks in your system where the participants are not
considered equal, so your schema must consider all of them, possibly
weighted by the frequency of the tasks.


Regards,
Jörg

-- 
Joerg Bruehe,  MySQL Build Team,  Joerg.Bruehe@stripped
Sun Microsystems GmbH,   Komturstraße 18a,   D-12099 Berlin
Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel
Vorsitzender des Aufsichtsrates: Martin Haering     Muenchen: HRB161028

Thread
Join with OR-condition and IndexesTobias Schultze4 Jan
  • Re: Join with OR-condition and IndexesShawn Green5 Jan
    • Re: Join with OR-condition and IndexesMichael Dykman5 Jan
      • Re: Join with OR-condition and IndexesTobias Schultze6 Jan
        • Re: Join with OR-condition and IndexesJoerg Bruehe6 Jan