List:Database Benchmarks« Previous MessageNext Message »
From:Peter Zaitsev Date:November 3 2003 9:18pm
Subject:Re: use case: resolve references to unique tuples
View as plain text  
On Sun, 2003-11-02 at 10:39, Eric Prud'hommeaux wrote:

Eric,

Thank your for your investigations and feedback.

Comments below.

> This message includes data, a query, and a suggestion to handle
> resolution of uniquely constrained tuples as a special (early) case.
> Attachments are referened by [attachement name].

Actually I shall note MySQL already does it. It is just what in
different case:

As you see if  row is referenced by constant unique ID it is processed
separately  - pre-read at the very early stage of query execution.


mysql> alter table __Nodes__ add unique (hash);
Query OK, 64 rows affected (0.00 sec)
Records: 64  Duplicates: 0  Warnings: 0



mysql> explain SELECT __Holds___0.id AS __Holds___0_annotation_id,
    ->        __Holds___1.id AS __Holds___1_annotation_id,
    ->        __Holds___2.id AS __Holds___2_annotation_id,
    ->        __Holds___3.id AS __Holds___3_annotation_id,
    ->        __Holds___4.id AS __Holds___4_body_id
    -> FROM __Holds__ AS __Holds___0
    ->      INNER JOIN __Nodes__ AS __Nodes___0 ON
__Holds___0.p=__Nodes___0.node
    ->      INNER JOIN __Nodes__ AS __Nodes___2 ON
__Holds___0.o=__Nodes___2.node
    ->      INNER JOIN __Holds__ AS __Holds___1 ON
__Holds___1.s=__Holds___0.s
    ->      INNER JOIN __Nodes__ AS __Nodes___3 ON
__Holds___1.p=__Nodes___3.node
    ->      INNER JOIN __Holds__ AS __Holds___2 ON
__Holds___2.s=__Holds___0.s
    ->      INNER JOIN __Nodes__ AS __Nodes___5 ON
__Holds___2.p=__Nodes___5.node
    ->      INNER JOIN __Holds__ AS __Holds___3 ON
__Holds___3.s=__Holds___0.s
    ->      INNER JOIN __Nodes__ AS __Nodes___7 ON
__Holds___3.p=__Nodes___7.node
    ->      INNER JOIN __Holds__ AS __Holds___4 ON
__Holds___4.s=__Holds___3.o
    ->      INNER JOIN __Nodes__ AS __Nodes___9 ON
__Holds___4.p=__Nodes___9.node
    -> WHERE __Nodes___0.hash="c74e2b735dd8dc85ad0ee3510c33925f"
    ->   AND __Nodes___2.hash="acef97e962dd06905694368feda0317f"
    ->   AND __Nodes___3.hash="5a57811b2e27c10052c3916f8c1d7415"
    ->   AND __Nodes___5.hash="ddb55f1d54598bb5120e03548d3b1dfa"
    ->   AND __Nodes___7.hash="93911c4322d01104748a861871acb21c"
    ->   AND __Nodes___9.hash="c8b099193cc043d0f51b47d65c24300b";
+-------------+-------+--------------------------+------+---------+-------+------+-------------+
| table       | type  | possible_keys            | key  | key_len |
ref   | rows | Extra       |
+-------------+-------+--------------------------+------+---------+-------+------+-------------+
| __Nodes___0 | const | u_node_nodec,u_node,hash | hash |      32 |
const |    1 |             |
| __Nodes___2 | const | u_node_nodec,u_node,hash | hash |      32 |
const |    1 |             |
| __Nodes___3 | const | u_node_nodec,u_node,hash | hash |      32 |
const |    1 |             |
| __Nodes___5 | const | u_node_nodec,u_node,hash | hash |      32 |
const |    1 |             |
| __Nodes___7 | const | u_node_nodec,u_node,hash | hash |      32 |
const |    1 |             |
| __Nodes___9 | const | u_node_nodec,u_node,hash | hash |      32 |
const |    1 |             |
| __Holds___0 | ref   | u_                       | u_   |       4 |
const |   11 | Using where |
| __Holds___1 | ref   | u_                       | u_   |       4 |
const |   11 | Using where |
| __Holds___2 | ref   | u_                       | u_   |       4 |
const |   11 | Using where |
| __Holds___3 | ref   | u_                       | u_   |       4 |
const |   11 | Using where |
| __Holds___4 | ref   | u_                       | u_   |       4 |
const |   11 | Using where |
+-------------+-------+--------------------------+------+---------+-------+------+-------------+
11 rows in set (0.00 sec)

> 
> I had this (albeit preverse from many perspectives) query [slowQuery] that was
> running kinda slow on some dB [db]
> 2 rows in set (0.96 sec)
> 2 rows in set (0.95 sec)

It would be also good to take a look at speed of EXPLAIN of this query -
most of the time is spent doing this query optimization. This is because
MySQL has a lot of join  sequence possibilities to analyze which results
in such poor performance. 

You can try for example using STRAIGHT_JOIN to force joining tables in
specific order. This would make optimizer not to search for best join
plan but use the one provided by user.

We're now working on improving optimizer to handle joins with large
amount of tables more efficiently. 


> 
> so I looked at it and figured I'd do an extra query up front to gather
> some unique values and eliminate a bunch of joins.
> [fastQuery]
> 2 rows in set (0.00 sec)
> 2 rows in set (0.00 sec)
> 
> That seemed to show some improvement so I figured I'd tell the database
> what I knew so it could do that step for me.  I expected that a unique
> on __Nodes__.node would allow the database to extract the key 
> for all the joins that constrained to a single tuple by providing a
> value for a UNIQUE.
> 
> mysql> ALTER TABLE __Nodes__ ADD UNIQUE u_node(node);
> Query OK, 64 rows affected (0.08 sec)
> Records: 64  Duplicates: 0  Warnings: 0
> 
> I tried the query again and saw about have the improvement I expected.
> 
> 2 rows in set (0.35 sec)
> 2 rows in set (0.32 sec)
> 
> I imagine this improvement was not because the DB did the same
> optimization that I did, but because of some more second nature
> artifact. The scenario is isolating tuples with a unique and making
> refereneces to the values in that tuple is fairly common. It may be
> worth putting a little time into this. I thought this use case might
> make a handy study and benchmark.

Yes, This provided more information to the optimizer and it started to
spend less time analysing possible join optimizations. 

I'm not sure which optimization are you referring to. Can you create
small (ie 2-3 tables) example and explain how would you like MySQL to
execute the query.


-- 
Peter Zaitsev, Full-Time Developer
MySQL AB, www.mysql.com

Are you MySQL certified?  www.mysql.com/certification

Thread
use case: resolve references to unique tuplesEric Prud'hommeaux2 Nov
  • Re: use case: resolve references to unique tuplesEric Prud'hommeaux2 Nov
  • Re: use case: resolve references to unique tuplesPeter Zaitsev3 Nov
    • Re: use case: resolve references to unique tuplesEric Prud'hommeaux3 Nov