From: Rick James Date: July 13 2012 4:35pm Subject: RE: Composite Index Usage in Joins List-Archive: http://lists.mysql.com/mysql/227817 Message-Id: <2E7DD7ADE53B044C8C8BCD9C5829E1EB148892B610@SP2-EX07VS01.ds.corp.yahoo.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable > trade_time IN ('2012-07-01', '2012-07-02'); represents two distinct points in a continuum of a DATETIME. Perhaps you s= hould change to DATE. Even then, it would probably be treated as IN, not R= ANGE. Perhaps you really want BETWEEN (a "range") instead of IN (a set). What version are you using? Older version have essentially no optimization= s for IN(). Shrinking the table size may help -- Do you really need BIGINT (8 bytes) in= stead of INT UNSIGNED (4 bytes)? DATETIME (8) vs DATE(3). Etc. If these are stock prices, how can `price` be INT? > -----Original Message----- > From: Jeffrey Grollo [mailto:grolloj@stripped] > Sent: Tuesday, July 10, 2012 2:50 PM > To: mysql@stripped > Subject: Composite Index Usage in Joins >=20 > Hi, >=20 > I'm attempting to optimize a join and am having a difficult time using > multiple columns from a composite index. The second column of the > composite key is being used when tested for equality, but not for IN or > BETWEEN criteria. >=20 > As an example, say that I'm searching two tables: portfolio and trades. > The portfolio table contains a list of security IDs. The trades table > keeps tracks of the price and time when I've traded securities in my > portfolio. > Tables are: >=20 > CREATE TABLE portfolio ( > sec_id bigint(20) NOT NULL AUTO_INCREMENT, > name char(10) NOT NULL, > PRIMARY KEY (sec_id) > ) ENGINE=3DInnoDB ; >=20 > CREATE TABLE trades ( > tx_id bigint(20) NOT NULL AUTO_INCREMENT, > sec_id bigint(20) NOT NULL, > trade_time datetime NOT NULL, > price int NOT NULL, > PRIMARY KEY (tx_id), > KEY sec_time (sec_id, trade_time) > ) ENGINE=3DInnoDB ; >=20 > If I query the trades table directly both columns of the composite > index "sec_time" will be used when I'm using a range criteria on the > trade_time > column: >=20 >=20 > mysql> explain select price from trades force index(sec_time) > -> where sec_id IN (1, 2) and trade_time BETWEEN '2012-07-01' AND > '2012-07-04'; > +----+-------------+--------+-------+---------------+----------+------- > --+------+------+-------------+ > | id | select_type | table | type | possible_keys | key | > key_len | > ref | rows | Extra | > +----+-------------+--------+-------+---------------+----------+------- > --+------+------+-------------+ > | 1 | SIMPLE | trades | range | sec_time | sec_time | 16 > | > NULL | 2 | Using where | > +----+-------------+--------+-------+---------------+----------+------- > --+------+------+-------------+ >=20 >=20 > If I introduce a join to retrieve all trades for my portfolio, the > entire index will continue to be used if I make trade_time a constant: >=20 > mysql> explain select price from portfolio p inner join trades t force > index(sec_time) on p.sec_id =3D t.sec_id > -> where trade_time =3D '2012-07-01'; > +----+-------------+-------+-------+---------------+----------+-------- > -+------------------------+------+-------------+ > | id | select_type | table | type | possible_keys | key | key_len > | > ref | rows | Extra | > +----+-------------+-------+-------+---------------+----------+-------- > -+------------------------+------+-------------+ > | 1 | SIMPLE | p | index | PRIMARY | PRIMARY | 8 > | > NULL | 1 | Using index | > | 1 | SIMPLE | t | ref | sec_time | sec_time | 16 > | > vantage.p.sec_id,const | 1 | | > +----+-------------+-------+-------+---------------+----------+-------- > -+------------------------+------+-------------+ >=20 >=20 > However, if I expand the trade_time search (either using IN or > BETWEEN), only the sec_id column of the composite query is used: >=20 >=20 > mysql> explain select price from portfolio p inner join trades t force > index(sec_time) on p.sec_id =3D t.sec_id > -> where trade_time IN ('2012-07-01', '2012-07-02'); > +----+-------------+-------+-------+---------------+----------+-------- > -+------------------+------+-------------+ > | id | select_type | table | type | possible_keys | key | key_len > | > ref | rows | Extra | > +----+-------------+-------+-------+---------------+----------+-------- > -+------------------+------+-------------+ > | 1 | SIMPLE | p | index | PRIMARY | PRIMARY | 8 > | > NULL | 1 | Using index | > | 1 | SIMPLE | t | ref | sec_time | sec_time | 8 > | > vantage.p.sec_id | 1 | Using where | > +----+-------------+-------+-------+---------------+----------+-------- > -+------------------+------+-------------+ >=20 >=20 > My expectation is that MySQL would be able to use both columns of the > sec_time index, but I've been unable to find either confirmation of > refutation of that assumption. If MySQL cannot optimize a join in this > case, is there another approach to optimizing this query that I should > pursue? Essentially, my trades table may contain many historical > records and pulling the entire history of trades for each security > would produce a much larger result set than would be retrieved if the > trade_time criteria was applied in the index reference. >=20 > I'm using MySQL 5.5.11. >=20 > Thanks for any guidance, > Jeff