I also find that if I have both tables in MyISAM and use STRAIGHT_JOIN to
force the better query plan (enumerate the longer table, for each longer
table row use the shorter table's index to pick out the one right matching
row from the shorter table) then the server has low I/O utilization but
the CPU utilization is about as high as can be expected for a single query
running on a 16-CPU machine. Why should this thing be CPU-bound? Here is
the query:
create table fp2 (p VARCHAR(200) NOT NULL,
rcd DATETIME NOT NULL, rcms SMALLINT NOT NULL, msgid BIGINT NOT
NULL,
q VARCHAR(200) NOT NULL, scd DATETIME NOT NULL, scms SMALLINT NOT
NULL,
lat DECIMAL(14,3),
INDEX p(p), INDEX q(q) )
AS SELECT fldrcv.p, fldrcv.cd as rcd, fldrcv.cms as rcms,
fldrcv.msgid as msgid, fldrcv.q, fldsnd.cd as scd, fldsnd.cms as
scms,
TIMESTAMPDIFF(SECOND, fldsnd.cd, fldrcv.cd) +
(fldrcv.cms-fldsnd.cms)/1000 as lat
FROM fldrcv STRAIGHT_JOIN fldsnd
ON fldrcv.q=fldsnd.p AND fldrcv.qboot=fldsnd.pboot
AND fldrcv.msgid=fldsnd.msgid;
and here is some `iostat -x 5` output that shows a total of less than 50%
I/O utilization and about 15/16 CPU utilization:
avg-cpu: %user %nice %system %iowait %steal %idle
4.27 0.00 1.82 0.00 0.03 93.89
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz
avgqu-sz await svctm %util
sda 0.00 1.20 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
sde 0.00 0.00 0.00 1.20 0.00 4.60 3.83
0.00 0.00 0.00 0.00
sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
sdg 0.00 0.00 4.40 3.20 2252.80 1434.00 485.11
0.16 20.74 13.26 10.08
sdh 0.00 0.00 4.40 2.80 2252.80 1433.60 512.00
0.13 18.44 12.89 9.28
sdi 0.00 0.00 4.20 2.80 2150.40 1433.60 512.00
0.13 19.20 12.91 9.04
sdj 0.00 0.00 4.40 2.80 2252.80 1433.60 512.00
0.16 22.44 15.56 11.20
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
avg-cpu: %user %nice %system %iowait %steal %idle
4.28 0.00 1.81 0.01 0.03 93.88
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz
avgqu-sz await svctm %util
sda 0.00 0.00 0.00 0.40 0.00 12.80 32.00
0.00 4.00 4.00 0.16
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
sdc 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
sde 0.00 0.00 0.00 0.60 0.00 3.80 6.33
0.00 0.00 0.00 0.00
sdf 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
sdg 0.00 0.00 4.40 3.00 2252.80 1433.80 498.19
0.17 23.57 16.65 12.32
sdh 0.00 0.00 4.40 2.80 2252.80 1433.60 512.00
0.16 21.67 14.78 10.64
sdi 0.00 0.00 4.40 2.80 2252.80 1433.60 512.00
0.15 20.89 14.44 10.40
sdj 0.00 0.00 4.20 2.80 2150.40 1433.60 512.00
0.15 21.71 14.74 10.32
dm-0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00
0.00 0.00 0.00 0.00
Thanks,
Mike Spreitzer
From: Mike Spreitzer/Watson/IBM@IBMUS
To: Dan Nelson <dnelson@stripped>
Cc: MySql <mysql@stripped>
Date: 08/11/2010 01:30 PM
Subject: Re: idle query
I finally started trying to optimize along the memory-based lines you
suggested. I am surprised to find that the query plan is to enumerate the
memory-based table and then pick out the hundreds of related rows from the
much larger MyISAM table. What's going on here?
`show create table` says this about the relevant tables:
CREATE TABLE `fldsndm` (
`p` varchar(200) NOT NULL,
`cd` datetime NOT NULL,
`cms` smallint(6) NOT NULL,
`pip` char(15) NOT NULL,
`pport` smallint(6) NOT NULL,
`pboot` bigint(20) NOT NULL,
`msgid` bigint(20) NOT NULL,
`startgtime` bigint(20) NOT NULL,
`datalen` int(11) NOT NULL,
`toself` tinyint(1) DEFAULT NULL,
`sepoch` bigint(20) NOT NULL DEFAULT '0',
`c` decimal(11,3) NOT NULL DEFAULT '0.000',
UNIQUE KEY `pbm` (`p`,`pboot`,`msgid`)
) ENGINE=MEMORY DEFAULT CHARSET=latin1
CREATE TABLE `fldrcv` (
`p` varchar(200) NOT NULL,
`cd` datetime NOT NULL,
`cms` smallint(6) NOT NULL,
`pip` char(15) NOT NULL,
`pport` smallint(6) NOT NULL,
`pboot` bigint(20) DEFAULT NULL,
`qip` char(15) NOT NULL,
`qport` smallint(6) NOT NULL,
`qboot` bigint(20) DEFAULT NULL,
`msgid` bigint(20) NOT NULL,
`startgtime` bigint(20) NOT NULL,
`datalen` int(11) NOT NULL,
`q` varchar(200) DEFAULT NULL,
`repoch` bigint(20) NOT NULL DEFAULT '0',
`c` decimal(11,3) NOT NULL DEFAULT '0.000',
KEY `c` (`c`),
KEY `pec` (`p`,`repoch`,`c`),
KEY `peqms` (`p`,`repoch`,`q`,`msgid`,`startgtime`),
KEY `qbm` (`q`,`qboot`,`msgid`),
KEY `pbm` (`p`,`pboot`,`msgid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
And here is the query planning I see:
mysql> explain select * from fldrcv join fldsndm on fldrcv.q=fldsndm.p AND
fldrcv.qboot=fldsndm.pboot and fldrcv.msgid=fldsndm.msgid;
+----+-------------+---------+------+---------------+------+---------+---------------------------------------------------------------------------------------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref
| rows | Extra |
+----+-------------+---------+------+---------------+------+---------+---------------------------------------------------------------------------------------------+-------+-------------+
| 1 | SIMPLE | fldsndm | ALL | pbm | NULL | NULL |
NULL | 29036 | |
| 1 | SIMPLE | fldrcv | ref | qbm | qbm | 220 |
bigCell2906_flood.fldsndm.p,bigCell2906_flood.fldsndm.pboot,bigCell2906_flood.fldsndm.msgid
| 452 | Using where |
+----+-------------+---------+------+---------------+------+---------+---------------------------------------------------------------------------------------------+-------+-------------+
BTW, here are the table sizes:
mysql> select count(*) from fldrcv;
+----------+
| count(*) |
+----------+
| 13785373 |
+----------+
1 row in set (0.00 sec)
mysql> select count(*) from fldsndm;
+----------+
| count(*) |
+----------+
| 29036 |
+----------+
Thanks,
Mike Spreitzer