Hi all,
I am playing around w/ the BKA optimization found in mysql 5.6. I'm having
an issue getting the expected increase in performance with this
optimization.
Specifically, I'm trying to optimize the following query (TPC-H Q10):
select
c_custkey,
c_name,
sum(l_extendedprice * (100 - l_discount)) as revenue,
c_acctbal,
n_name,
c_address,
c_phone,
c_comment
from CUSTOMER_INT, ORDERS_INT, LINEITEM_INT, NATION_INT
where
c_custkey = o_custkey
and l_orderkey = o_orderkey
and o_orderdate >= date '1994-08-01'
and o_orderdate < date '1994-08-01' + interval '3' month
and l_returnflag = 'R' and c_nationkey = n_nationkey
group by c_custkey, c_name, c_acctbal, c_phone, n_name,
c_address, c_comment
order by revenue desc limit 20;
When running the query, iostat reports the disk reading at about 4MB/s,
which suggests lots of random seeking.
Looking at explain, I get:
mysql> explain select c_custkey, c_name, sum(l_extendedprice * (100 -
l_discount)) as revenue, c_acctbal, n_name, c_address, c_phone,
c_comment from CUSTOMER_INT, ORDERS_INT, LINEITEM_INT, NATION_INT
where c_custkey = o_custkey and l_orderkey = o_orderkey and
o_orderdate >= date '1994-08-01' and o_orderdate < date '1994-08-01' +
interval '3' month and l_returnflag = 'R' and c_nationkey = n_nationkey
group by c_custkey, c_name, c_acctbal, c_phone, n_name,
c_address, c_comment order by revenue desc limit 20;
+----+-------------+--------------+--------+-------------------+-----------+---------+-------------------------------------+---------+-----------------------------------------------------+
| id | select_type | table | type | possible_keys | key
| key_len | ref | rows |
Extra |
+----+-------------+--------------+--------+-------------------+-----------+---------+-------------------------------------+---------+-----------------------------------------------------+
| 1 | SIMPLE | CUSTOMER_INT | ALL | PRIMARY | NULL
| NULL | NULL | 1501528 | Using
temporary; Using filesort |
| 1 | SIMPLE | NATION_INT | eq_ref | PRIMARY | PRIMARY
| 4 | tpch-10.00.CUSTOMER_INT.C_NATIONKEY | 1 |
NULL |
| 1 | SIMPLE | ORDERS_INT | ref | PRIMARY,O_CUSTKEY | O_CUSTKEY
| 4 | tpch-10.00.CUSTOMER_INT.C_CUSTKEY | 7 | Using where;
Using join buffer (Batched Key Access) |
| 1 | SIMPLE | LINEITEM_INT | ref | PRIMARY | PRIMARY
| 4 | tpch-10.00.ORDERS_INT.O_ORDERKEY | 1 | Using
where |
+----+-------------+--------------+--------+-------------------+-----------+---------+-------------------------------------+---------+-----------------------------------------------------+
4 rows in set (0.00 sec)
I'm wondering why, in this particular query, mysql doesn't use the BKA to
join the LINEITEM_INT table, but uses it for the ORDERS_INT table? It seems
like it should also use BKA to batch key fetches from the LINEITEM_INT
table (this I believe is the reason for the slow performance).
Relative mysql paramters:
--innodb-buffer-pool-size=2G
--optimizer_switch=mrr=on
--optimizer_switch=mrr_cost_based=off
--optimizer_switch=batched_key_access=on
--join_buffer_size=12M
--read_rnd_buffer_size=12M
Hopefully, I'm not missing anything too obvious here. Thanks!