4473 Jan Wedvik 2011-08-22
This commit fixes (hopefully) compiler warnings in previous commit.
4472 Jan Wedvik 2011-08-22
This commit implements adaptive parallelism for (non-root) scan operations
(i.e. index scans). So far, such scans have always been executed with
maximal parallelism, such that all fragments were scanned in parallel.
This also meant that the available batch size would have to be divided by the
number of fragments. For index scans with large result sets, this was
inefficient, since it meant asking for many small batches.
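To make the batch-size effect concrete, here is a minimal sketch. The function name and the numbers in the usage note are hypothetical and are not taken from the SPJ block; it only illustrates the division described above.

```cpp
#include <cstdint>

// Hypothetical helper: number of rows each fragment may return per batch
// when the available batch is shared evenly between the fragments that are
// scanned in parallel. A parallelism of 0 is treated as 1.
std::uint32_t rowsPerFragment(std::uint32_t batchRows, std::uint32_t parallelism)
{
  if (parallelism == 0)
    parallelism = 1;
  return batchRows / parallelism;
}
```

With an assumed batch of 64 rows, scanning 8 fragments in parallel leaves 8 rows per fragment and request, whereas parallelism=1 lets a single fragment use all 64.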
As an example, assume that a table scan of t1 is followed by an index scan on
t2 where the index has poor selectivity. The query is then effectively a cross
join. When such queries were tested on distributed clusters with
multi-threaded ndbd, the queries would typically run five to six times slower
than with query pushdown disabled.
This commit will therefore try to set an optimal parallelism, depending on the
size of the scan result set. This eliminates the performance regression in
the test cases described above.
This works as follows:
* The first time an index scan is started, it starts with parallelism=1 for
the first batch.
* For the subsequent batches, the SPJ block will try to set the parallelism
such that all rows from a single fragment will fit in one batch. If this
is not possible, parallelism will remain 1.
* Whenever the index scan is started again (typically after receiving a new
batch for its parent), the SPJ block will try to guess the optimal parallelism
based on statistics from previous runs. 'Optimal' means that one can expect
all rows from a given fragment to be retrieved in one batch if possible.
See also the quilt series published on 2011-08-18.
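The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in DbspjMain.cpp: the RunningStat class, fittingParallelism() and estimateParallelism() are hypothetical names, and only getMean(), getStdDev() and the "mean minus two standard deviations" rule come from the diff below. It assumes that each sample is the parallelism that would have let one fragment's rows fit in its share of a batch, and that there is at least one fragment.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical running-statistics helper (Welford's algorithm).
class RunningStat
{
public:
  void addSample(double x)
  {
    ++m_count;
    const double delta = x - m_mean;
    m_mean += delta / m_count;
    m_sumSq += delta * (x - m_mean);
  }
  double getMean() const { return m_mean; }
  double getStdDev() const
  {
    // Sample standard deviation; 0 until there are at least two samples.
    return m_count < 2 ? 0.0 : std::sqrt(m_sumSq / (m_count - 1));
  }
private:
  std::uint32_t m_count = 0;
  double m_mean = 0.0;
  double m_sumSq = 0.0;
};

// Largest parallelism for which one fragment's rows still fit in its share
// of the batch. Falls back to 1 when even a whole batch is too small.
std::int32_t fittingParallelism(std::uint32_t batchRows,
                                std::uint32_t rowsPerFragment,
                                std::int32_t fragmentCount)
{
  if (rowsPerFragment == 0)
    return fragmentCount;
  const std::int32_t fit =
    static_cast<std::int32_t>(batchRows / rowsPerFragment);
  return std::max<std::int32_t>(1, std::min(fit, fragmentCount));
}

// Conservative guess for the next run: mean minus two standard deviations,
// clamped to the range [1, fragmentCount].
std::int32_t estimateParallelism(const RunningStat& stat,
                                 std::int32_t fragmentCount)
{
  const std::int32_t estimate =
    static_cast<std::int32_t>(stat.getMean() - 2 * stat.getStdDev());
  return std::max<std::int32_t>(1, std::min(estimate, fragmentCount));
}
```

In this sketch, stable samples (for instance 8, 8, 8) give an estimate equal to the mean, while widely varying samples (for instance 1 and 9) push the estimate down to 1. Subtracting twice the standard deviation keeps the risk of choosing too high a parallelism low, as the comment in the diff describes.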
=== modified file 'storage/ndb/src/kernel/blocks/dbspj/DbspjMain.cpp'
--- a/storage/ndb/src/kernel/blocks/dbspj/DbspjMain.cpp 2011-08-22 08:35:35 +0000
+++ b/storage/ndb/src/kernel/blocks/dbspj/DbspjMain.cpp 2011-08-22 11:50:41 +0000
@@ -5050,8 +5050,9 @@ Dbspj::scanIndex_parent_batch_complete(S
* deviation to have a low risk of setting parallelism to high (as erring
* in the other direction is more costly).
- Int32 parallelism = data.m_parallelismStat.getMean()
- - 2 * data.m_parallelismStat.getStdDev();
+ Int32 parallelism =
+ - 2 * data.m_parallelismStat.getStdDev());
if (parallelism < 1)
@@ -5123,7 +5124,8 @@ Dbspj::scanIndex_parent_batch_complete(S
- ndbrequire((data.m_frags_outstanding + data.m_frags_complete) <=
+ ndbrequire(static_cast<Uint32>(data.m_frags_outstanding +
+ data.m_frags_complete) <=
No bundle (reason: useless for push emails).
bzr push into mysql-5.1-telco-7.0 branch (jan.wedvik:4472 to 4473), Jan Wedvik, 22 Aug