Thanks for the review!
On 29.04.11 18.56, Guilhem Bichot wrote:
> Hello Roy,
> Roy Lyseng wrote, on 29.04.2011 16:09:
>> #At file:///home/rl136806/mysql/repo/mysql-work5/ based on
>> 3367 Roy Lyseng 2011-04-29
>> Bug#12407753: Time to compare a row is missing in cost calculation of semi-join
>> Stage 1: Change cost factors to be multiplicative
>>
>> Basically, this patch changes the constant values TIME_FOR_COMPARE
>> and TIME_FOR_COMPARE_ROWID to be the inverse of their previous values
>> and gives them names that are more similar to other cost factors.
>> As a consequence, all cost calculations involving these constants
>> are changed from divisions to multiplications.
>> @@ -1235,7 +1235,7 @@ bool check_if_pq_applicable(Sort_param *
>> const double pq_cpu_cost= (PQ_slowness * num_rows +
>> param->max_keys_per_buffer) *
>> - log((double) param->max_keys_per_buffer) / TIME_FOR_COMPARE_ROWID;
>> + log((double) param->max_keys_per_buffer) * ROWID_COMPARE_COST;
> log() takes a double argument (according to "man log" on my Linux) so the
> explicit cast of param->max_keys_per_buffer is maybe not needed.
> Same for other "log((double)x)" calls in the patch.
> I say "*maybe* not needed", because I haven't checked what the compiler thinks
> of my theories.
I agree, but an issue with log() (or maybe some other math function) caused a
broken build a few months ago, so I put in the explicit cast for safety.
Jørgen ran into a problem with this in bug#30597.
>> const double pq_io_cost=
>> param->max_rows * table->file->scan_time() / 2.0;
>> const double pq_cost= pq_cpu_cost + pq_io_cost;
>> === modified file 'sql/filesort_utils.cc'
>> --- a/sql/filesort_utils.cc 2010-12-17 09:41:21 +0000
>> +++ b/sql/filesort_utils.cc 2011-04-29 14:09:10 +0000
>> @@ -26,8 +26,7 @@ double get_merge_cost(ha_rows num_elemen
>> return 2.0 * ((double) num_elements * elem_size) / IO_SIZE
>> - + (double) num_elements * log((double) num_buffers) /
>> - (TIME_FOR_COMPARE_ROWID * M_LN2);
>> + + num_elements * log((double) num_buffers) * ROWID_COMPARE_COST / M_LN2;
> here I guess you removed the cast to double because in "integer*double" the
> integer is automatically cast to double IIRC.
>> === modified file 'sql/sql_const.h'
>> --- a/sql/sql_const.h 2010-12-17 09:41:21 +0000
>> +++ b/sql/sql_const.h 2011-04-29 14:09:10 +0000
>> @@ -158,16 +158,16 @@
>> The following is used to decide if MySQL should use table scanning
>> - instead of reading with keys. The number says how many evaluation of the
>> - WHERE clause is comparable to reading one extra row from a table.
>> + instead of reading with keys. The number says how costly evaluation of the
>> + filter condition for a row is compared to reading one extra row from a table.
>> -#define TIME_FOR_COMPARE 5.0 // 5 compares == one read
>> +#define JOIN_COMPARE_COST 0.20
> and this has type "double" per the C++ standard (talking to myself here).
>> @@ -7582,18 +7582,18 @@ best_access_path(JOIN *join,
>> we read the table (see flush_cached_records for details). Here we
>> take into account cost to read and skip these records.
>> - tmp+= (s->records - rnd_records)/(double) TIME_FOR_COMPARE;
>> + tmp+= (s->records - rnd_records) * JOIN_COMPARE_COST;
>> We estimate the cost of evaluating WHERE clause for found records
>> - as record_count * rnd_records / TIME_FOR_COMPARE. This cost plus
>> + as record_count * rnd_records * JOIN_COMPARE_COST. This cost plus
>> tmp give us total cost of using TABLE SCAN
>> if (best == DBL_MAX ||
>> - (tmp + record_count/(double) TIME_FOR_COMPARE*rnd_records <
>> - best + record_count/(double) TIME_FOR_COMPARE*records))
>> + (tmp + (record_count * JOIN_COMPARE_COST * rnd_records) <
>> + best + (record_count * JOIN_COMPARE_COST * records)))
> the added () are not needed, as you know * has priority over +. I don't find
> that they add clarity. But keep them if you like.
I prefer to keep them; they do not hurt...
> Ok to push. I like reviewing these kinds of patches, they don't exercise the
> brain too much :-))
Maybe I should do more Friday night commits then :)