List:Commits« Previous MessageNext Message »
From:Roy Lyseng Date:May 2 2011 7:28am
Subject:Re: bzr commit into mysql-trunk branch (roy.lyseng:3367) Bug#12407753
View as plain text  
Hi Guilhem,

thanks for the review!

On 29.04.11 18.56, Guilhem Bichot wrote:
> Hello Roy,
>
> Roy Lyseng a écrit, Le 29.04.2011 16:09:
>> #At file:///home/rl136806/mysql/repo/mysql-work5/ based on
>> revid:tor.didriksen@stripped
>>
>> 3367 Roy Lyseng 2011-04-29
>> Bug#12407753: Time to compare a row is missing in cost calculation of semi-join
>> Stage 1: Change cost factors to be multiplicative Basically, this patch
>> changes the constant values TIME_FOR_COMPARE
>> and TIME_FOR_COMPARE_ROWID to be the inverse of their previous values
>> and gives them names that are more similar to other cost factors.
>> As a consequence, all cost calculations involving these constants
>> are changed from divisions to multiplications.
>
>> @@ -1235,7 +1235,7 @@ bool check_if_pq_applicable(Sort_param *
>> */
>> const double pq_cpu_cost= (PQ_slowness * num_rows +
> param->max_keys_per_buffer) *
>> - log((double) param->max_keys_per_buffer) / TIME_FOR_COMPARE_ROWID;
>> + log((double) param->max_keys_per_buffer) * ROWID_COMPARE_COST;
>
> log() takes a double argument (according to "man log" on my Linux) so the
> explicit cast of param->max_keys_per_buffer is maybe not needed.
> Same for other "log((double)x)" calls in the patch.
> I say "*maybe* not needed", because I haven't checked what the compiler thinks
> of my theories.

I agree, but there are some issues with log() (or maybe some other math 
function) that caused a broken build a few months ago, so I put in an explicit 
cast for safety. Jørgen had a problem with this in bug#30597.
>
>> const double pq_io_cost=
>> param->max_rows * table->file->scan_time() / 2.0;
>> const double pq_cost= pq_cpu_cost + pq_io_cost;
>>
>> === modified file 'sql/filesort_utils.cc'
>> --- a/sql/filesort_utils.cc 2010-12-17 09:41:21 +0000
>> +++ b/sql/filesort_utils.cc 2011-04-29 14:09:10 +0000
>> @@ -26,8 +26,7 @@ double get_merge_cost(ha_rows num_elemen
>> {
>> return 2.0 * ((double) num_elements * elem_size) / IO_SIZE
>> - + (double) num_elements * log((double) num_buffers) /
>> - (TIME_FOR_COMPARE_ROWID * M_LN2);
>> + + num_elements * log((double) num_buffers) * ROWID_COMPARE_COST / M_LN2;
>
> here I guess you removed the cast to double because in "integer*double" the
> integer is automatically cast to double IIRC.

Yep.
>
>> === modified file 'sql/sql_const.h'
>> --- a/sql/sql_const.h 2010-12-17 09:41:21 +0000
>> +++ b/sql/sql_const.h 2011-04-29 14:09:10 +0000
>> @@ -158,16 +158,16 @@
>>
>> /**
>> The following is used to decide if MySQL should use table scanning
>> - instead of reading with keys. The number says how many evaluation of the
>> - WHERE clause is comparable to reading one extra row from a table.
>> + instead of reading with keys. The number says how costly evaluation of the
>> + filter condition for a row is compared to reading one extra row from a table.
>> */
>> -#define TIME_FOR_COMPARE 5.0 // 5 compares == one read
>> +#define JOIN_COMPARE_COST 0.20
>
> and this has type "double" per the C++ standard (talking to myself here).

Yep again.

>
>> @@ -7582,18 +7582,18 @@ best_access_path(JOIN *join,
>> we read the table (see flush_cached_records for details). Here we
>> take into account cost to read and skip these records.
>> */
>> - tmp+= (s->records - rnd_records)/(double) TIME_FOR_COMPARE;
>> + tmp+= (s->records - rnd_records) * JOIN_COMPARE_COST;
>> }
>> }
>>
>> /*
>> We estimate the cost of evaluating WHERE clause for found records
>> - as record_count * rnd_records / TIME_FOR_COMPARE. This cost plus
>> + as record_count * rnd_records * JOIN_COMPARE_COST. This cost plus
>> tmp give us total cost of using TABLE SCAN
>> */
>> if (best == DBL_MAX ||
>> - (tmp + record_count/(double) TIME_FOR_COMPARE*rnd_records <
>> - best + record_count/(double) TIME_FOR_COMPARE*records))
>> + (tmp + (record_count * JOIN_COMPARE_COST * rnd_records) <
>> + best + (record_count * JOIN_COMPARE_COST * records)))
>
> the added () are not needed, as you know * has priority over +. I don't find
> that they add clarity. But keep them if you like.

I prefer to keep them, they do not hurt...
>
> Ok to push. I like reviewing those kind of patches, they don't exercise the
> brain too much :-))

Maybe I should do more Friday night commits then :)

Thanks,
Roy
Thread
bzr commit into mysql-trunk branch (roy.lyseng:3367) Bug#12407753Roy Lyseng29 Apr
  • Re: bzr commit into mysql-trunk branch (roy.lyseng:3367) Bug#12407753Guilhem Bichot29 Apr
    • Re: bzr commit into mysql-trunk branch (roy.lyseng:3367) Bug#12407753Roy Lyseng2 May