List: Commits
From: Jorgen Loland
Date: November 29 2010 10:03am
Subject: Re: bzr commit into mysql-trunk branch (tor.didriksen:3247) WL#1393
Tor,

Thank you for implementing priority queue.

I've written comments inline. For faster follow-up review it would be
great if you could commit the next changeset without uncommitting this
one.

 > === modified file 'mysql-test/include/order_by.inc'
 > --- mysql-test/include/order_by.inc	2010-09-27 13:20:24 +0000
 > +++ mysql-test/include/order_by.inc	2010-11-25 09:01:19 +0000
 > @@ -1364,6 +1364,163 @@ ORDER BY t2.c LIMIT 1;
 >
 > +################
 > +## Test group & having
 > +SELECT floor(f1/10) f3, count(f2) FROM t1
 > +group by 1 ORDER BY 2,1 LIMIT 5;

JL: Please upper-case keywords

 > +SELECT floor(f1/10) f3, count(f2) FROM t1
 > +group by 1 ORDER BY 2,1 LIMIT 0;

JL: Please upper-case keywords

 > === modified file 'mysql-test/r/order_by_none.result'
 > --- mysql-test/r/order_by_none.result	2010-09-27 13:20:24 +0000
 > +++ mysql-test/r/order_by_none.result	2010-11-25 09:01:19 +0000
 > +SELECT SQL_CALC_FOUND_ROWS * FROM t1
 > +ORDER BY f1, f0 LIMIT 0;
 > +f0	f1	f2
 > +SELECT FOUND_ROWS();
 > +FOUND_ROWS()
 > +500

JL: Wow... I didn't realize until now that we actually do sorting even
if LIMIT 0 is requested. This makes me wonder if you should have a
small optimization in Bounded_queue::push to not create and push
element keys to the underlying queue if the LIMIT is 0. Counting
the number of records should be enough.
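
E.g. an (untested) early return at the top of push(); the guard
condition is an assumption on my part, however the LIMIT is best
detected there:

  void push(Element_type *element)
  {
    if (m_max_rows == 0)  // hypothetical member: LIMIT 0, nothing to
      return;             // keep - just let the caller count rows
    /* ... key creation and queue_insert() as before ... */
  }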

JL: Have order_by_all and order_by_icp_mrr intentionally not been
updated (mtr --record)?

 > === added file 'mysql-test/t/order_by_sortkey.test'
 > --- mysql-test/t/order_by_sortkey.test	1970-01-01 00:00:00 +0000
 > +++ mysql-test/t/order_by_sortkey.test	2010-11-25 09:01:19 +0000
 > @@ -0,0 +1,66 @@
 > +#
 > +# WL#1393 - ORDER BY with LIMIT tests
 > +#
 > +# A big test in a separate file, so that we can run the original
 > +# order_by test with --debug without timeout.
 > +#
 > +# All the sort keys fit in memory, but the addon fields do not.
 > +#
 > +CREATE TABLE t1(
 > +  f0 int auto_increment PRIMARY KEY,
 > +  f1 int,
 > +  f2 varchar(200)
 > +) ENGINE=MyISAM;

JL: Does this test require MyISAM to work?

 > === added file 'sql/bounded_queue.cc'
 > --- sql/bounded_queue.cc	1970-01-01 00:00:00 +0000
 > +++ sql/bounded_queue.cc	2010-11-25 09:01:19 +0000
 > +namespace {
 > +/**
 > +   A local helper function. See comments for get_merge_buffers_cost().
 > + */
 > +double get_merge_cost(ha_rows num_elements, ha_rows num_buffers, uint elem_size)

JL: Long line

 > +/**
 > +   This is a simplified, and faster version of @see get_merge_many_buffs_cost().

JL: Long line (btw, there is a helpful emacs mode: blank-mode)

 > === added file 'sql/bounded_queue.h'
 > --- sql/bounded_queue.h	1970-01-01 00:00:00 +0000
 > +++ sql/bounded_queue.h	2010-11-26 14:13:02 +0000
 > @@ -0,0 +1,199 @@
 > (...)
 > +  Key_type **pop()
 > +  {
 > +    // Don't return the extra element to the client code.
 > +    if (queue_is_full((&m_queue)))
 > +      queue_remove(&m_queue, 0);
 > +    return reinterpret_cast<Key_type**>(queue_remove(&m_queue, 0));
 > +  }

JL: I suggest a DBUG_ASSERT(m_queue.elements>0); and a unit test that
hits it.
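
E.g. (a sketch; fixture and test names are hypothetical, and
EXPECT_DEATH only fires in debug builds where DBUG_ASSERT is active):

  #if !defined(DBUG_OFF)
  TEST_F(BoundedQueueDeathTest, PopEmptyQueue)
  {
    // m_queue is assumed to be an initialized, empty Bounded_queue.
    EXPECT_DEATH(m_queue.pop(), ".*elements > 0.*");
  }
  #endif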

 > === modified file 'sql/filesort.cc'
 > --- sql/filesort.cc	2010-08-04 10:34:01 +0000
 > +++ sql/filesort.cc	2010-11-25 09:01:19 +0000
 > @@ -44,30 +50,68 @@ if (my_b_write((file),(uchar*) (from),pa
 > -static ha_rows find_all_keys(SORTPARAM *param,SQL_SELECT *select,
 > -			     uchar * *sort_keys, IO_CACHE *buffer_file,
 > -			     IO_CACHE *tempfile,IO_CACHE *indexfile);
 > +static ha_rows find_all_keys(Sort_param *param,SQL_SELECT *select,
 > +                             uchar **sort_keys, IO_CACHE *buffer_file,
 > +                             IO_CACHE *tempfile,
 > +                             Bounded_queue<uchar, uchar> *pq,
 > +                             ha_rows *found_rows);
 >
 > +static bool check_if_pq_applicable(Sort_param *param, FILESORT_INFO *info,
 > +                                   TABLE *table,
 > +                                   ha_rows records, ulong memory_available);
 > +
 > +
 > +void Sort_param::init(uint sortlen, TABLE *table, ulong max_length_for_sort_data,

JL: long line

 > @@ -147,117 +191,95 @@ ha_rows filesort(THD *thd, TABLE *table,
 > +    DBUG_PRINT("info", ("filesort PQ is not applicable"));
 > +
 > +    const ulong min_sort_memory=
 > +      max(MIN_SORT_MEMORY, param.sort_length*MERGEBUFF2);
 > +    while (memory_available >= min_sort_memory)
 > +    {
 > +      ulong old_memory_available;
 > +      ulong keys= memory_available / (param.rec_length+sizeof(char*));
 > +      param.num_keys_per_buffer= (uint) min(num_rows, keys);
 > +      make_char_array(&table_sort, param.num_keys_per_buffer,
 > +                      param.rec_length);
 > +      if (table_sort.sort_keys)
 > +        break;
 > +      old_memory_available= memory_available;
 > +      if ((memory_available= memory_available/4*3) < min_sort_memory &&

JL: I prefer moving "memory_available= memory_available/4*3;" outside
the if(), and also move declaration of old_memory_available to the
only place it is assigned:

ulong old_memory_available= memory_available;
memory_available= memory_available/4*3;
if (...)

This is a preference, not a requirement for approval.

 > +          old_memory_available > min_sort_memory)
 > +        memory_available= min_sort_memory;
 > +    }
 > +    if (memory_available < min_sort_memory)
 > +    {
 > +      my_error(ER_OUT_OF_SORTMEMORY,MYF(ME_ERROR+ME_WAITTANG));
 > +      goto err;
 > +    }
 >    }
 > +
 > +  sort_keys= table_sort.sort_keys;
 >    if (open_cached_file(&buffpek_pointers,mysql_tmpdir,TEMP_PREFIX,
 >  		       DISK_BUFFER_SIZE, MYF(MY_WME)))
 >      goto err;
 >
 > -  param.keys--;  			/* TODO: check why we do this */
 >    param.sort_form= table;
 >    param.end=(param.local_sortorder=sortorder)+s_length;
 > -  if ((records=find_all_keys(&param,select,sort_keys, &buffpek_pointers,
 > -			     &tempfile, selected_records_file)) ==
 > +  if ((num_rows= find_all_keys(&param, select, sort_keys, &buffpek_pointers,
 > +                               &tempfile,
 > +                               pq.is_initialized() ? &pq : NULL,
 > +                               &found_rows)) ==

JL: Again, I prefer assignments outside the if()

num_rows= find_all_keys();
if (num_rows==HA_POS_ERROR){}


 > @@ -299,9 +322,17 @@ ha_rows filesort(THD *thd, TABLE *table,
 >  		    outfile))
 >        goto err;
 >    }
 > -  if (records > param.max_rows)
 > -    records=param.max_rows;
 > -  error =0;
 > +
 > +  if (pq.is_initialized() && (options & OPTION_FOUND_ROWS))
 > +  {
 > +    DBUG_PRINT("info", ("filesort PQ and OPTION_FOUND_ROWS"));
 > +  }

JL: Why not move this DBUG_PRINT to the if (check_if_pq_applicable...)
block?

 > +
 > +  if (options & OPTION_FOUND_ROWS)
 > +    num_rows= found_rows;

JL: I thought that
* num_rows is the number of records returned by filesort, and
* found_rows is the number of records that would have been returned if
   there were no LIMIT.

Why this assignment? Are you changing the meaning of the return value
based on whether or not OPTION_FOUND_ROWS is specified?

If OPTION_FOUND_ROWS is set, then retval means total number of
records, otherwise retval means number of records returned? If so, I
think you should rather add another out-parameter for found_rows.
Also, the documentation for retval does not mention this.
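
I.e., something like this (a sketch; I have written the rest of the
parameter list from memory, so please double-check it):

  ha_rows filesort(THD *thd, TABLE *table, SORT_FIELD *sortorder,
                   uint s_length, SQL_SELECT *select, ha_rows max_rows,
                   bool sort_positions,
                   ha_rows *examined_rows, ha_rows *found_rows);

That way the return value keeps a single meaning regardless of
OPTION_FOUND_ROWS.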

 > @@ -364,25 +396,26 @@ void filesort_free_buffers(TABLE *table,
 >    table->sort.addon_field= NULL;
 >  }
 >
 > -/** Make a array of string pointers. */
 > +/** Makes an array of string pointers for info->sort_keys. */

JL: Please document parameters
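
E.g. (sketch, matching the new signature quoted below; wording up to
you):

  /**
    Makes an array of string pointers for info->sort_keys.

    @param info    FILESORT_INFO struct that owns the allocated array.
    @param fields  Number of pointers/elements in the array.
    @param length  Length of each element, in bytes.
  */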

 > -static char **make_char_array(char **old_pos, register uint fields,
 > -                              uint length, myf my_flag)
 > +static void make_char_array(FILESORT_INFO *info, uint fields, uint length)
 >  {
 > -  register char **pos;
 > -  char *char_pos;
 >    DBUG_ENTER("make_char_array");
 >
 > -  if (old_pos ||
 > -      (old_pos= (char**) my_malloc((uint) fields*(length+sizeof(char*)),
 > -				   my_flag)))
 > +  if (!info->sort_keys)
 > +    info->sort_keys=
 > +      (uchar**) my_malloc(fields * (length + sizeof(uchar*)), MYF(0));
 > +
 > +  if (info->sort_keys)
 >    {
 > -    pos=old_pos; char_pos=((char*) (pos+fields)) -length;
 > -    while (fields--) *(pos++) = (char_pos+= length);
 > +    uchar **pos= info->sort_keys;
 > +    uchar *char_pos= ((uchar*) (pos+fields)) - length;
 > +    while (fields--)
 > +      *(pos++) = (char_pos+= length);

JL: Incredible! This function does the same as before, but now it's
possible to understand what that is! Btw: space before '='

 > @@ -497,10 +537,12 @@ static void dbug_print_record(TABLE *tab
 >      HA_POS_ERROR on error.
 >  */
 >
 > -static ha_rows find_all_keys(SORTPARAM *param, SQL_SELECT *select,
 > -			     uchar **sort_keys,
 > -			     IO_CACHE *buffpek_pointers,
 > -			     IO_CACHE *tempfile, IO_CACHE *indexfile)
 > +static ha_rows find_all_keys(Sort_param *param, SQL_SELECT *select,
 > +                             uchar **sort_keys,
 > +                             IO_CACHE *buffpek_pointers,
 > +                             IO_CACHE *tempfile,
 > +                             Bounded_queue<uchar, uchar> *pq,
 > +                             ha_rows *found_rows)

JL: It's not intuitive what the FOUND_ROWS count and the number of
written records are, or how they differ. After this review process I
know what they are, but it could be made more explicit.
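
For instance (sketch, wording up to you):

  @param [out] found_rows  Store the number of FOUND_ROWS() here, i.e.
                           the number of rows that match the query
                           without the LIMIT. With a LIMIT this can be
                           larger than the number of keys written.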

 > @@ -512,11 +554,12 @@ static ha_rows find_all_keys(SORTPARAM *
 >    handler *file;
 >    MY_BITMAP *save_read_set, *save_write_set;
 >    bool skip_record;
 > +
 >    DBUG_ENTER("find_all_keys");
 >    DBUG_PRINT("info",("using: %s",
 >                       (select ? select->quick ? "ranges" : "where":
 >                        "every row")));
 > -
 > +

JL: You added two spaces on this line

 > @@ -1036,17 +1080,17 @@ static void register_used_fields(SORTPAR
 > +bool check_if_pq_applicable(Sort_param *param,
 > +                            FILESORT_INFO *filesort_info,
 > +                            TABLE *table, ha_rows num_rows,
 > +                            ulong memory_available)
 > +{
 > +  DBUG_ENTER("check_if_pq_applicable");
 > +
 > +  /*
 > +    How much Priority Queue sort is slower than qsort.
 > +    For qsort we have an average case complexity of O(n*log(n)) comparisons.
 > +    When using a priority queue, we have log(n) comparisons each time we
 > +    push a new element. For PQ we also call qsort at the end.
 > +    Measurements show that there is some extra overhead in QUEUE,
 > +    so we set it to 3.0 rather than 2.0.
 > +  */
 > +  const double PQ_slowness= 3.0;
 > +
 > +  if (param->max_rows == HA_POS_ERROR)
 > +  {
 > +    DBUG_PRINT("info", ("max_rows = HA_POS_ERROR"));

JL: Is "No LIMIT specified" a correct interpretation of this? If so,
wouldn't that text be easier to understand.
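
I.e. something like:

  DBUG_PRINT("info", ("No LIMIT specified"));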

 > +    DBUG_RETURN(NULL);
 > +  }
 > +
 > +  if (param->max_rows + 2 >= UINT_MAX)
 > +  {
 > +    DBUG_PRINT("info", ("Too large LIMIT"));
 > +    DBUG_RETURN(NULL);
 > +  }
 > +
 > +  ulong num_available_keys=
 > +    memory_available / (param->rec_length + sizeof(char*));
 > +  param->num_keys_per_buffer= (uint) param->max_rows + 1;

JL: Is this +1 due to the replace element in queue? Comment would be nice.
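
E.g. (sketch, assuming the replace-element is indeed the reason):

  /*
    One extra slot: when a new key is pushed onto a full queue, it
    temporarily holds max_rows + 1 elements before the greatest one
    is thrown away.
  */
  param->num_keys_per_buffer= (uint) param->max_rows + 1;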

 > +
 > +  if (num_rows < num_available_keys)
 > +  {
 > +    /* The whole source set fits into memory. */
 > +    if (param->max_rows < num_rows/PQ_slowness )
 > +    {
 > +      make_char_array(filesort_info,
 > +                      param->num_keys_per_buffer, param->rec_length);
 > +      if (filesort_info->sort_keys)
 > +        DBUG_RETURN(true);
 > +    }
 > +    else
 > +    {
 > +      /* PQ will be slower */
 > +      DBUG_RETURN(false);
 > +    }
 > +  }
 > +
 > +  if (param->num_keys_per_buffer < num_available_keys)
 > +  {

JL: Comment this block like you did for the "whole set fits in memory"
block above
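
E.g. (sketch; please correct me if this is not the right
interpretation):

  /*
    The whole source set does not fit in memory, but the queue
    (max_rows + 1 keys) does.
  */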

 > @@ -1249,8 +1431,7 @@ int merge_buffers(SORTPARAM *param, IO_C
 >    {
 >      buffpek->base= strpos;
 >      buffpek->max_keys= maxcount;
 > -    strpos+= (uint) (error= (int) read_to_buffer(from_file, buffpek,
 > -                                                 rec_length));
 > +    strpos+= (uint) (error= (int)read_to_buffer(from_file, buffpek, rec_length));

JL: Long line

 > @@ -1554,7 +1735,8 @@ sortlength(THD *thd, SORT_FIELD *sortord

JL: Update documentation for this function

 > === modified file 'sql/sql_select.cc'
 > --- sql/sql_select.cc	2010-11-25 08:03:16 +0000
 > +++ sql/sql_select.cc	2010-11-25 09:01:19 +0000
 > @@ -3251,12 +3255,25 @@ JOIN::exec()
 > +        curr_join->group_list ? curr_join->group_list : curr_join->order;
 > +      /*
 > +        filesort_limit:	Max number of rows that needs to be sorted.
 > +        We can use select_limit_cnt only if we have no group_by and 1 table.
 > +       */
 > +      const ha_rows filesort_limit_arg=
 > +        (has_group_by || curr_join->tables > 1)
 > +        ? curr_join->select_limit : unit->select_limit_cnt;

JL: I did some testing for the change above and tried to find cases
where unit->select_limit_cnt was chosen and also was different from
curr_join->select_limit. In the order_by tests, this only happens if
select_limit was set to HA_POS_ERROR in this if() around line 1770

   if (having || (select_options & OPTION_FOUND_ROWS))
     select_limit= HA_POS_ERROR;

I think you said that we previously had to sort without LIMIT in the
case of OPTION_FOUND_ROWS due to a poor FOUND_ROWS implementation. As
far as I understand, the OPTION_FOUND_ROWS part of the if above can
now be removed and filesort_limit_arg can be reverted to
curr_join->select_limit.

If this is not correct, I kindly ask that you provide a counterexample
and elaborate in the comment about filesort_limit above.

 > === added file 'unittest/gunit/bounded_queue-t.cc'
 > --- unittest/gunit/bounded_queue-t.cc	1970-01-01 00:00:00 +0000
 > +++ unittest/gunit/bounded_queue-t.cc	2010-11-25 09:01:19 +0000
 > @@ -0,0 +1,404 @@
 > +/*
 > +  A test of the function get_merge_many_buffs_cost_fast()
 > + */
 > +TEST(CostEstimationTest, merge_many_buff)
 > +{
 > +  ha_rows num_rows= 512;
 > +  ulong num_keys= 100;
 > +  ulong row_lenght= 100;
 > +  double prev_cost= 0.0;
 > +  while (num_rows <= MAX_FILE_SIZE/4)
 > +  {
 > +    double merge_cost=
 > +      get_merge_many_buffs_cost_fast(num_rows, num_keys, row_lenght);
 > +    EXPECT_LT(0.0, merge_cost);
 > +    EXPECT_LT(prev_cost, merge_cost);
 > +    num_rows*= 2;
 > +    prev_cost= merge_cost;
 > +  }
 > +}

JL: Can we compare to the cost we get from get_merge_many_buffs_cost() here?
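
E.g. something like this (a sketch; the 10% tolerance is arbitrary):

  double fast_cost=
    get_merge_many_buffs_cost_fast(num_rows, num_keys, row_lenght);
  // old_cost computed with get_merge_many_buffs_cost(); argument
  // list omitted here since I have not checked it.
  EXPECT_NEAR(old_cost, fast_cost, old_cost * 0.1);

Btw: "row_lenght" above looks like a typo for "row_length".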

-- 
Jørgen Løland | Senior Software Engineer | +47 73842138
Oracle MySQL
Trondheim, Norway