List: Commits
From: Øystein Grøvlen
Date: November 27, 2010 11:19pm
Subject: Re: bzr commit into mysql-trunk branch (tor.didriksen:3247) WL#1393
It seems I forgot to send this to the commit list:

-------- Original Message --------
Subject: Re: bzr commit into mysql-trunk branch (tor.didriksen:3247) WL#1393
Date: Fri, 26 Nov 2010 16:19:06 +0100
From: Øystein Grøvlen <oystein.grovlen@stripped>
Organization: Oracle
To: Tor Didriksen <tor.didriksen@stripped>

Hi Tor,

Here are a few comments on the latest patch.  I have taken a closer
look at the cost function.  Apart from that, there are only a few
minor comments.


On 11/24/10 02:24 PM, Tor Didriksen wrote:
> #At file:///export/home/didrik/repo/next-mr-opt-team-wl1393-merge/
based on revid:tor.didriksen@stripped
>
>   3247 Tor Didriksen	2010-11-24
>        WL#1393 Optimizing filesort with small limit
>
>        Many web customers have to do
>        "SELECT ... ORDER BY non_index_column LIMIT X",
>
>        When X * <Row Size> is smaller than sort_buff_size we can use
>        the following algoritm to speed up the sort:
>
>        - Create a queue to hold 'limit' keys.
>        - Scan through the table and store the first (last if DESC)
keys in the queue
>        - Return values from queue
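
For reference, the bounded-queue idea described above can be sketched
in a few lines.  This is only an illustration in Python with made-up
names (top_n_sort is hypothetical), not the patch's actual C++
implementation:

```python
import heapq

def top_n_sort(rows, limit, key):
    """Keep only the 'limit' smallest rows while scanning once.

    A max-heap of size 'limit' is simulated by negating the key, so
    the heap top is the largest retained row -- i.e. the candidate
    to discard when a smaller row arrives.
    """
    heap = []                           # entries are (-key(row), row)
    for row in rows:
        k = key(row)
        if len(heap) < limit:
            heapq.heappush(heap, (-k, row))
        elif -heap[0][0] > k:           # new row beats current worst
            heapq.heapreplace(heap, (-k, row))
    # Sort the retained entries in ascending key order.
    return [row for _, row in sorted(heap, key=lambda e: -e[0])]
```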
>       @ libmysqld/CMakeLists.txt
>          Add Bounded_queue
>       @ libmysqld/Makefile.am
>          Add Bounded_queue
>       @ mysql-test/include/order_by.inc
>          Add new tests.
>       @ mysql-test/include/select.inc
>          Fix typo in bug number.
>       @ mysql-test/r/order_by_none.result
>          Added new tests.
>       @ mysql-test/r/order_by_sortkey.result
>          New test.
>       @ mysql-test/r/select_none.result
>          Fix typo in bug number.
>       @ mysql-test/t/order_by_sortkey.test
>          New test.
>       @ mysys/CMakeLists.txt
>          Compile standalone queues test executable.
>       @ mysys/queues.c
>          Fix un-maintained code: s/bool/my_bool/
>       @ sql/CMakeLists.txt
>          Add Bounded_queue to gunit tests.
>       @ sql/Makefile.am
>          Add Bounded_queue
>       @ sql/bounded_queue.cc
>          Since Bounded_queue is templatized, the entire
implementation must go in the .h file
>          The free function get_merge_many_buffs_cost_fast is here so
that we can unit test it.
>       @ sql/bounded_queue.h
>          This is a wrapper on top of QUEUE and the queue_xxx() functions.
>       @ sql/filesort.cc
>          Add support for Priority Queue sort for queries with a LIMIT.
>
>          Simplify usage of make_char_array(), let it
allocate/initialize FILESORT_INFO.sort_keys directly.
>          Rename SORTPARAM to Sort_param, and add init() function.
>          Remove dead code (obsolete #ifdef)
>          Remove dead code (indexfile was alwayss NULL in find_all_keys())

Typo: alwayss

>       @ sql/filesort.h
>          New argument to filesort()
>       @ sql/sql_const.h
>          Use double, rather than int, literals (avoids lots of casting).
>       @ sql/sql_delete.cc
>          New argument to filesort()
>       @ sql/sql_select.cc
>          Handle new filesort() using priority queue:
>          - Add some more DBUG_PRINT statements before calls to
create_sort_index()
>          - create_sort_index() can use the new PQ algorithm if
(!group && tables == 1)
>          - use information from filesort() if we have OPTION_FOUND_ROWS
>          - in end_send(): stop if we are about to read more rows than
allocated by the PQ
>          - pass OPTION_FOUND_ROWS on to filesort() for single-table
queries.
>       @ sql/sql_sort.h
>          Rename st_sort_param/SORTPARAM to Sort_param.
>          Add constructor.
>          Add init() function.
>       @ sql/sql_table.cc
>          New argument to filesort()
>       @ sql/sql_update.cc
>          New argument to filesort()
>       @ sql/uniques.cc
>          Rename SORTPARAM to Sort_param.
>       @ unittest/gunit/CMakeLists.txt
>          Add gunit test for Bounded_queue
>       @ unittest/gunit/bounded_queue-t.cc
>          Add new unit tests for Bounded_queue and merge-cost estimation.
>          Add performance tests for Bounded_queue.
>
>      added:
>        mysql-test/r/order_by_sortkey.result
>        mysql-test/t/order_by_sortkey.test
>        sql/bounded_queue.cc
>        sql/bounded_queue.h
>        unittest/gunit/bounded_queue-t.cc
>      modified:
>        libmysqld/CMakeLists.txt
>        libmysqld/Makefile.am
>        mysql-test/include/order_by.inc
>        mysql-test/include/select.inc
>        mysql-test/r/order_by_none.result
>        mysql-test/r/select_none.result
>        mysys/CMakeLists.txt
>        mysys/queues.c
>        sql/CMakeLists.txt
>        sql/Makefile.am
>        sql/filesort.cc
>        sql/filesort.h
>        sql/sql_const.h
>        sql/sql_delete.cc
>        sql/sql_select.cc
>        sql/sql_sort.h
>        sql/sql_table.cc
>        sql/sql_update.cc
>        sql/uniques.cc
>        unittest/gunit/CMakeLists.txt

...

> === modified file 'mysys/queues.c'
> --- a/mysys/queues.c	2010-07-08 21:20:08 +0000
> +++ b/mysys/queues.c	2010-11-24 13:24:14 +0000
> @@ -381,11 +381,11 @@ static uint tot_no_parts= 0;
>   static uint tot_no_loops= 0;
>   static uint expected_part= 0;
>   static uint expected_num= 0;
> -static bool max_ind= 0;
> -static bool fix_used= 0;
> +static my_bool max_ind= 0;
> +static my_bool fix_used= 0;
>   static ulonglong start_time= 0;
>
> -static bool is_divisible_by(uint num, uint divisor)
> +static my_bool is_divisible_by(uint num, uint divisor)
>   {
>     uint quotient= num / divisor;
>     if (quotient * divisor == num)
> @@ -479,6 +479,7 @@ static int test_compare(void *null_arg,
>     uint a_num= (*(uint*)a) & 0x3FFFFF;
>     uint b_num= (*(uint*)b) & 0x3FFFFF;
>     uint a_part, b_part;
> +  (void) null_arg;

What is the purpose of this?

>     if (a_num > b_num)
>       return +1;
>     if (a_num < b_num)

...

> === added file 'sql/bounded_queue.cc'
> --- a/sql/bounded_queue.cc	1970-01-01 00:00:00 +0000
> +++ b/sql/bounded_queue.cc	2010-11-24 13:24:14 +0000
> @@ -0,0 +1,86 @@
> +/* Copyright (c) 2010, Oracle and/or its affiliates. All rights
reserved.
> +
> +   This program is free software; you can redistribute it and/or modify
> +   it under the terms of the GNU General Public License as published by
> +   the Free Software Foundation; version 2 of the License.
> +
> +   This program is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +   GNU General Public License for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with this program; if not, write to the Free Software
> +   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
02111-1307  USA */
> +
> +#include "bounded_queue.h"
> +#include "sql_const.h"
> +#include "sql_sort.h"
> +
> +
> +namespace {
> +/**
> +   A local helper function. See comments for get_merge_buffers_cost().
> + */
> +double get_merge_cost(ha_rows num_elements, ha_rows num_buffers,
uint elem_size)
> +{
> +  return
> +    2.0 * ((double) num_elements * elem_size) / IO_SIZE
> +    + (double) num_elements * log((double) num_buffers) /
> +      (TIME_FOR_COMPARE_ROWID * M_LN2);
> +}
> +}

Why not let this function be shared with uniques.cc?  (E.g., put its
declaration in filesort.h.)

It seems a bit unnecessary to include the I/O cost in this formula.
We know that all elements need to be read and written once for every
iteration of the outer while loop, so why not just compute the total
I/O in one go?  (And since the number of iterations of the outer loop
is about log_MERGEBUFF(num_rows), probably all I/O could be computed
at once.)
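
To illustrate the suggestion, here is a rough sketch of computing all
I/O in one go.  MERGEBUFF = 7 and IO_SIZE = 4096 are assumed values,
and total_merge_io is a hypothetical helper, not code from the patch:

```python
import math

MERGEBUFF = 7      # assumed merge fan-in
IO_SIZE = 4096     # assumed I/O block size

def total_merge_io(num_rows, num_buffers, elem_size):
    """Total read+write volume over all merge passes, computed in one
    go: every element is read and written once per pass, and there
    are about log_MERGEBUFF(num_buffers) passes."""
    if num_buffers <= 1:
        return 0.0
    num_passes = math.ceil(math.log(num_buffers, MERGEBUFF))
    return num_passes * 2.0 * num_rows * elem_size / IO_SIZE
```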

> +
> +/**
> +   This is a simplified, and faster version of @see
get_merge_many_buffs_cost().
> +   We calculate the cost of merging buffers, by simulating the actions
> +   of @see merge_many_buff. For explanations of formulas below,
> +   see comments for get_merge_buffers_cost().
> +   TODO: Use this function for Unique::get_use_cost().
> +*/

Thanks for adding some explanation of the background for the formulas.
I think it would aid understanding if it said somewhere what
merge_many_buff actually does.

> +double get_merge_many_buffs_cost_fast(ha_rows num_rows,
> +                                      ha_rows num_keys_per_buffer,
> +                                      uint    elem_size)
> +{
> +  ha_rows num_buffers= num_rows / num_keys_per_buffer;
> +  ha_rows last_n_elems= num_rows % num_keys_per_buffer;
> +  double total_cost;
> +
> +  /* Calculate CPU cost of sorting buffers. */
> +  total_cost=
> +    ( num_buffers * num_keys_per_buffer * log(1.0 +
num_keys_per_buffer) +
> +      last_n_elems * log(1.0 + last_n_elems) )
> +    / TIME_FOR_COMPARE_ROWID;
> +
> +  /* Simulate behavior of merge_many_buff(). */
> +  while (num_buffers >= MERGEBUFF2)
> +  {

Are you sure it should not be num_buffers+1 here?  I guess the last
buffer, which is not full, also needs to be taken into account.

> +    /* Calculate # of calls to merge_buffers() */
> +    const ha_rows loop_limit= num_buffers - MERGEBUFF*3/2;
> +    const ha_rows num_merge_calls= 1 + loop_limit/MERGEBUFF;

If you insert the formula for loop_limit into the expression for
num_merge_calls, it seems this can be simplified to:

num_merge_calls= (num_buffers - MERGEBUFF/2)/MERGEBUFF;
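
(The two expressions are identical under integer division, since
adding MERGEBUFF to the numerator increases the quotient by exactly
one.  A quick sanity check in Python, assuming MERGEBUFF = 7; floor
division matches C here because all quantities are non-negative
inside the loop:)

```python
MERGEBUFF = 7   # assumed merge fan-in; MERGEBUFF2 = 15

def num_merge_calls_orig(num_buffers):
    loop_limit = num_buffers - MERGEBUFF * 3 // 2       # num_buffers - 10
    return 1 + loop_limit // MERGEBUFF

def num_merge_calls_simplified(num_buffers):
    return (num_buffers - MERGEBUFF // 2) // MERGEBUFF  # (n - 3) // 7

# The loop only runs while num_buffers >= MERGEBUFF2 (15).
assert all(num_merge_calls_orig(n) == num_merge_calls_simplified(n)
           for n in range(15, 10_000))
```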


> +    const ha_rows i_end_value= num_merge_calls * MERGEBUFF;

I would rather suggest using the following constant instead of
i_end_value:

const ha_rows num_remaining_buffs= num_buffers - num_merge_calls *
MERGEBUFF;

I think that better explains its role in the merge algorithm.

> +
> +    /* Cost of merge sort 'num_merge_calls' */
> +    total_cost+=
> +      num_merge_calls *
> +      get_merge_cost(num_keys_per_buffer * MERGEBUFF, MERGEBUFF,
elem_size);
> +
> +    /* # of records in remaining buffers. */
> +    last_n_elems+= (num_buffers - i_end_value) * num_keys_per_buffer;
> +
> +    /* Cost of merge sort of remaining buffers. */
> +    total_cost+= get_merge_cost(last_n_elems,
> +                                num_buffers - i_end_value, elem_size);

At least here, the number of buffers should be increased by one.
There are (num_buffers - i_end_value) full buffers, according to how
last_n_elems is computed, and then there is the additional non-full
buffer.

> +
> +    num_buffers= num_merge_calls;
> +    num_keys_per_buffer*= MERGEBUFF;

Which means that the number of iterations is about
log_MERGEBUFF(num_rows).  Maybe that could be used to simplify this
computation even further.
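
For illustration, simulating only the buffer-count reduction shows how
the iteration count grows (a sketch; MERGEBUFF = 7 and MERGEBUFF2 = 15
are assumed, and merge_pass_count is a hypothetical helper):

```python
MERGEBUFF, MERGEBUFF2 = 7, 15   # assumed MySQL constants

def merge_pass_count(num_buffers):
    """Count iterations of the outer merge loop: each pass shrinks
    the buffer count by roughly a factor of MERGEBUFF, so the count
    grows like log_MERGEBUFF(num_buffers)."""
    passes = 0
    while num_buffers >= MERGEBUFF2:
        num_buffers = (num_buffers - MERGEBUFF // 2) // MERGEBUFF
        passes += 1
    return passes
```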

> +  }
> +
> +  /* Simulate final merge_buff call. */
> +  last_n_elems+= num_keys_per_buffer * num_buffers;
> +  total_cost+= get_merge_cost(last_n_elems, num_buffers, elem_size);

I guess it should be num_buffers+1 here. (One for each merge call plus
one for the non-full merge.)

> +
> +  return total_cost;
> +}
> +
> +
>
> === added file 'sql/bounded_queue.h'

...

> +  /**
> +     Function for comparing two keys.
> +     @param  n Pointer to number of bytes to compare.
> +     @param  a First key.
> +     @param  b Second key.
> +     @retval -1, 0, or 1 depending on whether the left argument is
> +             less than, equal to, or greater than the right argument.
> +   */
> +  typedef int (*compare_function)(size_t *v, Key_type **a, Key_type
**b);
> +
> +  /**
> +    Initialize the queue.
> +
> +    @param max_elements   The size of the queue.
> +    @param offset_to_key  Offset to key in elements stored in the queue.
> +    @param max_at_top     Set to 1 if you want biggest element on top.

Set to true, maybe?
To check whether I have understood this correctly: if max_at_top=true,
then queue_top(&m_queue) will refer to the smallest element.  That
is, the concept of top is opposite for Bounded_queue and the
underlying QUEUE.  (Nothing wrong with that, I just want it to be
clear.)

> +    @param compare        Compare function for elements, takes 3
arguments.
> +                          If NULL, we use
get_ptr_compare(compare_length).
> +    @param compare_length Length of the data (i.e. the keys) used
for sorting.
> +    @param keymaker       Function which generates keys for elements.
> +    @param sort_param     Sort parameters.
> +    @param sort_keys      Array of pointers to keys to sort.
> +
> +    @retval 0 OK, 1 Could not allocate memory.
> +
> +    We do *not* take ownership of any of the input pointer arguments.
> +   */
> +  int init(ha_rows max_elements, uint offset_to_key, bool max_at_top,
> +           compare_function compare, size_t compare_length,
> +           keymaker_function keymaker, Sort_param *sort_param,
> +           Key_type **sort_keys);
> +
> +  /**
> +    Pushes an element on the queue.
> +    If the queue is already full, we discard one element.

"... discard the bottom element"?

> +    Calls keymaker_function to generate a key for the element.
> +
> +    @param element        The element to be pushed.
> +   */
> +  void push(Element_type *element);
> +

...

> +/*
> +  Verifies that push and bounded size works, and that pop() gives
sorted order.
> + */
> +TEST_F(BoundedQueueTest, push_and_pop)
> +{
> +  EXPECT_EQ(0, m_queue.init(num_elements/2, 0, true, test_key_compare,
> +                            m_key_size,
> +                            &test_keymaker, NULL, m_keys.key_ptrs));
> +  for (int ix= 0; ix<  array_size(m_test_data); ++ix)
> +  {
> +    m_queue.push(&m_test_data[ix]);
> +  }
> +  int expected_key_val= m_queue.num_elements() - 2;

I think the -2 needs an explanation.

...