On 2010-12-06 09:27, Øystein Grøvlen wrote:
> Hi Tor,
>
> I will approve this worklog now. See in-line for some minor
> suggestions that I hope you will consider.
>
> I am still not sure we have fully understood the cost aspects, but I
> seems we can be confident that it does much more good than harm, and
> that the risk of performance regressions are small.
>
> --
> Øystein
>
> On 11/29/10 12:21 PM, Tor Didriksen wrote:
> > #At file:///export/home/didrik/repo/next-mr-opt-team-wl1393-merge/
> based on revid:tor.didriksen@stripped
> >
> > 3250 Tor Didriksen 2010-11-29
> > WL#1393 Optimizing filesort with small limit
> >
> > Many web customers have to do
> > "SELECT ... ORDER BY non_index_column LIMIT X",
> >
> > When X *<Row Size> is smaller than sort_buff_size we can use
> > the following algoritm to speed up the sort:
> >
> > - Create a queue to hold 'limit' keys.
> > - Scan through the table and store the first (last if DESC)
> keys in the queue
> > - Return values from queue
> > @ libmysqld/CMakeLists.txt
> > Add Bounded_queue
> > @ libmysqld/Makefile.am.THIS
> > Add Bounded_queue
>
> ???
some bzr hiccup?
>
> > @ mysql-test/include/order_by.inc
> > Add new tests.
> > @ mysql-test/include/select.inc
> > Fix typo in bug number.
> > @ mysql-test/r/order_by_none.result
> > Added new tests.
> > @ mysql-test/r/order_by_sortkey.result
> > New test.
> > @ mysql-test/r/select_none.result
> > Fix typo in bug number.
> > @ mysql-test/t/order_by_sortkey.test
> > New test.
> > @ mysys/CMakeLists.txt
> > Compile standalone queues test executable.
> > @ mysys/queues.c
> > Fix un-maintained code: s/bool/my_bool/
> > @ sql/CMakeLists.txt
> > Add Bounded_queue to gunit tests.
> > @ sql/Makefile.am.THIS
> > Add Bounded_queue
>
> ???
I'll be patching the diff into a different tree anyways, so this will go
away.
>
> ...
>
> > + /**
> > + Function for comparing two keys.
> > + @param n Pointer to number of bytes to compare.
> > + @param a First key.
> > + @param b Second key.
> > + @retval -1, 0, or 1 depending on whether the left argument is
> > + less than, equal to, or greater than the right argument.
> > + */
> > + typedef int (*compare_function)(size_t *n, Key_type **a, Key_type
> **b);
> > +
> > + /**
> > + Initialize the queue.
> > +
> > + @param max_elements The size of the queue.
> > + @param offset_to_key Offset to key in elements stored in the
> queue.
>
> Didn't we agree to drop offset_to_key?
We didn't, but we do now.
>
> > + @param max_at_top Set to 1 if you want biggest element on top.
> > + @param compare Compare function for elements, takes 3
> arguments.
> > + If NULL, we use
> get_ptr_compare(compare_length).
> > + @param compare_length Length of the data (i.e. the keys) used
> for sorting.
> > + @param keymaker Function which generates keys for elements.
> > + @param sort_param Sort parameters.
> > + @param sort_keys Array of pointers to keys to sort.
> > +
> > + @retval 0 OK, 1 Could not allocate memory.
> > +
> > + We do *not* take ownership of any of the input pointer arguments.
> > + */
> > + int init(ha_rows max_elements, uint offset_to_key, bool max_at_top,
> > + compare_function compare, size_t compare_length,
> > + keymaker_function keymaker, Sort_param *sort_param,
> > + Key_type **sort_keys);
> > +
> > + /**
> > + Pushes an element on the queue.
> > + If the queue is already full, we discard one element.
> > + Calls keymaker_function to generate a key for the element.
> > +
> > + @param element The element to be pushed.
> > + */
> > + void push(Element_type *element);
> > +
> > + /**
> > + Removes the top element from the queue.
> > +
> > + @retval Pointer to the (key of the) removed element.
> > + */
> > + Key_type **pop()
> > + {
> > + // Don't return the extra element to the client code.
> > + if (queue_is_full((&m_queue)))
> > + queue_remove(&m_queue, 0);
> > + return reinterpret_cast<Key_type**>(queue_remove(&m_queue, 0));
> > + }
>
> I think we agreed to add a comment that said that pop() had been
> implemented for testing purposes, and is not fully tested.
>
done.
> ...
>
> > + /*
> > + PQ has cost:
> > + (insert + qsort) * log(queue size) / TIME_FOR_COMPARE_ROWID +
> > + cost of file lookup afterwards.
> > + */
> > + const double pq_cost=
> > + (PQ_slowness * num_rows + param->num_keys_per_buffer) *
> > + log(param->max_rows + 1.0) / TIME_FOR_COMPARE_ROWID +
> > + param->max_rows * table->file->scan_time();
>
> As discussed on Friday, param->num_keys_per_buffer == param->max_rows
> + 1.0
> at this point. I think it would be more obvious if just one of these
> variables are used in this formula.
>
OK
> ...
>
> > +
> > +/*
> > + Some basic performance testing, to compute the overhead of
> Bounded_queue.
> > + Run the with 'bounded_queue-t --disable-tap-output' to see the
> > + millisecond output from Google Test.
> > + */
> > +const int num_rows= 10000;
> > +const int row_limit= 100;
> > +const int num_iterations= 1000;
> > +
> > +class PerfTestSmall : public ::testing::Test
> > +{
> > +public:
> > + /*
> > + The extra overhead of malloc/free should be part of the
> measurement,
> > + so we do not define the key container as a member here.
> > + */
> > + typedef Key_container<row_limit, int> Container;
> > + enum { limit= row_limit };
> > +};
> > +
> > +class PerfTestLarge : public ::testing::Test
> > +{
> > +public:
> > + /*
> > + The extra overhead of malloc/free should be part of the
> measurement,
> > + so we do not define the key container as a member here.
> > + */
> > + typedef Key_container<num_rows, int> Container;
> > + enum { limit= num_rows };
> > +};
> > +
> > +
> > +template<int limit>
> > +void insert_and_sort()
> > +{
> > + typedef Key_container<limit, int> Container;
> > + for (int it= 0; it< num_iterations; ++it)
> > + {
> > + Container *keys= new Container;
> > + srand(0);
> > + Bounded_queue<int, int> queue;
> > + EXPECT_EQ(0, queue.init(limit, 0, true, int_ptr_compare,
> > + sizeof(int),&int_keymaker, NULL,
> keys->key_ptrs));
> > + for (int ix= 0; ix< limit; ++ix)
>
> Should not this loop do num_rows iterations?
> Otherwise, the name of the constants are a bit misleading.
> The way it is now, it should rather be num_rows_large instead of
> num_rows and num_rows_small instead of row_limit.
ah yes, thanks.
>
> > + {
> > + int data= rand();
> > + queue.push(&data);
> > + }
> > + my_string_ptr_sort((uchar*)&keys->key_ptrs[0],
> > + queue.num_elements(), sizeof(int));
> > + delete keys;
> > + }
> > +}