From: Tor Didriksen Date: December 6 2010 9:47am Subject: Re: bzr commit into mysql-trunk-bugfixing branch (tor.didriksen:3250) WL#1393 List-Archive: http://lists.mysql.com/commits/126112 Message-Id: <4CFCB123.6070804@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit On 2010-12-06 09:27, Øystein Grøvlen wrote: > Hi Tor, > > I will approve this worklog now. See in-line for some minor > suggestions that I hope you will consider. > > I am still not sure we have fully understood the cost aspects, but I > seems we can be confident that it does much more good than harm, and > that the risk of performance regressions are small. > > -- > Øystein > > On 11/29/10 12:21 PM, Tor Didriksen wrote: > > #At file:///export/home/didrik/repo/next-mr-opt-team-wl1393-merge/ > based on revid:tor.didriksen@stripped > > > > 3250 Tor Didriksen 2010-11-29 > > WL#1393 Optimizing filesort with small limit > > > > Many web customers have to do > > "SELECT ... ORDER BY non_index_column LIMIT X", > > > > When X * is smaller than sort_buff_size we can use > > the following algoritm to speed up the sort: > > > > - Create a queue to hold 'limit' keys. > > - Scan through the table and store the first (last if DESC) > keys in the queue > > - Return values from queue > > @ libmysqld/CMakeLists.txt > > Add Bounded_queue > > @ libmysqld/Makefile.am.THIS > > Add Bounded_queue > > ??? some bzr hiccup? > > > @ mysql-test/include/order_by.inc > > Add new tests. > > @ mysql-test/include/select.inc > > Fix typo in bug number. > > @ mysql-test/r/order_by_none.result > > Added new tests. > > @ mysql-test/r/order_by_sortkey.result > > New test. > > @ mysql-test/r/select_none.result > > Fix typo in bug number. > > @ mysql-test/t/order_by_sortkey.test > > New test. > > @ mysys/CMakeLists.txt > > Compile standalone queues test executable. > > @ mysys/queues.c > > Fix un-maintained code: s/bool/my_bool/ > > @ sql/CMakeLists.txt > > Add Bounded_queue to gunit tests. > > @ sql/Makefile.am.THIS > > Add Bounded_queue > > ??? I'll be patching the diff into a different tree anyways, so this will go away. > > ... > > > + /** > > + Function for comparing two keys. > > + @param n Pointer to number of bytes to compare. > > + @param a First key. > > + @param b Second key. > > + @retval -1, 0, or 1 depending on whether the left argument is > > + less than, equal to, or greater than the right argument. > > + */ > > + typedef int (*compare_function)(size_t *n, Key_type **a, Key_type > **b); > > + > > + /** > > + Initialize the queue. > > + > > + @param max_elements The size of the queue. > > + @param offset_to_key Offset to key in elements stored in the > queue. > > Didn't we agree to drop offset_to_key? We didn't, but we do now. > > > + @param max_at_top Set to 1 if you want biggest element on top. > > + @param compare Compare function for elements, takes 3 > arguments. > > + If NULL, we use > get_ptr_compare(compare_length). > > + @param compare_length Length of the data (i.e. the keys) used > for sorting. > > + @param keymaker Function which generates keys for elements. > > + @param sort_param Sort parameters. > > + @param sort_keys Array of pointers to keys to sort. > > + > > + @retval 0 OK, 1 Could not allocate memory. > > + > > + We do *not* take ownership of any of the input pointer arguments. > > + */ > > + int init(ha_rows max_elements, uint offset_to_key, bool max_at_top, > > + compare_function compare, size_t compare_length, > > + keymaker_function keymaker, Sort_param *sort_param, > > + Key_type **sort_keys); > > + > > + /** > > + Pushes an element on the queue. > > + If the queue is already full, we discard one element. > > + Calls keymaker_function to generate a key for the element. > > + > > + @param element The element to be pushed. > > + */ > > + void push(Element_type *element); > > + > > + /** > > + Removes the top element from the queue. > > + > > + @retval Pointer to the (key of the) removed element. > > + */ > > + Key_type **pop() > > + { > > + // Don't return the extra element to the client code. > > + if (queue_is_full((&m_queue))) > > + queue_remove(&m_queue, 0); > > + return reinterpret_cast(queue_remove(&m_queue, 0)); > > + } > > I think we agreed to add a comment that said that pop() had been > implemented for testing purposes, and is not fully tested. > done. > ... > > > + /* > > + PQ has cost: > > + (insert + qsort) * log(queue size) / TIME_FOR_COMPARE_ROWID + > > + cost of file lookup afterwards. > > + */ > > + const double pq_cost= > > + (PQ_slowness * num_rows + param->num_keys_per_buffer) * > > + log(param->max_rows + 1.0) / TIME_FOR_COMPARE_ROWID + > > + param->max_rows * table->file->scan_time(); > > As discussed on Friday, param->num_keys_per_buffer == param->max_rows > + 1.0 > at this point. I think it would be more obvious if just one of these > variables are used in this formula. > OK > ... > > > + > > +/* > > + Some basic performance testing, to compute the overhead of > Bounded_queue. > > + Run the with 'bounded_queue-t --disable-tap-output' to see the > > + millisecond output from Google Test. > > + */ > > +const int num_rows= 10000; > > +const int row_limit= 100; > > +const int num_iterations= 1000; > > + > > +class PerfTestSmall : public ::testing::Test > > +{ > > +public: > > + /* > > + The extra overhead of malloc/free should be part of the > measurement, > > + so we do not define the key container as a member here. > > + */ > > + typedef Key_container Container; > > + enum { limit= row_limit }; > > +}; > > + > > +class PerfTestLarge : public ::testing::Test > > +{ > > +public: > > + /* > > + The extra overhead of malloc/free should be part of the > measurement, > > + so we do not define the key container as a member here. > > + */ > > + typedef Key_container Container; > > + enum { limit= num_rows }; > > +}; > > + > > + > > +template > > +void insert_and_sort() > > +{ > > + typedef Key_container Container; > > + for (int it= 0; it< num_iterations; ++it) > > + { > > + Container *keys= new Container; > > + srand(0); > > + Bounded_queue queue; > > + EXPECT_EQ(0, queue.init(limit, 0, true, int_ptr_compare, > > + sizeof(int),&int_keymaker, NULL, > keys->key_ptrs)); > > + for (int ix= 0; ix< limit; ++ix) > > Should not this loop do num_rows iterations? > Otherwise, the name of the constants are a bit misleading. > The way it is now, it should rather be num_rows_large instead of > num_rows and num_rows_small instead of row_limit. ah yes, thanks. > > > + { > > + int data= rand(); > > + queue.push(&data); > > + } > > + my_string_ptr_sort((uchar*)&keys->key_ptrs[0], > > + queue.num_elements(), sizeof(int)); > > + delete keys; > > + } > > +}