From: Tor Didriksen Date: December 2 2010 11:23am Subject: Re: bzr commit into mysql-next-mr-bugfixing branch (tor.didriksen:3236) WL#1393 List-Archive: http://lists.mysql.com/commits/125800 Message-Id: <4CF781A6.1080902@oracle.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit On 2010-11-26 16:14, Øystein Grøvlen wrote: > Hi, > > Thanks for addressing many of my comments. This mail contains my > responses to your reply. A separate mail will be sent with comments > to the lasted committed patch. (For some reason the indentation of the > original code is broken in this email, sorry about that.) > > -- > Øystein Hi Øysten. sorry for taking so long with the reply, this email got lost in my inbox. -- didrik > > > On 11/24/10 04:36 PM, Tor Didriksen wrote: > > On 2010-11-19 12:59, Øystein Grøvlen wrote: > >> Hi Tor, > >> > >> Thanks for the patch. > >> I am sorry that I have used a bit long time on this review. > >> > >> Most of my comments are in-line. In addition, I have some question on > >> testing: > >> > >> - Will the existence of indexes matter for this feature? Should > >> that be tested in some way? > > > > The optimizer tries to use indexes in order to avoid filesort. > > I know. My question was more whether it does matter whether the input > rows to filesort comes from an index scan based on other columns than > the columns being sorted on. I guess probably not. I have no idea why the optimizer decides to use a temporary table... I just optimize away the need for sort-merge if there is a LIMIT. > > > Yes, this filename is a bit misleading now that Bounded_queue has been > > templatized and moved to the .h file. > > So, should I create a new pair of .h and .cc files, just for the cost > > function? > > Why cannot the cost function go into filesort.h? That file seems to > import less than bounded_queue.h, so I would imagine that it should > still be possible to unit test it. As for a separate.cc file, you > already have one; it just has a misleading name. ;-) > non-template functions should not go in header files (except for very small inline functions). I could tag it 'static' but then we would get one instance in each compilation unit that does #include "filesort.h" > >> > +################ > >> > +## Test with subqueries > >> > +SELECT d1.* FROM t1 > >> > +LEFT JOIN (SELECT * FROM t1 LIMIT 30) d1 on t1.f1=d1.f1 > >> > +ORDER BY d1.f2 DESC LIMIT 30; > >> > >> I do not think the result of the above query is well defined since the > >> contents of the derived table, d1, is not explicitly ordered. > > > > ??? > > My point is that since there is no order by in query "SELECT * FROM t1 > LIMIT 30", any 30 rows may be in the derived table to be left joined > with t1. Hence, in general, the result is correct as long as it > contains 30 sorted rows that is either from t1 or all NULLs. On the > other hand, given a sensible implementation for the derived table, > this will probably never change. I think I re-wrote that query. Does it look better now? > > > > >> > +INSERT INTO t1(f1,f2) SELECT * FROM tmp; > >> > +INSERT INTO tmp SELECT f1,f2 FROM t1; > >> > +INSERT INTO t1(f1,f2) SELECT * FROM tmp; > >> > +INSERT INTO tmp SELECT f1,f2 FROM t1; > >> > +INSERT INTO t1(f1,f2) SELECT * FROM tmp; > >> > +INSERT INTO tmp SELECT f1,f2 FROM t1; > >> > +INSERT INTO t1(f1,f2) SELECT * FROM tmp; > >> > +INSERT INTO tmp SELECT f1,f2 FROM t1; > >> > +INSERT INTO t1(f1,f2) SELECT * FROM tmp; > >> > +INSERT INTO tmp SELECT f1,f2 FROM t1; > >> > +INSERT INTO t1(f1,f2) SELECT * FROM tmp; > >> > +INSERT INTO tmp SELECT f1,f2 FROM t1; > >> > +INSERT INTO t1(f1,f2) SELECT * FROM tmp; > >> > + > >> > +FLUSH STATUS; > >> > +SHOW SESSION STATUS LIKE 'Sort%'; > >> > + > >> > +SELECT * FROM t1 ORDER BY f2 LIMIT 100; > >> > >> I notice that if I run this query without your new feature, I get a > >> different ordering of rows, but maybe this is unavoidable. In order > >> to test what you are actually testing, you will need to select columns > >> that are not ordered? > > > > ??? > > > > I am just saying that this query will now have a different ordering of > result rows than before. This is not a big issue, I guess. And if > the test is to have addon data, i guess one cannot define a total > ordering of rows. I actually get the same result, but the test fails with -Sort_merge_passes 0 +Sort_merge_passes 379 If the result is actually different without PQ, that would be good? If something changes, so PQ is disabled for the query I mean. What's important here is that the output is deterministic, with PQ, across platforms. > > > > > > >> > >> > + > >> > + /* Simulate behavior of merge_many_buff(). */ > >> > + while (num_buffers>= MERGEBUFF2) > >> > + { > >> > + /* Calculate # of calls to merge_buffers() */ > >> > + ha_rows loop_limit= num_buffers - MERGEBUFF*3/2; > >> > >> loop_limit is a bit too general to me. Would it be possible to come > >> up with a name that better describes what it represents? > > > > Compare with the function we are simulating: merge_many_buff() > > for (i=0 ; i <= *maxbuffer-MERGEBUFF*3/2 ; i+=MERGEBUFF) > > ^^^^^^^^^^^^^^^^^^^^^^^^ > > In my comments to the last commit, my suggestion is to eliminate > loop_limit as a separate constant. My preference would be to keep it. > > > > >> > >> > + ha_rows num_merge_calls= 1 + loop_limit/MERGEBUFF; > >> > + ha_rows i_end_value= num_merge_calls * MERGEBUFF; > >> > + > >> > + /* Cost of merge sort 'num_merge_calls' */ > >> > + total_cost+= num_merge_calls * > >> > + ( 2.0 * (max_n_elems * MERGEBUFF_double * elem_size) / IO_SIZE > >> > + + max_n_elems * MERGEBUFF_double * log(MERGEBUFF_double) / > >> > + (TIME_FOR_COMPARE_ROWID * M_LN2)); > >> > + > >> > + /* # of records in last buffer */ > >> > + last_n_elems+= (num_buffers - i_end_value) * max_n_elems; > >> > + > >> > + /* cost of merge sort of last buffer */ > >> > + total_cost+= 2.0 * ((double)last_n_elems * elem_size) / IO_SIZE > >> > + + double(last_n_elems) * log(MERGEBUFF_double) / > >> > + (TIME_FOR_COMPARE_ROWID * M_LN2); > >> > + > >> > + num_buffers= num_merge_calls; > >> > + max_n_elems*= MERGEBUFF; > >> > + } > >> > + > >> > >> It seems to me that it should be possible to avoid this loop, but it > >> will probably require some complex expressions involving logarithms to > >> get the exact same estimate. But how precise do we really need to > >> be here? In the end, we are going to compare this to a number that > >> are based on pretty coarse heuristics (PQ_slowness). > >> > >> Will it not be possible to come up with a good-enough estimate that > >> involves a bit less computation? > > > > I guess it is possible. > > The number of iterations through this loop is < 20, > > so it doesn't cost much to keep the "simulate the real sort" approach. > > > > AFAICT, the number of iterations will be about > log_MERGEBUFF(num_rows). Yes, that is roughly correct, sometimes off by one. Does this fact help us? > > > > >> > >> > + /* Simulate final merge_buff call. */ > >> > + last_n_elems+= max_n_elems*num_buffers; > >> > + total_cost+= 2.0 * ((double)last_n_elems * elem_size) / IO_SIZE > >> > + + double(last_n_elems) * log(num_buffers + 1.0) / > >> > + (TIME_FOR_COMPARE_ROWID * M_LN2); > >> > >> The same formula seems to be used several times here. I think it > >> should be factored into a separate function. > > > > OK > > > > > >> > >> > +/** > >> > + A priority queue with a fixed, limited size. > >> > + > >> > + This is a wrapper on top of QUEUE and the queue_xxx() functions. > >> > + It keeps the top N elements which are inserted. > >> > >> I think it should be explained a bit what is meant by top N here. > >> I also think it should be described what Element_type and Key_type > >> represents, and how they relate. > > > > OK, tried to improve documentation. > > Good improvements. As I said somewhere else, I think the trick about > using one extra element and always replacing the lowest should also be > explained. Added a blurb at the class definition. > > >> > + > >> > + @retval Pointer to the removed element. > >> > >> But what is returned is Key_type, not Element_Type. > > > > Tried to clarify. > > > > I still do not get this. One pushes an element, but only gets back > a reference the key. How will one be able to retrieve the actual > element? You can copy the element into the key, or keep a pointer to it (see unit tests). You can also let Key_type == Element_type, and use the offset_to_key I guess. This is entirely up to the keymaker_function. > >> What is the unit for this cost value? > > > > furlongs per fortnight > > Hehe. It think the intention is to be proportional to the number of > disk accesses. > OK > > > > > >> > >> > + > >> > + @note > >> > + Declared here in order to be able to unit test it, > >> > + since library dependencies have not been sorted out yet. > >> > +*/ > >> > +double get_merge_many_buffs_cost_fast(ha_rows num_buffers, ha_rows > >> max_n_elems, > >> > + ha_rows last_n_elems, uint elem_size); > >> > + > >> > +#endif // BOUNDED_QUEUE_INCLUDED > >> > > >> > === modified file 'sql/filesort.cc' > >> > --- a/sql/filesort.cc 2010-08-04 10:34:01 +0000 > >> > +++ b/sql/filesort.cc 2010-10-21 14:12:43 +0000 > >> > @@ -32,11 +32,27 @@ > >> > #include "probes_mysql.h" > >> > #include "sql_test.h" // TEST_filesort > >> > #include "opt_range.h" // SQL_SELECT > >> > +#include "bounded_queue.h" > >> > +#include "sql_select.h" > >> > > >> > #ifndef THREAD > >> > #define SKIP_DBUG_IN_FILESORT > >> > #endif > >> > > >> > +/* > >> > + How much Priority Queue sort is slower than qsort. > >> > + For qsort we have an average case complexity of O(n*log(n)) > >> comparisons. > >> > + When using a priority queue, we have log(n) comparisons each > time we > >> > + push a new element. For PQ we also call qsort at the end. > >> > >> But this qsort at the end is for a smaller number of elements so I do > >> not follow the logic here. > > > > That depends on the number of records in the table, and the LIMIT. > > If num_records == limit, I would expect PQ to be roughly twice as > > expensive. > > First n*log(n) comparisons when inserting, then n*log(n) comparisons > > for qsort. > > > > Seems like my comment is only confusing. > > Should I simply say 3.0 "based on measurements"? > > Yes, I might as well do that. OK > >> > >> What is the point in having a generic options parameter when there is > >> only one option. Why not just use a bool for this? > > > > Much easier to read at the calling site if it is a bitmask or enum, > > if it is a bool, needs a comment. > > > > Seems like an argument for never using bool parameters ... Point is obsolete. > > > > > > >> > >> > + } > >> > } > >> > - sort_keys= table_sort.sort_keys; > >> > - if (memavl< min_sort_memory) > >> > + else > >> > + DBUG_PRINT("info", ("filesort PQ is not applicable")); > >> > >> This will not catch the fall-back to merge-sort case. Maybe it should > >> be moved to the if-block below. > > > > "filesort PQ is not applicable" > > is not the same as > > "filesort is appliccable, but we failed to allocate enough memory" > > I agree. Will there be printed some debug information in the latter > case? good point. > > > > > > >> > >> > + > >> > + if (!table_sort.sort_keys) > >> > >> This could have been made into an else-statement if there was no > >> fall-back. Would make things simpler. > > > > Would you prefer "sort aborted" if we fail to allocate memory for the > > QUEUE? > > I am not sure. I just wonder if we fail to allocate the necessary > memory here, how likely is it that we will get the necessary memory > needed for merge-sort? Is it worth the effort to try to recover? Yes, I think so. MIN_SORT_MEMORY == 32*1024 It is better to fallback to sort-merge, than to abort the query (PQ cannot be disabled by the user) > > >> > >> I think things would be a bit more readable if the above line was done > >> first in a separate statement. > > > > you mean a local variable to store the result of this expression ?: > > pq.is_initialized() ? &pq : NULL, > > > > Yes. Not very important, but I found the statement a bit difficult to > read, the way it is now. > I think this issue is resolved already. > > > > >> > >> > + > >> > + if (options& OPTION_FOUND_ROWS) > >> > + num_rows= found_rows; > >> > + else if (num_rows> param.max_rows) > >> > + num_rows= param.max_rows; > >> > >> I think a comment explaining the significance of the above would be > >> good. > > > > hmm, good question. > > I'll get back to you on that one. > > > > OK This has been cleaned up now. > > > >> > > >> > > >> > +/** > >> > + Test whether priority queue is worth using to get top elements > of an > >> > + ordered result set. If it is then allocates buffer for required > >> amount of > >> > >> I suggest: "If it is, ..." > > > > of course > > Not updated in latest commit. will try again > > > > > > >> > >> It is not clear to me, why it is a good idea to do the allocation of > >> the buffer within this function. It seems it leads to unnecessary > >> duplication of calls to make_char_array(). > >> > >> > > > > It isn't necessarily allocated here, on second execution it is only > > re-initialized, introduced here: > > Fixed bug #28811: crash for a query containing a subquery with > > ORDER BY and LIMIT 1. > > I am not sure that is an argument for where the call should be made. we cannot do it up-front, since check_if_pq_applicable knows about the space required. we cannot do it afterwards, since we may have removed the addon stuff. anyways: I think the simplification of the make_char_array interface has improved things a bit. > > > > > We also take extra actions when "stripping off addon fields", > > which depend on the successful allocation. > > That could be done in a separate function, but I guess I am not > insisting. Another alternative is to do the make_char_array() once at > the end of the function. > > > > > If we fail to allocate for PQ, we fall back to the loop in filesort: > > while (memavl >= min_sort_memory) > > { > > try to allocate > > if failed to allocate, decrease memavl. > > } > > > > AH, that made me understand the point about the fall-back strategy. > Merge-sort will try with much less memory. > > However, the first fallback strategy could also be to try again with > PQ without addon-data. I think that is more according to the > documentation which now says: > If we don't have space for records, but we *do* have > space for keys, we may rewrite 'table' to sort with > references to records instead of additional data. > > hmm, this sounds complicated to me. how to test it? > >> > >> Can this happen? > > > > > > Yes, if there is no LIMIT clause, param->max_rows == HA_POS_ERROR. > > > > So this method is called also when there is no limit clause? yes > > >> > >> > + } > >> > + > >> > + if (param->max_rows + 2>= UINT_MAX) > >> > + { > >> > + DBUG_PRINT("info", ("Too large LIMIT")); > >> > >> Please, use a more specific text here. > > > > Why? this is within check_if_pq_applicable, so the trace will be > > > > T@4 : | | | | | | | | | >check_if_pq_applicable > > T@4 : | | | | | | | | | | info: Too large LIMIT > > T@4 : | | | | | | | | | > > > Makes kind of sense. However, it is possible to set up the debug > system in order to get info, but not trace. > > > > >> > >> > + DBUG_RETURN(NULL); > >> > + } > >> > + > >> > + uchar **sort_keys; > >> > >> In this context, it will not > >> > >> Seems to me that the code below would be simpler if if you initialized > >> this to NULL, and just had a single return at the end. > >> > > > > this is now rewritten, please have another look. > > Looks OK. > > > > > > >> > + ulong num_available_keys= memavl/(param->rec_length + > sizeof(char*)); > >> > + param->num_keys_per_buffer= (uint) param->max_rows + 1; > >> > + if (num_rows< num_available_keys) > >> > + { > >> > + /* The whole source set fits into memory. */ > >> > + if (param->max_rows< num_rows/PQ_slowness ) > >> > >> Some comment on PQ_slowness would be good here. > > > > There is already six lines describing PQ_slowness. > > There's > > else > > { > > /* PQ will be slower */ > > DBUG_RETURN(false); > > } > > > > I don't know what more to say... > > I am still not convinced that the two usages of PQ_slowness actually > represents the same constant. What make PQ_slowness applicable in > both contexts? AFAICT, the value of PQ_slowness is based on scenarios > where only one of the formulas apply. please elaborate. > > > >> > + param->sort_length + param->ref_length + sizeof(char*); > >> > >> I would like a comment explaining what these 3 parts of the row > >> represents. > > > > See Sort_param and make_char_array (which also adds sizeof(pointer)). > > > > What will this extra pointer refer to? seems to be pointers to records within the array. > > >> > >> num keys or max keys? Name and description does not match. > > Did you intentionally ignore this comment? yes, and no. renamed. > > >> > >> It seems a bit inconsistent to introduce an init-function for > >> Sort_param and only use it in 1 of 2 places where sort_param is used. > >> > > > > It was actually introduced to simplify explain_filesort(), > > which was rejected by architect review. > > Is it possible to use init here, too? > not really appliccable, Unique seems to use Sort_param in a totally different way. I guess it should be called init_for_filesort() > >> I do not think this "not interesting" aspect of Bounded_queue is > >> documented. Seems a bit awkward. > > > > you are right, that should be handled by Bounded_queue itself, rather > > than the client code. > > > > Thanks, for fixing that. However, I think that change requires a test > that verifies that interleaved pushing and popping works correctly. > pop() isn't really necessary for this worklog, it's mainly to understand the underlying implementation. the interleaved scenario is not really relevant now. do you still instist on another unit test? > > > >> > >> > + */ > >> > + (void) m_queue.pop(); > >> > + int expected_key_val= m_queue.num_elements() - 1; > >> > + while (m_queue.num_elements()> 0) > >> > + { > >> > + Test_key **top= m_queue.pop(); > >> > + int key_val= (*top)->key; > >> > + EXPECT_EQ(expected_key_val, key_val); > >> > + Test_element *element= (*top)->element; > >> > + EXPECT_EQ(expected_key_val, element->val); > >> > + --expected_key_val; > >> > + } > >> > +} > >> > + > >> > + > >> > +/* > >> > + Verifies that push, with bounded size, followed by sort() works. > >> > + */ > >> > +TEST_F(BoundedQueueTest, insert_and_sort) > >> > +{ > >> > + EXPECT_EQ(0, m_queue.init(num_elements/2, 0, TRUE, > test_key_compare, > >> > + m_key_size, > >> > +&test_keymaker, NULL, m_keys.key_ptrs)); > >> > + for (int ix= 0; ix< array_size(m_test_data); ++ix) > >> > + { > >> > + m_queue.push(&m_test_data[ix]); > >> > + } > >> > + > >> > + uchar *base= (uchar*)&m_keys.key_ptrs[0]; > >> > + uint items= m_queue.num_elements(); > >> > + size_t size= sizeof(Test_key); > >> > + // We sort our keys as strings, so erase all the element pointers > >> first. > >> > + for (int ii= 0; ii< array_size(m_keys.key_data); ++ii) > >> > + m_keys.key_data[ii].element= NULL; > >> > + > >> > + my_string_ptr_sort(base, items, size); > >> > + for (int ii= 0; ii< num_elements/2; ++ii) > >> > + { > >> > + Test_key *sorted_key= m_keys.key_ptrs[ii]; > >> > + EXPECT_EQ(ii, sorted_key->key); > >> > + } > >> > +} > >> > + > >> > + > >> > +/* > >> > + A test of the function get_merge_many_buffs_cost_fast() > >> > + */ > >> > +TEST(CostEstimationTest, merge_many_buff) > >> > +{ > >> > + ha_rows num_rows= 512; > >> > + ulong num_keys= 100; > >> > + ulong row_lenght= 100; > >> > + double prev_cost= 0.0; > >> > + while (num_rows<= MAX_FILE_SIZE/4) > >> > + { > >> > + double merge_cost= > >> > + get_merge_many_buffs_cost_fast(num_rows / num_keys, > >> > + num_keys, > >> > + num_rows % num_keys, > >> > + row_lenght); > >> > + EXPECT_LT(0.0, merge_cost); > >> > + EXPECT_LT(prev_cost, merge_cost); > >> > + num_rows*= 2; > >> > + prev_cost= merge_cost; > >> > + } > >> > +} > >> > + > >> > >> It seems that this just tests that the cost is positive and > >> monotonically increasing. Would it be possible in some other way to > >> test that the computed cost is actually reasonable? > > > > This started out as an over/undeflow test (got NaN for certain input). > > Then as a test that it was actually "fast" (the inner loop was very > slow). > > I really don't know "reasonable cost" is, > > and it's hard to test float accuracy anyways ... > > > > I guess unit tests is not the most appropriate for such testing. But > it would be good to have some way of verify our cost functions. I think unit testing *should be* appropriate. I am now testing: - no underflow/overflow - speed (not measured, just observed manually) - cost is monotonically increasing as a function of number of rows. anything other you can think of would be appreciated. > > Maybe it is better not to know. In my view, anything relying on > TIME_FOR_COMPARE is probably not very accurate. ;-) indeed > > > > > > >> > >> > + > >> > +/* > >> > + Comparison function for integers. > >> > + */ > >> > +extern "C" > >> > >> Also gives warning. > > > > gone > > > > > >> > >> > +int int_ptr_compare(size_t *cmp_arg, int **a, int **b) > >> > +{ > >> > + EXPECT_EQ(*cmp_arg, sizeof(int)); > >> > + > >> > + int a_num= **a; > >> > + int b_num= **b; > >> > + > >> > + if (a_num> b_num) > >> > + return +1; > >> > + if (a_num< b_num) > >> > + return -1; > >> > + return 0; > >> > +} > >> > + > >> > + > >> > +/* > >> > + Generates an integer key for a given integer element. > >> > + */ > >> > +void int_keymaker(Sort_param *sp, int *to, int *from) > >> > +{ > >> > + memcpy(to, from, sizeof(int)); > >> > +} > >> > + > >> > + > >> > +/* > >> > + Some basic performance testing, to compute the overhead of > >> Bounded_queue. > >> > + Run the with 'bounded_queue-t --disable-tap-output' to see the > >> > + millisecond output from Google Test. > >> > + */ > >> > >> I wonder whether it would make sense to have a separate file for this > >> performance test. > >> > > > > I don't see any point in having another executable for that? > > > > I guess you are right. > > > > >> > +const int num_rows= 10000; > >> > +const int row_limit= 100; > >> > +const int num_iterations= 1000; > >> > + > >> > +class PerfTestSmall : public ::testing::Test > >> > +{ > >> > +public: > >> > + /* > >> > + The extra overhead of malloc/free should be part of the > measurement, > >> > + so we do not define the key container as a member here. > >> > + */ > >> > + typedef Key_container Container; > >> > + enum { limit= row_limit }; > >> > +}; > >> > + > >> > +class PerfTestLarge : public ::testing::Test > >> > +{ > >> > +public: > >> > + /* > >> > + The extra overhead of malloc/free should be part of the > measurement, > >> > + so we do not define the key container as a member here. > >> > + */ > >> > + typedef Key_container Container; > >> > + enum { limit= num_rows }; > >> > +}; > >> > + > >> > + > >> > +/* > >> > + Test with Bounded_queue size ==. > >> > + */ > >> > +TEST_F(PerfTestSmall, insert_and_sort) > >> > +{ > >> > + for (int it= 0; it< num_iterations; ++it) > >> > + { > >> > + Container *keys= new Container; > >> > + srand(0); > >> > + Bounded_queue queue; > >> > + EXPECT_EQ(0, queue.init(limit, 0, TRUE, int_ptr_compare, > >> > + sizeof(int),&int_keymaker, NULL, keys->key_ptrs)); > >> > + for (int ix= 0; ix< limit; ++ix) > >> > + { > >> > + int data= rand(); > >> > + queue.push(&data); > >> > + } > >> > + my_string_ptr_sort((uchar*)&keys->key_ptrs[0], > >> > + queue.num_elements(), sizeof(int)); > >> > + delete keys; > >> > + } > >> > +} > >> > + > >> > + > >> > +/* > >> > + Test with Bounded_queue size == > >> > + */ > >> > +TEST_F(PerfTestLarge, insert_and_sort) > >> > +{ > >> > + for (int it= 0; it< num_iterations; ++it) > >> > + { > >> > + Container *keys= new Container; > >> > + srand(0); > >> > + Bounded_queue queue; > >> > + EXPECT_EQ(0, queue.init(limit, 0, TRUE, int_ptr_compare, > >> > + sizeof(int),&int_keymaker, NULL, keys->key_ptrs)); > >> > + for (int ix= 0; ix< limit; ++ix) > >> > + { > >> > + int data= rand(); > >> > + queue.push(&data); > >> > + } > >> > + my_string_ptr_sort((uchar*)&keys->key_ptrs[0], > >> > + queue.num_elements(), sizeof(int)); > >> > + delete keys; > >> > + } > >> > +} > >> > + > >> > >> You have two identical code parts above. Maybe this could be made > >> into a separate function? > > > > It would need to be a templatized function. > > > > done > > > > OK. I am not quite sure I understand the value of templatizing on > limit, but it is OK with me. > we allocate arrays, which should be fixed size. > >> > >> > + > >> > +/* > >> > + Test without bounded queue, i.e. insert keys into array, and > sort it. > >> > + */ > >> > +TEST_F(PerfTestLarge, without_queue) > >> > +{ > >> > + for (int it= 0; it< num_iterations; ++it) > >> > + { > >> > + Container *keys= new Container; > >> > + srand(0); > >> > + for (int ix= 0; ix< limit; ++ix) > >> > + { > >> > + int data= rand(); > >> > + keys->key_data[ix]= data; > >> > + } > >> > + my_string_ptr_sort((uchar*)&keys->key_ptrs[0], limit, > sizeof(int)); > >> > + delete keys; > >> > + } > >> > +} > >> > + > >> > + > >> > +/* > >> > + Computes the overhead of setting up sort arrays, and rand() calls. > >> > + */ > >> > +TEST_F(PerfTestLarge, no_sorting) > >> > +{ > >> > + for (int it= 0; it< num_iterations; ++it) > >> > + { > >> > + Container *keys= new Container; > >> > + srand(0); > >> > + for (int ix= 0; ix< limit; ++ix) > >> > + { > >> > + int data= rand(); > >> > + keys->key_data[ix]= data; > >> > + } > >> > + delete keys; > >> > + } > >> > +} > >> > + > >> > +} // namespace > >> > > >> > > >> > > >> > > >> > > >> > > > >