Thanks Sergei, responses below:
On Feb 19, 2009, at 2:33 AM, Sergei Golubchik wrote:
> Hi, Eric!
>
> On Feb 18, Eric Jensen wrote:
>> Sergei,
>>
>> I do still want to implement this, although I haven't had time to
>> dedicate
>> to it recently. If you could shed any light on my unanswered
>> questions
>> that would be great.
>
> yes, I just want to understand the solution you propose (it'd be a
> pity
> to spend a lot of time and efforts implementing something, only to
> find
> out that it doesn't work as expected).
>
>> Interleaving results of unordered component selects was just a
>> proof of
>> concept to show myself there wouldn't be any non-starter issues with
>> keeping multiple selects "open" within a single query. It had the
>> added
>> bonus of me being able to run most of the test suite against those
>> changes.
>
> Ah, okay. I see.
>
>> The reason to do a merge is the case where we want the entire union
>> sorted
>> equivalently to the component selects, especially when we only want
>> the
>> first n, i.e.:
>
> Exactly.
> But note that even when component selects are not explicitly
> ordered, in
> the sense that there is no ORDER BY clause, you can still tell MySQL
> to
> order them as it'll enable you to do the k-way merge. Ultimately, it
> depends on the cost of sorting vs. the cost of using a temp table, and
> it's up to the optimizer to make the best choice.
>
>> (select blah from foo where x order by z limit n
>> union
>> select blah from foo where y order by z limit n) order by z limit n
>>
>> You are of course correct that the optimal solution is not to use a
>> temporary table, but to do a proper k-way merge either with pairwise
>> comparisons or a heap of the top elements for each of the k
>> components. Either of those are O(n log k), as compared to the O(kn
>> log kn) sort mysql currently does for the global order by. Even if
>> we
>> can achieve an order-by-index optimization on the components, mysql
>> can only do a filesort for the global order by.
>
> order by index isn't necessarily better than a filesort, as it often
> involves much more random I/O, while filesort tries to access the disk
> sequentially.
k-way merge is most interesting for the order-by limit cases where the
limit is much smaller than the potential number of rows we'd have to
sort. those cases also happen to be the ones where order by index is
generally better than a filesort.
>
>
>> The current implementation reads all the component selects into a
>> temporary table and then runs an internal select against that to
>> produce the result set.
>
> Current *your* implementation ?
> If yes - how is it different from what a vanilla MySQL is doing ?
No, the current mysql implementation.
>
>
>> This is somewhat convenient, as that internal select just has the
>> parsed global order-by from the query. Doing a proper merge would
>> require the ability to perform the comparisons related to that
>> order-by in code, as opposed to just applying the sort like any other
>> select. It would also still require some way of exposing the results
>> of the merge as a select_result (or whatever those are
>> called...forgetting), which is done as a temporary table everywhere
>> else we have intermediate results anyways. Both of these can be
>> overcome, but I think the best plan is to do something simpler for my
>> first version of this.
>>
>> My plan for using a temporary table with a btree index to do the
>> top-n
>> sorting avoids these difficulties, at the expense of building that
>> index across all results in that temporary table versus popping them
>> off of a heap which limits its size to k. In the average case, that
>> is still much better than doing an O(kn log kn) sort.
>
> You want to create a temporary table with a btree index, put all union
> data into it, and then retrieve rows in the index order ?
No, not all union data, only the rows necessary for performing the
merge, using the index to select the next top value that determines
which component select to read a new row from next. My initial design
(http://lists.mysql.com/internals/36113) went into this in more
detail, however it's shifted somewhat since I'm not using cursors in
step 3 at all as described there. But my plan for the temporary table
part of it remains the same (pasting):
k-way merge design
------------------
1. when we determine all selects within the union have the same order
by field as the outer one
2. create temporary table with a btree index on that field and an extra
int select_num field specifying which select the result came from
3. replace st_select_lex_unit::exec logic so that instead of
JOIN::exec's that read every result from each of those queries,
figure out how to run them for just one result at a time
4. make st_select_lex_unit::exec use k-way merge logic relying on this
one-result-at-a-time fetching to retrive up to LIMIT n results
(see http://www.site.uottawa.ca/~nat/Courses/DFS-Course/DFS-Lecture-9/tsld013.htm)
,
treating our temp table with its ordered index as a heap
- This means, for every iteration of the merge loop, programatically
select union_key1, union_key2, ..., select_idx from tmp table that
uses WHERE union_key > X order by union_key1, union_key2, .... limit 1
5. select all but special select_num field ordered by the global order
field out to client
My interleaving proof of concept effectively implements step 3. All
of the others I still need to figure out how to do in mysql, as per
http://lists.mysql.com/internals/36138
>
>
> Regards / Mit vielen Grüßen,
> Sergei
>
> --
> __ ___ ___ ____ __
> / |/ /_ __/ __/ __ \/ / Sergei Golubchik <serg@stripped>
> / /|_/ / // /\ \/ /_/ / /__ Principal Software Engineer/Server
> Architect
> /_/ /_/\_, /___/\___\_\___/ Sun Microsystems GmbH, HRB München
> 161028
> <___/ Sonnenallee 1, 85551 Kirchheim-
> Heimstetten
> Geschäftsführer: Thomas Schroeder, Wolfgang Engels, Dr. Roland Boemer
> Vorsitzender des Aufsichtsrates: Martin Häring
>
> --
> MySQL Internals Mailing List
> For list archives: http://lists.mysql.com/internals
> To unsubscribe: http://lists.mysql.com/internals?
> unsub=ej@stripped
>