List:General Discussion« Previous MessageNext Message »
From:Lucio Chiappetti Date:June 11 2008 2:28pm
Subject:Re: optimizing UNIONs ?
View as plain text  
On Tue, 10 Jun 2008, Martin wrote:

> Lucio
> So the net effect is to produce a cartesian join where ALL results from 
> Query1 (are combined with) ALL results from Query2
> In order to prevent cartesian join can you use a more narrowly defined 
> predicate such as what is defined at

   Thank you Martin for the suggestion to take the join out of the union.

   It is not exactly what suggestedin the URL you quote (that's more "take
   the WHERE condition inside a subquery") but gave me an idea which I
   explain below

   In my case the WHERE condition is totally unpredictable (all the stuff
   below will actually be masked under a servlet which users will access
   to construct queries)

> Are you sure it is wise to create temptable considering MERGE would be 
> disabled when temptable is specified?

   Apparently it was proven in the past that it was necessary to obtain
   a speed achievement in some conditions specific of my case.

> Molte Grazie
> Martin

   Prego ! (you are welcome)
   But why are you thanking me, while I should be thanking you ? :-)

(this reminds me of the 8th scene of Petrolini's Nerone  :-)
It's even on You Tube, google for "Petrolini Nerone grazie")

Now back to serious business

a) when I create views of the form

    gct left join member1... left join member2... left join membern...

    where the "member" tables are real tables
    the queries are fast and use the indices on member1 ... membern

b) if for maintenance purposes I create unions which concatenate
    three tables, say

    member1 (combo1) is union of tab1A tab1B tab1C
    member2 (combo2) is union of tab2A tab2B tab2C
    member3 (combo3) is union of tab3A tab3B tab3C

    while the union themselves are fast, the view above where each
    member is an union is 250 times slower !

c) it is not a matter of join optimization (STRAIGHT_JOIN does NOT

d) I found no way to force the union to use the concatenation of
    the index. I tried index hints like

    (select * from tab1A force index (auxiliary)) union
    (select * from tab1B force index (auxiliary)) union
    (select * from tab1C force index (auxiliary))

    but this has no effect.

e) one obvious way out would be to store the unions into a physical
    table,  so member1/2/3 will be physical tables.
    This is almost as fast as the single table query (scales with size)

    but has the disadvantage to waste disk space (and to require one
    remembers to update the physical union when one of the A B C components
    are updated

f) the solution is to write the view as an union of joins (instead
    of a join of union views)

  create view xxx as
   (select .. gct left join tab1A.. left join tab2A.. left join tab3A..)
   (select .. gct left join tab1B.. left join tab2B.. left join tab3B..)
   (select .. gct left join tab1C.. left join tab2C.. left join tab3C..)

  This proves to be successful. Explain select returns a manageable
  query which uses the indices on the tabnX, and executes only 2.5 slower
  than the original query on a single table ...

  ... what's more important it remains fast even if one is accessing
  element in each of the three union "chunks"

  The updated notes on
  are probably clearer

  One can even dispense with the "combo" unions, everything is
  done on the fly from the physical tabnX.

  The only thing is that writing the definition of the union of joins
  is slightly painful (but can be automatized).

Lucio Chiappetti - INAF/IASF - via Bassini 15 - I-20133 Milano (Italy)
For more info :
Multi pertransibunt et augebitur scientia
              Francis Bacon Instauratio Magna (
optimizing UNIONs ?Lucio Chiappetti10 Jun
  • using Unix soft linksLucio Chiappetti9 Feb
    • Re: using Unix soft linksJohan De Meersman10 Feb
Re: optimizing UNIONs ?Lucio Chiappetti11 Jun