List:Internals« Previous MessageNext Message »
From:Øystein Grøvlen Date:November 25 2009 2:16pm
Subject:Re: Blobs and sub-query materialization
View as plain text  
Thank you Timour for explaining this to me. I am sorry that I not only
managed to confuse myself, but also you by talking about sub-query
materialization when what I really meant was semi-join materialization.

Given bug#48213, it seems that semi-join materialization does not
take the necessary actions to prevent materialization from being used
with blob columns.

It still confuses me a bit that temporary tables with blobs are
created in the first place.  Why does create_tmp_table() need to
handle blob columns?  Are there scenarios where such tables are of use
even if you are not able to perform look-ups on them?

--
Øystein


Timour Katchaounov wrote:
 > Øystein,
 >
 >> Hi,
 >>
 >> Is sub-query materialization supposed to work in the context of BLOB
 >> columns?  Given the following query (ref. Bug#48213) where column e is
 >
 > No, it is not supposed to work, and this is clearly encoded and commented
 > in the implementation of subquery materialization. There are also 
explicit
 > tests for this.
 >
 > In item_subselect.cc, method subselect_hash_sj_engine::init_permanent
 >
 > there is a comment:
 >
 >   /*
 >      If the subquery has blobs, or the total key lenght is bigger 
than some
 >      length, then the created index cannot be used for lookups and we
 >      can't use hash semi join. If this is the case, delete the temporary
 >      table since it will not be used, and tell the caller we failed to
 >      initialize the engine.
 >   */
 >
 > In addition, in the regression test
 >   subselect_mat.test
 > there is a set of tests that make sure that we do *not* use 
materialization
 > in the presence of blobs and long text columns. The tests follow 
after the
 > comment:
 >
 > /*
 >   Test that BLOBs are not materialized (except when arguments of some
 > functions).
 > */
 >
 > The reason for this limitation is that unique indexes created on BLOB
 > and long
 > (AFAIR > 512 bytes) columns can be used only to ensure uniqueness, but
 > cannot
 > be used for lookup of actual values.
 >
 > As I was told by Monty and Serg, this can be fixed, but my goal was to
 > implement
 > materialization of subqueries, not to get stuck on some handler code, 
so I
 > preferred to accept this limitation.
 >
 >> BLOB and the rest is VARCHAR:
 >>
 >> SELECT pk FROM t1 WHERE (b,c,d,e)
 >>   IN (SELECT b,c,d,e FROM t2 WHERE pk > 0);
 >>
 >> Should it be possible to use materialization to evaluate this query?
 >
 > No, as is obvious from the code, and the comments in the code.
 >
 >> create_tmp_table(), which is used to create the materialized table,
 >> contains the following comment:
 >>
 >>       /*
 >>         Special mode for index creation in MyISAM used to support unique
 >>         indexes on blobs with arbitrary length. Such indexes cannot be
 >>         used for lookups.
 >>       */
 >>
 >> If this index cannot be used for lookup, how can it be of any use for
 >> materialization?
 >
 > Exactly, it cannot, and it isn't.
 >
 >>
 >> I am a bit confused,
 >>
 >
 > I don't understand why you were confused. The behavior is pretty
 > consistent,
 > and the comments pretty explicit. There are even tests for this behavior.
 >
 >
 > Timour
 >

Thread
Blobs and sub-query materializationØystein Grøvlen23 Nov
  • Re: Blobs and sub-query materializationJay Pipes23 Nov
    • Re: Blobs and sub-query materializationRoy Lyseng23 Nov
      • Re: Blobs and sub-query materializationJay Pipes23 Nov
  • Re: Blobs and sub-query materializationTimour Katchaounov25 Nov
    • Re: Blobs and sub-query materializationØystein Grøvlen25 Nov
  • Re: Blobs and sub-query materializationTimour Katchaounov25 Nov