List:Commits« Previous MessageNext Message »
From:Martin Hansson Date:January 18 2011 9:11am
Subject:Re: bzr commit into mysql-next-mr-opt-team branch (martin.hansson:3225)
WL#3724
View as plain text  
Roy Lyseng skrev 2010-12-21 11.12:
> Regarding tests, I would like to see your new test cases run with and 
> without join buffering enabled. I am not sure that a new test case 
> file is needed, but then again I am unsure which file the new test 
> cases would go into - join.test would be a candidate, but then one 
> should make a version that is run with jcl=0 and another with jcl=1.
I already change join_cache_level to appropriate values whenever the 
test mandates.
>
> On 10/6/10 5:22 PM, Martin Hansson wrote:
>> #Atfile:///data0/martin/bzrroot/wl3724/n-mr-o-t-Roy/  based 
>> onrevid:tor.didriksen@stripped
>>
>>   3225 Martin Hansson    2010-10-06
>>        WL#3724: Short-Cutting Join Execution: Speeding up star queries
>>
>>      added:
>>        mysql-test/r/shortcut.result
>>        mysql-test/t/shortcut.test
>>        sql/sql_set.h
>>        sql/sql_shortcut.h
>>      modified:
>>        mysql-test/r/func_in_none.result
>>        mysql-test/r/greedy_optimizer.result
>>        mysql-test/r/join.result
>>        mysql-test/r/join_cache_jcl1.result
>>        mysql-test/r/join_nested.result
>>        mysql-test/r/order_by_none.result
>>        mysql-test/r/subselect_innodb.result
>>        mysql-test/t/subselect_innodb.test
>>        sql/item.cc
>>        sql/item.h
>>        sql/records.h
>>        sql/sql_select.cc
>>        sql/sql_select.h
>>        sql/structs.h
>>
>> === modified file 'sql/item.cc'
>> --- a/sql/item.cc    2010-09-28 15:17:29 +0000
>> +++ b/sql/item.cc    2010-10-06 15:22:47 +0000
>> @@ -2306,6 +2306,9 @@ bool Item_field::eq(const Item *item, bo
>>   }
>>
>>
>> +/**
>> +   This set contains simply the table to which the field belongs.
>> +*/
>
> Please replace "This set contains simply" with "Returns".
done.
>
>>   table_map Item_field::used_tables() const
>>   {
>>     if (field->table->const_table)
>>
>> === modified file 'sql/item.h'
>> --- a/sql/item.h    2010-09-07 19:07:18 +0000
>> +++ b/sql/item.h    2010-10-06 15:22:47 +0000
>> @@ -860,7 +860,13 @@ public:
>>     virtual bool val_bool_result() { return val_bool(); }
>>     virtual bool is_null_result() { return is_null(); }
>>
>> -  /* bit map of tables used by item */
>> +  /**
>> +     This is the set of tables directly referenced by this Item.
>> +
>> +     For pushdown conditions, used_tables is the set tables 
>> referenced by the
>> +     pushdown condition recursively, *including* the table to which the
>> +     condition is pushed.
>> +  */
>
> An item tree can be used for several purposes, including as a 
> "pushdown condition". Thus, I do not think it is a good idea to 
> partially document pushdown conditions here.
>
> I think that you should replace this with:
>
>     Returns the set of tables referenced by this item. An item can be 
> recursively defined
>     in terms of other items, and the set of tables is generally 
> calculated by forming the
>     union of the set of tables referenced by the embedded items.
done.
>>     virtual table_map used_tables() const { return (table_map) 0L; }
>>     /*
>>       Return table map of tables that can't be NULL tables (tables 
>> that are
>>
>> === modified file 'sql/records.h'
>> --- a/sql/records.h    2010-07-13 17:29:44 +0000
>> +++ b/sql/records.h    2010-10-06 15:22:47 +0000
>> @@ -38,6 +38,34 @@ class SQL_SELECT;
>>       ...
>>     }
>>     end_read_record();
>> +
>> +  READ_RECORD is an interface that is used to retrieve records one 
>> by one.
>
> This comment is overlapping with the comment above the usage pattern.
>
>> +
>> +  - An abstraction layer between the query execution engine and the 
>> storage
>> +    engine API. The only read operation the execution engine has to 
>> know of
>> +    is reading the next record. Beneath the abstraction layer is the 
>> storage
>> +    engine API, which is invoked differently for table/index scans 
>> and key
>> +    lookups, and the join buffer.
>> +
>> +  - An executable sub-program, resulting from the optimizer. It is 
>> set up with a
>> +    reference to a handler and a function pointer to a read function 
>> during
>> +    query optimization.
>
> IMHO, this is two ways of trying to explain the same thing. Please be 
> more concrete in your description.
>
> AFAIU, you describe the read_record function pointer here, but you do 
> not describe the unlock_row function pointer, which is also part of 
> the interface.
>
>> +
>> +    The abstraction is incomplete, however. For instace, reading the 
>> first
>> +    record from a handler is done through a function pointer in the 
>> JOIN_TAB
>> +    rather than in its READ_RECORD.
>> +
>> +    Sometimes the READ_RECORD is initialized during query execution.
>
> Is this a good or a bad thing?
>
>> +
>> +    - It is reinitialized during buffered nested loops join, before 
>> the buffer
>> +      is flushed.
>> +
>> +    - For many table accesses, it is initialized right before the 
>> first record
>> +      is read.
>> +
>> +    The READ_RECORD design can be viewed as a strategy pattern, 
>> where the
>> +    abstract strategy is the READ_RECORD struct and the concrete 
>> strategy is
>> +    determined by the values of the function pointer 
>> READ_RECORD::read_record.
>
> Is this description useful to the reader?
I removed the whole comment. Better that way, I don't want to hold up a 
worklog that is 3 years overdue because of comments to code that it 
doesn't even touch.
>
>>   */
>>
>>   class Copy_field;
>>
>> === modified file 'sql/sql_select.cc'
>> --- a/sql/sql_select.cc    2010-09-30 14:53:11 +0000
>> +++ b/sql/sql_select.cc    2010-10-06 15:22:47 +0000
>> @@ -28,6 +28,9 @@
>>   #pragma implementation                // gcc: Class implementation
>>   #endif
>>
>> +#include "sql_set.h"
>> +#include "sql_shortcut.h"
>> +
>>   #include "sql_priv.h"
>>   #include "unireg.h"
>>   #include "sql_select.h"
>> @@ -2468,6 +2471,9 @@ JOIN::optimize()
>>     }
>>
>>     tmp_having= having;
>> +
>> +  setup_shortcuts(this);
>> +
>>     if (select_options&  SELECT_DESCRIBE)
>>     {
>>       error= 0;
>> @@ -16982,6 +16988,7 @@ sub_select_cache(JOIN *join, JOIN_TAB *j
>>     DBUG_ASSERT(cache != NULL);
>>
>>     cache->reset_join(join);
>> +  join->return_tab= join_tab;
>>
>>     DBUG_ENTER("sub_select_cache");
>>
>> @@ -17156,6 +17163,7 @@ sub_select(JOIN *join,JOIN_TAB *join_tab
>>     join_tab->table->null_row=0;
>>     if (end_of_records)
>>     {
>> +    join->return_tab= join_tab;
>>       enum_nested_loop_state nls=
>>         (*join_tab->next_select)(join,join_tab+1,end_of_records);
>>       DBUG_RETURN(nls);
>> @@ -17184,12 +17192,16 @@ sub_select(JOIN *join,JOIN_TAB *join_tab
>>     }
>>     join->thd->warning_info->reset_current_row_for_warning();
>>
>> +  join_tab->current_prefix_successful= FALSE;
>>     error= (*join_tab->read_first_record)(join_tab);
>>
>>     if (join_tab->keep_current_rowid)
>>      
> join_tab->table->file->position(join_tab->table->record[0]);
>> -
>> +
>>     rc= evaluate_join_record(join, join_tab, error);
>> +
>> +  if (shortcut_applies(join_tab, rc))
>> +    join->return_tab= join_tab->shortcut;
>
> I think that this code should be part of evaluate_join_record() instead.
Done.
>>
>>     /*
>>       Note: psergey has added the 2nd part of the following 
>> condition; the
>> @@ -17201,8 +17213,12 @@ sub_select(JOIN *join,JOIN_TAB *join_tab
>>
>>       if (join_tab->keep_current_rowid)
>>        
> join_tab->table->file->position(join_tab->table->record[0]);
>> -
>> +
>>       rc= evaluate_join_record(join, join_tab, error);
>> +
>> +    if (shortcut_applies(join_tab, rc))
>> +      join->return_tab= join_tab->shortcut;
>> +
>>     }
>>
>>     if (rc == NESTED_LOOP_NO_MORE_ROWS&&
>> @@ -17376,6 +17392,8 @@ evaluate_join_record(JOIN *join, JOIN_TA
>>       if (join->thd->is_error())
>>         DBUG_RETURN(NESTED_LOOP_ERROR);
>>     }
>> +  join_tab->current_prefix_successful= found;
>> +
> This may cause the value of "current_prefix_successful" to be false 
> after a scan over a table, even though there were matching rows within 
> the scan. I also suggest that you place this setting immediately 
> before calling next_select.
Thank you for finding a bug. I added a test for it. I also replaced the 
field with a counter instead, as agreed on IRC.
>>     if (found)
>>     {
>>       /*
>> @@ -22797,6 +22815,13 @@ void select_describe(JOIN *join, bool ne
>>               extra.append(STRING_WITH_LEN(", incremental buffers)"));
>>           }
>>
>> +        if (tab->shortcut != NULL)
>> +        {
>> +          extra.append(STRING_WITH_LEN("; StarJoin("));
>> +          extra.append(tab->shortcut->table->alias);
>> +          extra.append(STRING_WITH_LEN(")"));
>> +        }
>
> Would it be better to call it Shortcut(<table>) ?
Done.
>> +
>>           /* Skip initial "; "*/
>>           const char *str= extra.ptr();
>>           uint32 len= extra.length();
>>
>> === modified file 'sql/sql_select.h'
>> --- a/sql/sql_select.h    2010-09-23 12:16:36 +0000
>> +++ b/sql/sql_select.h    2010-10-06 15:22:47 +0000
>> @@ -210,6 +210,17 @@ typedef struct st_join_table : public Sq
>>     TABLE        *table;
>>     KEYUSE    *keyuse;            /**<  pointer to first used key */
>>     SQL_SELECT    *select;
>> +  /**
>> +     This field corresponds to a pushdown condition. It may contain 
>> more than
>> +     just those conditions pushed to the actual table, however.
>> +
>> +     @note Not all pushdown conditions are found in this field. @c 
>> ref access
>> +     conditions are removed from here and are encoded in JOIN_TAB::ref.
>> +
>> +     @note The @c used_tables set for pushdown conditions is the set 
>> of tables
>> +     referenced by the pushdown condition recursively, including the 
>> table to
>> +     which the condition is pushed.
>> +  */
>
> I think a better description is as follows:
>
>     The condition to be evaluated when a row from this table has been 
> made available inside
>     query execution. It may reference a row from all tables in the 
> current join prefix (ie this table
>     and all tables preceeding it in the join plan). Generally, a 
> condition consists of a set of predicates
>     combined using AND, OR and XOR.
>
>     @note Predicates that can be fully evaluated using a ref access 
> condition or that have been
>     pushed down to the storage engine are excluded from this condition.
>
> IMHO, documenting used_tables is irrelevant here.
Done.
>>     Item        *select_cond;
>>     QUICK_SELECT_I *quick;
>>     Item           **on_expr_ref;   /**<  pointer to the associated 
>> on expression   */
>> @@ -351,6 +362,15 @@ typedef struct st_join_table : public Sq
>>     /* NestedOuterJoins: Bitmap of nested joins this table is part of */
>>     nested_join_map embedding_map;
>>
>> +  /**
>> +     This flag is set true once a partial row has been produced for the
>> +     current table that corresponds to the partial row produced for all
>> +     previous tables in the current join.
>> +   */
>
> Please check indentation of comments: 2 columns after start line, and 
> end line on same column as start line.
Done. But is this really worth spending time on?
>
>> +  bool current_prefix_successful;
>> +
>> +  st_join_table *shortcut;
>> +
>>     void cleanup();
>>     inline bool is_using_loose_index_scan()
>>     {
>> @@ -479,7 +499,8 @@ st_join_table::st_join_table()
>>       found_match(FALSE),
>>
>>       keep_current_rowid(0),
>> -    embedding_map(0)
>> +    embedding_map(0),
>> +    shortcut(NULL)
>>   {
>>     /**
>>       @todo Add constructor to READ_RECORD.
>> @@ -1779,12 +1800,13 @@ public:
>>     TABLE_LIST *tables_list;           ///<hold 'tables' parameter of 
>> mysql_select
>>     List<TABLE_LIST>  *join_list;       ///<  list of joined tables 
>> in reverse order
>>     COND_EQUAL *cond_equal;
>> -  /*
>> +  /**
>> +    Interface for taking short-cuts through join execution.
>
> Please delete this comment line. the field is fully described in the 
> comments below,
This is a Doxygen comment. It means that the document is compiled into 
HTML that does not look exactly as the original comment. The first 
sentence in every Doxygen comment is extracted into a "brief" section 
that is displayed next to the member. The rest goes into a "detailed" 
section that goes elsewhere. Hence the first sentence should summarize 
what the rest says.
>>       Join tab to return to. Points to an element of join->join_tab 
>> array, or to
>>       join->join_tab[-1].
>>       This is used at execution stage to shortcut join enumeration. 
>> Currently
>>       shortcutting is done to handle outer joins or handle semi-joins 
>> with
>> -    FirstMatch strategy.
>> +    FirstMatch strategy. It is also used to optimize execution of 
>> star joins.
>
> Please rewrite:
>
>        Shortcutting is done to
>          - handle outer joins, and
>          - handle semi-joins with FirstMatch strategy, and
>          - optimize execution of star joins.
Done.
>
>
>>     */
>>     JOIN_TAB *return_tab;
>>     Item **ref_pointer_array; ///<used pointer reference for this select
>> @@ -1952,6 +1974,37 @@ public:
>>               ((group || tmp_table_param.sum_func_count)&&  
>> !group_list)) ?
>>                 NULL : join_tab+const_tables;
>>     }
>> +
>> +
>> +  /**
>> +     Returns true if JOIN_TAB's a and b participate on the same 
>> nesting level
>> +     within a chain of inner joined tables.
>> +
>> +     @note The optimizer execution plan is considered, and outer 
>> joins may
>> +     have been rewritten into inner joins.
>> +   */
>
> Please check indentation.
Done.
>
>> +  static bool are_inner_joined(JOIN_TAB *a, JOIN_TAB *b)
>> +  {
>> +    DBUG_ASSERT(a<  b);
>> +    return
>> +      a->table->pos_in_table_list->embedding ==
>> +      b->table->pos_in_table_list->embedding&&
>> +      (b->on_expr_ref == NULL || *b->on_expr_ref == NULL);
>> +  }
>
> Is the OR in this expression necessary?
It was when I wrote the patch, there were some tests that would 
otherwise core dump. Even if it isn't needed now (I trust that you 
tried), I dare not remove it.
>> +
>> +
>> +  /**
>> +     True if this query execution plan uses join buffering between 
>> plan nodes
>> +     a and b.
>> +  */
>> +  static bool contains_join_buffering(JOIN_TAB *a, JOIN_TAB *b)
>> +  {
>> +    for (JOIN_TAB *join_tab= a; join_tab<= b; join_tab++)
>> +      if (join_tab->use_join_cache)
>> +        return TRUE;
>> +    return FALSE;
>> +  }
>> +
>>   private:
>>     /**
>>       TRUE if the query contains an aggregate function but has no GROUP
>>
>> === added file 'sql/sql_set.h'
>> --- a/sql/sql_set.h    1970-01-01 00:00:00 +0000
>> +++ b/sql/sql_set.h    2010-10-06 15:22:47 +0000
>> @@ -0,0 +1,111 @@
>> +#ifndef INCLUDES_MYSQL_SQL_SET_H
>> +#define INCLUDES_MYSQL_SQL_SET_H
>> +/* Copyright (c) 2000, 2010 Oracle and/or its affiliates. All rights 
>> reserved.
>> +
>> +   This program is free software; you can redistribute it and/or modify
>> +   it under the terms of the GNU General Public License as published by
>> +   the Free Software Foundation; version 2 of the License.
>> +
>> +   This program is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +   GNU General Public License for more details.
>> +
>> +   You should have received a copy of the GNU General Public License
>> +   along with this program; if not, write to the Free Software
>> +   Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 
>> 02110-1301  USA */
>> +
>> +#include "my_global.h"
>> +
>> +template<class T>  class Bitmap_set_iterator;
>> +template<class T>  class Bitmap_set_reverse_iterator;
>> +
>> +/**
>> +   A set implemented as a bitmap. Due to this implementation, 
>> operations that
>> +   normally are quite expensive, typically having O(n log n) 
>> complexity, are
>> +   extremely cheap. A set consists of a bitmap and an array of 
>> pointers to
>> +   elements, the backing store. Each n:th bit in the bitmap 
>> represents the
>> +   member at position n in the array, where the first member has 
>> position 0. A
>> +   set consisting of this element is represented by the bitmap 1.
>> +
>> +   For obvious reasons, it cannot be allowed to perform a union 
>> operation on
>> +   two sets that have different backing stores, even if their type 
>> is the
>> +   same. Attempts at such operations will result in failed assertions.
>> + */
>> +template<class T>  class Bitmap_set {
>> +
>> +private:
>> +  ulonglong m_map;
>> +  T** m_backing_store;
>> +
>> +public:
>> +  Bitmap_set(ulonglong map, T** backing_store) :
>> +    m_map(map), m_backing_store(backing_store) {}
>> +
>> +    bool is_empty() { return m_map == 0; }
>> +
>> +    Bitmap_set<T>  set_union(const Bitmap_set<T>&  other) const
>> +    {
>> +      DBUG_ASSERT(m_backing_store == other.m_backing_store);
>> +      return Bitmap_set<T>(m_map | other.m_map, m_backing_store);
>> +    }
>> +
>> +    friend class Bitmap_set_iterator<T>;
>> +    friend class Bitmap_set_reverse_iterator<T>;
>> +};
>> +
>> +template<class T>  class Bitmap_set_iterator {
>> +private:
>> +  const Bitmap_set<T>&  m_set;
>> +  uint m_current_element;
>> +
>> +public:
>> +  Bitmap_set_iterator(const Bitmap_set<T>&  set) :
>> +    m_set(set), m_current_element(0) {}
>> +
>> +  inline T* operator++(int) {
>> +    ulonglong mask= 1<<  m_current_element;
>> +    while ((mask&  m_set.m_map) == 0)
>> +    {
>> +      if (mask == 0 || mask>  m_set.m_map)
>> +        return NULL;
>> +      mask<<= 1;
>> +      ++m_current_element;
>> +    }
>> +    T* element= m_set.m_backing_store[m_current_element++];
>> +    return element;
>> +  }
>> +
>> +};
>> +
>> +template<class T>  class Bitmap_set_reverse_iterator {
>> +private:
>> +  const Bitmap_set<T>&  m_set;
>> +  int m_current_element;
>> +
>> +public:
>> +  Bitmap_set_reverse_iterator(const Bitmap_set<T>&  set) :
>> +    m_set(set), m_current_element(0)
>> +  {
>> +    ulonglong map= set.m_map;
>> +    while ((map>>= 1) != 0)
>> +      ++m_current_element;
>> +  }
>> +
>> +  inline T* operator++(int) {
>> +    if (m_current_element == -1)
>> +      return NULL;
>> +    ulonglong mask= 1<<  m_current_element;
>> +    while ((mask&  m_set.m_map) == 0)
>> +    {
>> +      if (mask == 0)
>> +        return NULL;
>> +      mask>>= 1;
>> +      --m_current_element;
>> +    }
>> +    T* element= m_set.m_backing_store[m_current_element--];
>> +    return element;
>> +  }
>> +
>> +};
>> +#endif // INCLUDES_MYSQL_SQL_SET_H
>>
>> === added file 'sql/sql_shortcut.h'
>> --- a/sql/sql_shortcut.h    1970-01-01 00:00:00 +0000
>> +++ b/sql/sql_shortcut.h    2010-10-06 15:22:47 +0000
>> @@ -0,0 +1,153 @@
>> +#ifndef INCLUDES_MYSQL_SQL_SHORTCUT_H
>> +#define INCLUDES_MYSQL_SQL_SHORTCUT_H
>
> Please add the contents of this file to sql_select.cc We do not 
> usually include implementation contents from a header file.
Done.
>
> You can convert the functions into member functions of class JOIN 
> and/or JOIN_TAB, if you like.
Nah, these classes have too bloated interfaces as it is.
>> +
>> +/* Copyright (c) 2000, 2010, Oracle and/or its affiliates. All 
>> rights reserved.
>> +
>> +   This program is free software; you can redistribute it and/or modify
>> +   it under the terms of the GNU General Public License as published by
>> +   the Free Software Foundation; version 2 of the License.
>> +
>> +   This program is distributed in the hope that it will be useful,
>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> +   GNU General Public License for more details.
>> +
>> +   You should have received a copy of the GNU General Public License
>> +   along with this program; if not, write to the Free Software
>> +   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  
>> 02111-1307  USA */
>> +
>> +#include "sql_select.h"
>> +#include "sql_set.h"
>> +
>> +
>> +/**
>> +   Updates a short-cut from from one plan node to another. The function
>> +   assumes that it is invoked on plan nodes in reverse order in 
>> which they
>> +   appear in the query execution plan.
>> +
>> +   @param join The execution plan.
>> +
>> +   @param join_tab The plan node from which a short-cut may 
>> potentially be
>> +   taken.
>> +
>> +   @param dependent A plan node that access to join_tab depends on.
>> +
>> +   @retval TRUE There may be other short-cuts possible early in the 
>> plan.
>> +   @retval FALSE There are no earlier short-cuts available. Either 
>> the only
>> +   possible one was found and completely initialized, or there are 
>> none.
>> +*/
>> +static bool update_star_dependency(JOIN *join, JOIN_TAB *join_tab,
>> +                                   JOIN_TAB *dependency)
>
> "dependent" is probably more descriptive than "dependency" (for the 
> argument)
Done.
>> +{
>> +  if (dependency == NULL)
>> +    return FALSE;
>
> You may use true/false now, if you like,
Done.
>> +
>> +  if (join_tab<= dependency)
>> +    return TRUE;
>> +
>> +  if (dependency == join_tab - 1)
>> +  {
>> +    DBUG_PRINT("info", ("Invalidated a join short-cut from %s to %s",
>> +                        join_tab->table->alias,
>> +                        dependency->table->alias));
>> +    return FALSE;
>> +  }
>> +  else if (join->are_inner_joined(dependency, join_tab)&&
>> +           !join->contains_join_buffering(dependency, join_tab))
>> +  {
>> +    DBUG_PRINT("info", ("Found a join short-cut from %s to %s",
>> +                        join_tab->table->alias,
>> +                        dependency->table->alias));
>> +
>> +    join_tab->shortcut= dependency;
>> +
>> +    return FALSE;
>> +  }
>> +  return TRUE;
>> +}
>> +
>> +
>> +/**
>> +   Sets the short-cut from from one plan node to another, if one is 
>> found.
>
> from from
fixed.
>
>> +
>> +   @param join The execution plan.
>> +
>> +   @param join_tab The plan node from which a short-cut 
>> willpotentially be
>> +   taken.
>> +*/
>> +static void set_star_dependency(JOIN *join, JOIN_TAB *join_tab)
>
> I do not see that the Bitmap_set is justified by this use. The 
> following code is briefer and still easy to validate:
>
>   table_map dependent_tables= 0L;
>   if (join_tab->select_cond != NULL)
>   {
>     Item *condition= join_tab->select_cond;
>     condition->update_used_tables();
>     dependent_tables= condition->used_tables()
>   }
>   dependent_tables= (dependent_tables | join_tab->ref.depend_map) &
>                                (~PSEUDO_TABLE_BITS) & 
> (~join_tab->table->map);
>
>   if (dependent_tables == 0)
>     return;
Of course it is...
>
> <Iterate over the bitmap and call update_star_dependency() for each 
> dependent>
...this code is what suffers.
>
> It may be a good idea to add a Bitmap_set as part of a later 
> refactoring, but for this use it seems that it just adds more code 
> with no obvious gain.
I disagree. However, it has become more convenient for me to break up 
the worklog anyway.
>> +{
>> +  table_map where_clause_tables_map;
>> +  if (join_tab->select_cond != NULL)
>> +  {
>> +    Item *where_condition= join_tab->select_cond;
>> +    where_condition->update_used_tables();
>> +    where_clause_tables_map=
>> +      where_condition->used_tables()&  ~PSEUDO_TABLE_BITS;
>> +  }
>> +  else
>> +    where_clause_tables_map= (table_map)0L;
>> +
>> +  table_map ref_access_tables_map=
>> +    join_tab->ref.depend_map&  ~PSEUDO_TABLE_BITS;
>> +
>> +  Bitmap_set<JOIN_TAB>
>> +    where_clause_tables(where_clause_tables_map, join->map2table);
>> +
>> +  Bitmap_set<JOIN_TAB>
>> +    ref_access_tables(ref_access_tables_map, join->map2table);
>> +
>> +  Bitmap_set<JOIN_TAB>  tables_depended_upon=
>> +    where_clause_tables.set_union(ref_access_tables);
>> +
>> +  if (tables_depended_upon.is_empty())
>> +    return;
>> +
>> +  for(Bitmap_set_reverse_iterator<JOIN_TAB>  it(tables_depended_upon);
>> +      update_star_dependency(join, join_tab, it++);) ;
>
> Missing space.
Fixed.
>> +}
>> +
>> +
>> +/**
>> +   Pre-compute short-cuts for the join short-cut optimization.
>> +
>> +   This function adorns the query exeution plan with star join 
>> short-cuts
>> +   where applicable. Short-cuts are triggered by the executioner. The
>> +   algorithm is implemented as a nested loop, where the outer loop 
>> walks the
>> +   nodes (JOIN_TAB's) in the query execution plan, and the inner 
>> loop iterates
>> +   through all tables that this table depends upon. Valid short-cuts 
>> have the
>> +   following property.
>
> properties:
Fixed.
>> +
>> +   - The short-cut has a minimum length of 2. If a table is 
>> dependent on the
>> +   table directly preceding it, the short-cut would be of length 1, 
>> and it
>> +   would indicate that there are no valid short-cuts from this table.
>> +
>> +   - The short-cut does not cross an outer join boundary. In this 
>> case the
>> +   optimization is not applicable, because a valid result would be 
>> missed.
>> +
>> +   - The short-cut does not cross a node using join buffers. 
>> Short-cuts are
>> +   not compatible with join buffering.
>> +
>> +   - If there are several applicable shortcuts by the above 
>> requirements, the
>> +   shortest one is chosen.
>> +
>> +   @param join The query execution plan. The plan is assumed to be 
>> complete
>> +   and ready for execution when this method is invoked.
>> +*/
>> +static void setup_shortcuts(JOIN *join)
>> +{
>> +  if(join->join_tab == NULL)
>> +    return;
>
> You may DBUG_ASSERT on join->join_tab existence instead.
That will only force me to move the if further up the call stack, so 
frankly I don't see the point.
>
>> +
>> +  for (uint i= 2; i<  join->tables; i++)
>> +    set_star_dependency(join,&join->join_tab[i]);
>> +}
>
> I think that should be "uint= join->const_tables + 2" instead.
Good point. Fixed.
>> +
>> +bool shortcut_applies(const JOIN_TAB *join_tab, 
>> enum_nested_loop_state rc)
>> +{
>> +  return rc != NESTED_LOOP_OK&&
>> +    !join_tab->current_prefix_successful&&
>> +    join_tab->shortcut != NULL;
>> +}
> If you move this code to evaluate_join_record(), you can inline the 
> code as follows:
>
>   if (shortcut != NULL&&
>       !current_prefix_successful)
>     return_tab= shortcut;
> }
>
> Notice the order of operands:  shortcut != NULL is less likely than 
> current_prefix_successful.
Good idea. Now why didn't I think of that?
>
>> +
>> +#endif // INCLUDES_MYSQL_SQL_SHORTCUT_H
>>
>> === modified file 'sql/structs.h'
>> --- a/sql/structs.h    2010-07-13 17:29:44 +0000
>> +++ b/sql/structs.h    2010-10-06 15:22:47 +0000
>> @@ -119,6 +119,27 @@ struct st_join_table;
>>   typedef struct st_reginfo {        /* Extra info about reg */
>>     struct st_join_table *join_tab;    /* Used by SELECT() */
>>     enum thr_lock_type lock_type;        /* How database is used */
>> +  /**
>> +     This is an outer join optimization. The idea is to avoid
>> +     NULL-complementing a row for the inner table(s). The bespoke 
>> case is when
>> +     we have a query such as
> Wrong... The idea is to avoid reading rows from the inner tables that 
> we know will be discarded when evaluating the WHERE clause.
>> +
>> +     @code
>> +
>> +     SELECT ...
>> +     FROM t1 LEFT OUTER JOIN (t2, t3) ON<condition>
>> +     WHERE t2.non_nullable_column IS NULL
>> +
>> +     @endcode
>> +
>> +     In this case not_exists_optimize would be set for t2.
>> +
>> +     Once the query execution reaches the place where<condition>  
>> holds for
>> +     the current result tuple, we know that there cannot be a row 
>> where the
>> +     WHERE condition holds. There is no need produce a 
>> NULL-complemented row.
>> +
>> +     @see update_ref_and_keys
>> +  */
>
> Please see bug#58928 that Guilhem just added. The implementation and 
> interpretation of this will most likely change because of this bug.
I read between the lines here...best to withdraw the whole comment for 
the time being. No documentation is better than the wrong documentation.

Thank you for the code review.

Best Regards

Martin

Thread
Re: bzr commit into mysql-next-mr-opt-team branch (martin.hansson:3225)WL#3724Roy Lyseng21 Dec
  • Re: bzr commit into mysql-next-mr-opt-team branch (martin.hansson:3225)WL#3724Martin Hansson18 Jan