List:Commits« Previous MessageNext Message »
From:Roy Lyseng Date:April 8 2011 12:15pm
Subject:Re: bzr commit into mysql-trunk branch (guilhem.bichot:3348) Bug#11766522
View as plain text  
Hi Guilhem,

thank you for the new commit.

The patch is approved by me.

Please see some inline minor comments, which you are free to ignore.

On 01.04.11 11.09, Guilhem Bichot wrote:
> #At file:///home/mysql_src/bzrrepos_new/59651/ based on
> revid:roy.lyseng@stripped
>
>   3348 Guilhem Bichot	2011-04-01
>        Fix for BUG#11766522 "59651: ASSERTION `TABLE_REF->HAS_RECORD' FAILED WITH
> JOIN_CACHE_LEVEL=3":
>        during execution, access to last-in-plan tables spoiled TABLE::status of first
> tables.
>       @ mysql-test/include/join_cache.inc
>          test for bug. Without the code fix, join_cache_jcl3 would hit an assertion
> failure in join_read_key2.
>       @ mysql-test/r/join_cache_jcl1.result
>          correct result (32 rows)
>       @ mysql-test/r/join_cache_jcl2.result
>          correct result (32 rows)
>       @ mysql-test/r/join_cache_jcl3.result
>          correct result (32 rows)
>       @ mysql-test/r/join_cache_jcl4.result
>          correct result (32 rows)
>       @ mysql-test/r/join_cache_jcl5.result
>          correct result (32 rows)
>       @ mysql-test/r/join_cache_jcl6.result
>          correct result (32 rows)
>       @ mysql-test/r/join_cache_jcl7.result
>          correct result (32 rows)
>       @ mysql-test/r/join_cache_jcl8.result
>          correct result (32 rows)
>       @ sql/handler.h
>          Removing the assignments to table->status
>          from ha_innodb.cc, ha_myisam.cc, ha_heap.cc causes test failures,
>          which shows that engines must set table->status.
>          So tell this brutal truth, in comments: engines must set table->status.
>          All those which I could check, do so (including PBXT).
>          Question for reviewers: most of the affected handler functions
>          have ha_* non-virtual wrappers (ha_index_next_same(), etc);
>          we could, in a follow-up patch:
>          - remove the requirement from the engine (don't set table->status
>          in their functions anymore)
>          - set table->status in the ha_* wrappers.
>          - make STATUS_ * flags invisible to engines
>          - change them to an enum, and replace the magic value "0"
>          with an enum value.
>       @ sql/sql_join_cache.cc
>          Deleting broken code which caused the bug. It was broken for two reasons.
>          First, it affected JOIN_TABs located _before_ 'join_tab' in the plan;
>          there is no reason why reading rows from 'join_tab' should affect
>          TABLE::status of previous JOIN_TABs.
>          Second, the save/restore logic was not recursion-safe; consider:
>          int save_x, x; // global variables
>          f()
>          {
>            save_x= x; // save
>            f();
>            x= save_x; // restore
>          }
>          the inner call to f() will overwrite the value of save_x,
>          in the end x will not get its original value back.
>          That is what caused the assertion failure. In the bug's testcase,
>          the last tables of the plan are t7, t8, t6. t7 is doing ref access,
>          t8 and t6 are doing join buffering. During execution,
>          at some point, no row is found in t7 so t7's table->status is set to
>          STATUS_NOT_FOUND // because no record found
>          | STATUS_NULL_ROW // because NULL-complemented (LEFT JOIN)
>          Then we continue into t8: at some point the join buffer of t8 is full,
>          so t8 cannot accumulate more partial records sent by t7, so
>          join_records() is called for t8, to generate records (of t8 joined
>          with its buffer), in order to be able to later empty t8's buffer.
>          Those records are generated, which backs up t7's table->status
>          into t7's tab->status, and changes table->status to 0.
>          Generated records are passed to t6. t6's buffer gets full. So
>          join_records() is called for t6. Which backs up t7's table->status
>          (0) into t7's tab->status. From then on, things are wrong, as when
>          the two join_records() finally end, t7's table->status is not
>          restored to its original value (STATUS_NOT_FOUND|STATUS_NULL_ROW)
>          but to 0 ("save_x" issue above). After that, more rows come from tables
>          before t7, so we have another ref access to t7 i.e. a call to
> join_read_key2(),
>          which has this code:
>            if (cmp_buffer_with_ref(tab->join->thd, table, table_ref) ||
>                (table->status&  (STATUS_GARBAGE | STATUS_NULL_ROW)))
>            {
>          ...
>            }
>            else if (table->status == 0)
>            {
>              DBUG_ASSERT(table_ref->has_record);
>              table_ref->use_count++;
>            }
>          The "else" branch is expected to be for when:
>          - the lookup value is the same as the previous lookup value (see
> cmp_buffer_with_ref())
>          - the previous lookup did find a record (i.e. table->status==0)
>          - in which case there should be a record ready for reusal
>          (which is asserted with table_ref->has_record) and a new lookup can be
> avoided.
>          But in our case, though the lookup value is the same, there
>          is no record; if t7's table->status was correct it would still
>          have STATUS_NULL_ROW and we wouldn't enter the "else" block.
>          Because t7's table->status is wrongly 0, we enter the "else"
>          and the assertion fires, because there is no record actually.
>
>          This code dates from 2000 at the latest (when code was imported into
> bitkeeper).
>
>          It is unlikely that this causes a real bug in 5.1 or 5.5 because:
>          - those older versions do join buffering only for inner join
>          - in an inner join, if the "ref" access finds no rows, we don't
>          go into the "next" tables, so have no chance to corrupt table->status
>          of the table doing "ref" access.
>       @ sql/sql_select.cc
>          unused flag (was never set)
>       @ sql/sql_select.h
>          JOIN_TAB::status not needed anymore
>       @ sql/structs.h
>          Best-effort documentation of STATUS_ flags, to the extent of what I
> understood.
>          Deleting STATUS_PARENT and STATUS_NOT_READ: they were never set, and this is
> sure:
>          Deleting STATUS_NO_RECORD.
>          I temporarily changed TABLE::status to an enum so that any
>          table->status= 8; /* using 8 instead of STATUS_NOT_READ */
>          would be caught by the compiler.
>
>      modified:
>        mysql-test/include/join_cache.inc
>        mysql-test/r/join_cache_jcl1.result
>        mysql-test/r/join_cache_jcl2.result
>        mysql-test/r/join_cache_jcl3.result
>        mysql-test/r/join_cache_jcl4.result
>        mysql-test/r/join_cache_jcl5.result
>        mysql-test/r/join_cache_jcl6.result
>        mysql-test/r/join_cache_jcl7.result
>        mysql-test/r/join_cache_jcl8.result
>        sql/handler.h
>        sql/item_cmpfunc.cc
>        sql/sql_base.h
>        sql/sql_join_cache.cc
>        sql/sql_select.cc
>        sql/sql_select.h
>        sql/structs.h
>        sql/table.cc
>
<Long test section clipped away>
> === modified file 'sql/handler.h'
> --- a/sql/handler.h	2011-01-26 21:12:56 +0000
> +++ b/sql/handler.h	2011-04-01 09:09:15 +0000
> @@ -1799,9 +1799,12 @@ public:
>     }
>     /**
>        @brief
> -     Positions an index cursor to the index specified in the handle. Fetches the
> -     row if available. If the key value is null, begin at the first key of the
> -     index.
> +     Positions an index cursor to the index specified in the handle
> +     ('active_index'). Fetches the row if available. If the key value is null,
> +     begin at the first key of the index.
> +     To indicate success (found a record), function returns 0 _and_
> +     sets table->status to 0. Otherwise, returns non-zero and sets
> +     table->status to STATUS_NOT_FOUND.
>     */
>     virtual int index_read_map(uchar * buf, const uchar * key,
>                                key_part_map keypart_map,
> @@ -1813,27 +1816,34 @@ public:
>   protected:
>     /**
>        @brief
> -     Positions an index cursor to the index specified in the handle. Fetches the
> -     row if available. If the key value is null, begin at the first key of the
> -     index.
> +     Positions an index cursor to the index specified in parameter. Fetches
nitpick: C++ uses arguments, not parameters.
> +     the row if available. If the key value is null, begin at the first key of
> +     the index.
> +     @returns as index_read_map().
>     */
>     virtual int index_read_idx_map(uchar * buf, uint index, const uchar * key,
>                                    key_part_map keypart_map,
>                                    enum ha_rkey_function find_flag);
> +  /// @returns as index_read_map().
>     virtual int index_next(uchar * buf)
>      { return  HA_ERR_WRONG_COMMAND; }
> +  /// @returns as index_read_map().
>     virtual int index_prev(uchar * buf)
>      { return  HA_ERR_WRONG_COMMAND; }
> +  /// @returns as index_read_map().
>     virtual int index_first(uchar * buf)
>      { return  HA_ERR_WRONG_COMMAND; }
> +  /// @returns as index_read_map().
>     virtual int index_last(uchar * buf)
>      { return  HA_ERR_WRONG_COMMAND; }

Proposal: Add a blank line between the definitions above.
>   public:
> +  /// @returns as index_read_map().
>     virtual int index_next_same(uchar *buf, const uchar *key, uint keylen);
>     /**
>        @brief
>        The following functions works like index_read, but it find the last
>        row with the current key value or prefix.
> +     @returns as index_read_map().
>     */
>     virtual int index_read_last_map(uchar * buf, const uchar * key,
>                                     key_part_map keypart_map)
> @@ -1853,7 +1863,9 @@ public:
>       { return NULL; }
>     virtual int ft_read(uchar *buf) { return HA_ERR_WRONG_COMMAND; }
>   protected:
> +  /// @returns as index_read_map().
>     virtual int rnd_next(uchar *buf)=0;
> +  /// @returns as index_read_map().
>     virtual int rnd_pos(uchar * buf, uchar *pos)=0;
>   public:
>     /**
>
> === modified file 'sql/item_cmpfunc.cc'
> --- a/sql/item_cmpfunc.cc	2011-02-08 15:54:12 +0000
> +++ b/sql/item_cmpfunc.cc	2011-04-01 09:09:15 +0000
> @@ -430,7 +430,8 @@ static bool convert_constant_item(THD *t
>       */
>       bool save_field_value= (field_item->depended_from&&
>                               (field_item->const_item() ||
> -                             !(field->table->status& 
> STATUS_NO_RECORD)));
> +                             !(field->table->status&
> +                               (STATUS_GARBAGE | STATUS_NOT_FOUND))));
>       if (save_field_value)
>         orig_field_val= field->val_int();
>       if (!(*item)->is_null()&&  !(*item)->save_in_field(field, 1))
>
> === modified file 'sql/sql_base.h'
> --- a/sql/sql_base.h	2010-11-23 22:37:59 +0000
> +++ b/sql/sql_base.h	2011-04-01 09:09:15 +0000
> @@ -323,7 +323,7 @@ inline void setup_table_map(TABLE *table
>     table->used_fields= 0;
>     table->const_table= 0;
>     table->null_row= 0;
> -  table->status= STATUS_NO_RECORD;
> +  table->status= STATUS_GARBAGE | STATUS_NOT_FOUND;
>     table->maybe_null= table_list->outer_join;
>     TABLE_LIST *embedding= table_list->embedding;
>     while (!table->maybe_null&&  embedding)
>
> === modified file 'sql/sql_join_cache.cc'
> --- a/sql/sql_join_cache.cc	2010-09-02 07:34:10 +0000
> +++ b/sql/sql_join_cache.cc	2011-04-01 09:09:15 +0000
> @@ -1761,7 +1761,6 @@ enum_nested_loop_state JOIN_CACHE_BNL::j
>   {
>     uint cnt;
>     int error;
> -  JOIN_TAB *tab;
>     READ_RECORD *info;
>     enum_nested_loop_state rc= NESTED_LOOP_OK;
>     SQL_SELECT *select= join_tab->cache_select;
> @@ -1788,18 +1787,9 @@ enum_nested_loop_state JOIN_CACHE_BNL::j
>       join_tab->select->quick= 0;
>     }
>
> -  for (tab= join->join_tab; tab != join_tab ; tab++)
> -  {
> -    tab->status= tab->table->status;
> -    tab->table->status= 0;
> -  }
> -
>     /* Start retrieving all records of the joined table */
>     if ((error= join_init_read_record(join_tab)))
> -  {
> -    rc= error<  0 ? NESTED_LOOP_NO_MORE_ROWS: NESTED_LOOP_ERROR;
> -    goto finish;
> -  }
> +    return error<  0 ? NESTED_LOOP_NO_MORE_ROWS: NESTED_LOOP_ERROR;
>
>     info=&join_tab->read_record;
>     do
> @@ -1811,8 +1801,7 @@ enum_nested_loop_state JOIN_CACHE_BNL::j
>       {
>         /* The user has aborted the execution of the query */
>         join->thd->send_kill_message();
> -      rc= NESTED_LOOP_KILLED;
> -      goto finish;
> +      return NESTED_LOOP_KILLED;
>       }
>
>       /*
> @@ -1826,10 +1815,7 @@ enum_nested_loop_state JOIN_CACHE_BNL::j
>                               
> (!select->skip_record(join->thd,&skip_record)&&
>                                 !skip_record));
>         if (select&&  join->thd->is_error())
> -      {
> -        rc= NESTED_LOOP_ERROR;
> -        goto finish;
> -      }
> +        return NESTED_LOOP_ERROR;
>         if (consider_record)
>         {
>           /* Prepare to read records from the join buffer */
> @@ -1847,7 +1833,7 @@ enum_nested_loop_state JOIN_CACHE_BNL::j
>               get_record();
>               rc= generate_full_extensions(get_curr_rec());
>               if (rc != NESTED_LOOP_OK&&  rc != NESTED_LOOP_NO_MORE_ROWS)
> -              goto finish;
> +              return rc;
>             }
>           }
>         }
> @@ -1856,9 +1842,6 @@ enum_nested_loop_state JOIN_CACHE_BNL::j
>
>     if (error>  0)				// Fatal error
>       rc= NESTED_LOOP_ERROR;
> -finish:
> -  for (tab= join->join_tab; tab != join_tab ; tab++)
> -    tab->table->status= tab->status;
>     return rc;
>   }
>
> @@ -2299,7 +2282,7 @@ enum_nested_loop_state JOIN_CACHE_BKA::j
>
>     rc= init_join_matching_records(&seq_funcs, records);
>     if (rc != NESTED_LOOP_OK)
> -    goto finish;
> +    return rc;
>
>     while (!(error= file->multi_range_read_next((char **)&rec_ptr)))
>     {
> @@ -2307,8 +2290,7 @@ enum_nested_loop_state JOIN_CACHE_BKA::j
>       {
>         /* The user has aborted the execution of the query */
>         join->thd->send_kill_message();
> -      rc= NESTED_LOOP_KILLED;
> -      goto finish;
> +      return NESTED_LOOP_KILLED;
>       }
>       if (join_tab->keep_current_rowid)
>         join_tab->table->file->position(join_tab->table->record[0]);
> @@ -2323,14 +2305,13 @@ enum_nested_loop_state JOIN_CACHE_BKA::j
>         get_record_by_pos(rec_ptr);
>         rc= generate_full_extensions(rec_ptr);
>         if (rc != NESTED_LOOP_OK&&  rc != NESTED_LOOP_NO_MORE_ROWS)
> -	goto finish;
> +        return rc;
>       }
>     }
>
>     if (error>  0&&  error != HA_ERR_END_OF_FILE)	
>       return NESTED_LOOP_ERROR;
> -finish:
> -  return end_join_matching_records(rc);
> +  return rc;
>   }
>
>
> @@ -2375,12 +2356,6 @@ JOIN_CACHE_BKA::init_join_matching_recor
>     /* Dynamic range access is never used with BKA */
>     DBUG_ASSERT(join_tab->use_quick != 2);
>
> -  for (JOIN_TAB *tab =join->join_tab; tab != join_tab ; tab++)
> -  {
> -    tab->status= tab->table->status;
> -    tab->table->status= 0;
> -  }
> -
>     init_mrr_buff();
>
>     /*
> @@ -2398,31 +2373,6 @@ JOIN_CACHE_BKA::init_join_matching_recor
>
>
>   /*
> -  Finish searching for records that match records from the join buffer
> -
> -  SYNOPSIS
> -    end_join_matching_records()
> -      rc      return code passed by the join_matching_records function
> -
> -  DESCRIPTION
> -    This function perform final actions on searching for all matches for
> -    the records from the join buffer and building all full join extensions
> -    of the records with these matches.
> -
> -  RETURN
> -    return code rc passed to the function as a parameter
> -*/
> -
> -enum_nested_loop_state
> -JOIN_CACHE_BKA::end_join_matching_records(enum_nested_loop_state rc)
> -{
> -  for (JOIN_TAB *tab=join->join_tab; tab != join_tab ; tab++)
> -    tab->table->status= tab->status;
> -  return rc;
> -}
> -
> -
> -/*
>     Get the key built over the next record from BKA join buffer
>
>     SYNOPSIS
> @@ -3192,7 +3142,7 @@ JOIN_CACHE_BKA_UNIQUE::join_matching_rec
>
>     rc= init_join_matching_records(&seq_funcs, key_entries);
>     if (rc != NESTED_LOOP_OK)
> -    goto finish;
> +    return rc;
>
>     while (!(error= file->multi_range_read_next((char **)&key_chain_ptr)))
>     {
> @@ -3227,8 +3177,7 @@ JOIN_CACHE_BKA_UNIQUE::join_matching_rec
>         {
>           /* The user has aborted the execution of the query */
>           join->thd->send_kill_message();
> -        rc= NESTED_LOOP_KILLED;
> -        goto finish;
> +        return NESTED_LOOP_KILLED;
>         }
>         /*
>           If only the first match is needed and it has been already found
> @@ -3241,7 +3190,7 @@ JOIN_CACHE_BKA_UNIQUE::join_matching_rec
>           get_record_by_pos(rec_ptr);
>           rc= generate_full_extensions(rec_ptr);
>           if (rc != NESTED_LOOP_OK&&  rc != NESTED_LOOP_NO_MORE_ROWS)
> -	  goto finish;
> +          return rc;
>         }
>       }
>       while (next_rec_ref_ptr != last_rec_ref_ptr);
> @@ -3249,8 +3198,7 @@ JOIN_CACHE_BKA_UNIQUE::join_matching_rec
>
>     if (error>  0&&  error != HA_ERR_END_OF_FILE)	
>       return NESTED_LOOP_ERROR;
> -finish:
> -  return end_join_matching_records(rc);
> +  return rc;
>   }
>
>
>
> === modified file 'sql/sql_select.cc'
> --- a/sql/sql_select.cc	2011-03-03 09:43:14 +0000
> +++ b/sql/sql_select.cc	2011-04-01 09:09:15 +0000
> @@ -11168,7 +11168,7 @@ make_join_readinfo(JOIN *join, ulonglong
>       */
>       tab->sorted= sorted;
>       sorted= 0;                                  // only first must be sorted
> -    table->status=STATUS_NO_RECORD;
> +    table->status= STATUS_GARBAGE | STATUS_NOT_FOUND;
>       pick_table_access_method (tab);
>
>       if (tab->loosescan_match_tab)
> @@ -17960,7 +17960,7 @@ join_read_const_table(JOIN_TAB *tab, POS
>     TABLE *table=tab->table;
>     table->const_table=1;
>     table->null_row=0;
> -  table->status=STATUS_NO_RECORD;
> +  table->status= STATUS_GARBAGE | STATUS_NOT_FOUND;
>
>     if (tab->type == JT_SYSTEM)
>     {
> @@ -18154,7 +18154,7 @@ join_read_key2(JOIN_TAB *tab, TABLE *tab
>       indices on NOT NULL columns (see create_ref_for_key()).
>     */
>     if (cmp_buffer_with_ref(tab->join->thd, table, table_ref) ||
> -      (table->status&  (STATUS_GARBAGE | STATUS_NO_PARENT |
> STATUS_NULL_ROW)))
> +      (table->status&  (STATUS_GARBAGE | STATUS_NULL_ROW)))
>     {
>       if (table_ref->key_err)
>       {
>
> === modified file 'sql/sql_select.h'
> --- a/sql/sql_select.h	2011-03-01 14:57:53 +0000
> +++ b/sql/sql_select.h	2011-04-01 09:09:15 +0000
> @@ -296,7 +296,6 @@ typedef struct st_join_table : public Sq
>
>     table_map	dependent,key_dependent;
>     uint		index;
> -  uint		status;				///<  Save status for cache
>     uint		used_fields,used_fieldlength,used_blobs;
>     uint          used_null_fields;
>     uint          used_rowid_fields;
> @@ -470,7 +469,6 @@ st_join_table::st_join_table()
>       dependent(0),
>       key_dependent(0),
>       index(0),
> -    status(0),
>       used_fields(0),
>       used_fieldlength(0),
>       used_blobs(0),
> @@ -1032,9 +1030,6 @@ protected:
>     enum_nested_loop_state init_join_matching_records(RANGE_SEQ_IF *seq_funcs,
>                                                       uint ranges);
>
> -  /* Finish searching for records that match records from the join buffer */
> -  enum_nested_loop_state end_join_matching_records(enum_nested_loop_state rc);
> -
>   public:
>
>     /*
>
> === modified file 'sql/structs.h'
> --- a/sql/structs.h	2010-10-21 09:49:16 +0000
> +++ b/sql/structs.h	2011-04-01 09:09:15 +0000
> @@ -231,13 +231,20 @@ typedef struct  user_conn {
>   #define REG_SKIP_DUP		256
>
>   	/* Bits in form->status */
> -#define STATUS_NO_RECORD	(1+2)	/* Record isn't usably */
>   #define STATUS_GARBAGE		1
> -#define STATUS_NOT_FOUND	2	/* No record in database when needed */
> -#define STATUS_NO_PARENT	4	/* Parent record wasn't found */
> -#define STATUS_NOT_READ		8	/* Record isn't read */
> -#define STATUS_UPDATED		16	/* Record is updated by formula */
> -#define STATUS_NULL_ROW		32	/* table->null_row is set */
> +/**
> +   Means we were searching for a row and didn't find it. This is used by
> +   storage engines (@see handler::index_read_map()) and the Server layer.
> +*/
> +#define STATUS_NOT_FOUND	2
> +/// Reserved for use by multi-table update. Means the row has been updated.
> +#define STATUS_UPDATED		16
> +/**
> +   Means that table->null_row is set. This is an artificial NULL-filled row
> +   (one example: in outer join, if no match has been found in inner table).
> +*/
> +#define STATUS_NULL_ROW		32
> +/// Reserved for use by multi-table delete. Means the row has been deleted.
>   #define STATUS_DELETED		64

One problem here is that the value zero is a distinct value, but there is no 
documentation. But I do accept your excuse for not extending the documentation 
further.
>
>   /*
>
> === modified file 'sql/table.cc'
> --- a/sql/table.cc	2011-02-09 15:17:14 +0000
> +++ b/sql/table.cc	2011-04-01 09:09:15 +0000
> @@ -3301,7 +3301,7 @@ void TABLE::init(THD *thd, TABLE_LIST *t
>     force_index= 0;
>     force_index_order= 0;
>     force_index_group= 0;
> -  status= STATUS_NO_RECORD;
> +  status= STATUS_GARBAGE | STATUS_NOT_FOUND;

BTW, I replaced all "status= STATUS_GARBAGE | STATUS_NOT_FOUND" assigments with 
"status= STATUS_GARBAGE", and there are no test failures after that... But that 
is not a proof that it is a safe change, of course.
>     insert_values= 0;
>     fulltext_searched= 0;
>     file->ft_handler= 0;
Thanks,
Roy
Thread
bzr commit into mysql-trunk branch (guilhem.bichot:3348) Bug#11766522Guilhem Bichot1 Apr
  • Re: bzr commit into mysql-trunk branch (guilhem.bichot:3348) Bug#11766522Øystein Grøvlen6 Apr
    • Re: bzr commit into mysql-trunk branch (guilhem.bichot:3348) Bug#11766522Guilhem Bichot26 Apr
      • Re: bzr commit into mysql-trunk branch (guilhem.bichot:3348) Bug#11766522Guilhem Bichot26 Apr
  • Re: bzr commit into mysql-trunk branch (guilhem.bichot:3348) Bug#11766522Roy Lyseng8 Apr
    • Re: bzr commit into mysql-trunk branch (guilhem.bichot:3348) Bug#11766522Guilhem Bichot26 Apr