List:Commits« Previous MessageNext Message »
From:Luís Soares Date:April 23 2010 3:49pm
Subject:Re: bzr commit into mysql-5.1-rpl-wl5092 branch (luis.soares:3186)
Bug#49100
View as plain text  
Hi Alfranio,

  please have a look at the patch (from 21st Apr.): 
  http://lists.mysql.com/commits/106164

  It should address all your comments.

Thanks,
Luís

On Fri, 2010-04-23 at 16:36 +0100, Alfranio Correia wrote:
> Hi Luis,
> 
> See some comments in-line.
> 
> Cheers.
> 
> Luís Soares wrote:
> > Hi Alfranio,
> > 
> >   Before proceeding some considerations about the server 
> > behavior before this patch and when tested against the
> > tests cases presented:
> > 
> > case #1: Master and slave(s) become out of sync.
> > 
> > case #2: Master and first slave in sync, but third slave is out
> >          of sync.
> > 
> > case #3: triggers assertion in 
> > #7  0x00000000007ef82d in Rows_log_event::unpack_current_row
> > (this=0x32f95d8, rli=0x331ac60, cols=0x32f9660, 
> >     abort_on_warning=false) at log_event.h:3631
> > 
> > case #4: although AI has data only for columns that slave does
> >          not know about (thence it should do nothing), it still 
> >          reports an error.
> > 
> > Last_SQL_Error	Could not execute Update_rows event on table test.t1;
> > Can't find record in 't1', Error_code: 1032; handler error
> > HA_ERR_END_OF_FILE; the event's master log mysqld-bin.000001,
> > end_log_pos 555
> > 
> > case #5: there is no bug here. We are just adding the test case
> >          to cover this case. In this case, despite the fact that 
> >          there is valid data for columns on the slave's table, the
> >          slave cannot find the record because there is no valid 
> >          data in the BI to use when searching for the record. 
> >          Thence we just test that the slave stops (by adding the 
> >          test case).
> > 
> > Some more comments inline.
> > 
> > On Tue, 2010-04-20 at 17:33 +0100, Alfranio Correia wrote:
> >> Hi Luis,
> >>
> >> Not approved.
> >>
> >> The code seems ok but we need to improve the comments and clarify
> >> some points. See comments in-line
> >>
> >> Cheers.
> >>
> >> Luis Soares wrote:
> >>> #At
> file:///home/lsoares/Workspace/bzr/work/features/wl5092/mysql-5.1-rpl-wl5092/ based on
> revid:luis.soares@stripped
> >>>
> >>>  3186 Luis Soares	2010-03-31
> >>>       BUG#49100: RBR: Unexpected behavior when AI contains no usable
> >>>                  data for slave columns
> >>>       
> >>>       RBR was not fully compliant with empty before and after
> >>>       images (BI and AI) in row events. Several faulty scenarios
> >>>       surfaced during WL#5092 implementation, especially when MINIMAL
> >>>       images were used:
> >>>       
> >>>         S1. Inserting an empty row on the master would cause slave to
> >>>             skip the event (empty AI);
> >>>       
> >>>             Expected: slave(s) shall apply the insert and fill its
> >>>             table with default values as master did.
> >>>       
> >>>         S2. Master inserts values just for columns that slave doesn't
> >>>             have (unusable AI at the slave). 
> >> Please, describe what was the bug. Most likely, it is equivalent to S1
> >> but it is better to make this clear.
> > 
> > No it's not the same thing. In S1 the event is skipped, while in 
> > S2 the event is not skipped.
> 
> As you said S1 and S2 go out of sync.
> This is exactly what I've requested to be added to the changeset comments and
> what I meant by equivalent.
> 
> > 
> >>>       
> >>>             Expected: slave(s) shall apply the insert and fill its
> >>>             table with default values as master did for columns that
> >>>             user did not provide data.
> >>>        
> >>>         S3. Master updates columns that only it knows about (unusable
> >>>             AI at the slave) but references columns that it shares with
> >>>             the slave (usable BI at the slave). This would trigger an
> >>>             assertion on a third slave in chained replication: m ->
> s1
> >>>             -> s2, s2 would abort.
> >>>       
> >>>             Expected: slave s1 shall not update any row and slave s2
> >>>             shall not assert.
> >>>       
> >>>         S4. Master updates and references columns that only it knows
> >>>             about (unusable AI and BI at the slave).
> >>
> >> The same here. See above.
> > 
> > No this (S4) is the case the master uses for reading and writing columns
> > that only it knows about. Whereas the previous (S3) is master reads
> > columns that the slave knows about but writes into columns that only
> > master is aware of...
> 
> I just asked to describe the bug. I did not say that they are equivalent.
> 
> "The same here. See above" means "please, describe the visible effects of the
> bug.", i.e. slave stops, crashes, whatever happens.
> 
> > 
> >>>       
> >>>             Expected: slave(s) shall skip the event and not update its
> >>>             tables with any value.
> >>>       
> >>>         S5. Master references columns that only it knows
> >>>             about (unusable BI at the slave) and updates columns shared
> >>>             with the slave (usable AI at the slave). Slave must stop.
> >>>       
> >> What was the error before the patch?
> > 
> > No error (in WL#5092 tree, have not tested in legacy 6.0), we're 
> > just adding test coverage in this patch.
> 
> 
> ok. Add this to the comments.
> 
> > 
> >>>             Expected: slave(s) shall fail with HA_ERR_END_OF_FILE.
> >>>       
> >>>       We fix S1 to S5 by relaxing some asserts that were triggered on
> >>>       when processing empty images. Additionally, on the
> >>>       Update_rows_log_event::do_exec_row we deployed a check the
> >>>       verifies if there is any value for AI. If there is not, then the
> >>>       event is skipped (neither BI nor AI is are processed).
> >> Are S3 and S4 only about asserts? I don't think so. Please, update the
> summary of
> >> the solutions.
> > 
> > They are completely different issues. S3 is about the assertion 
> > on the 2nd slave. While S4 is about the slave stopping unexpectedly. 
> 
> That's what I thought but reading your comments it seems that S1 - S5 are fixed
> by only relaxing assertions. So, I am asking to improve the comments.
> 
> > 
> >>>       
> >>>       For mysqlbinlog --verbose --base64=decode-rows we print out an
> >>>       INSERT INTO t VALUES () when a write rows event contains an
> >>>       empty AI.
> >> ok.
> >>
> >>>      @ mysql-test/suite/rpl/r/rpl_row_img.result
> >>>         Updates to result files.
> >>>      @ mysql-test/suite/rpl/t/rpl_row_img.test
> >>>         Added test cases.
> >>>      @ sql/log_event.cc
> >>>         1. Added printout for INSERT INTO ... VALUES () when AI is empty
> for
> >>>            Write_rows_log_events.
> >>>         2. Changed assertion from 'm_curr_row < ...' to 'm_curr_row
> <= ...'
> >>>         3. Calculation of estimated rows for bulk insert is now
> conditional,
> >>>            and dependent of whether m_curr_row < m_rows_end or not.
> >>>         4. In the Update_rows_log_event, changed it to take into
> consideration
> >>>            if there are any values in AI. If there are not, then skip
> event.
> >>>      @ sql/log_event.h
> >>>         Changed assertion to from 'm_curr_row < ...' to ' m_curr_row
> <= ...'
> >>>      @ sql/rpl_record.cc
> >>>         Do not unpack if master did not send any bit set in the
> write_set.
> >>>
> >>>     modified:
> >>>       mysql-test/suite/rpl/r/rpl_row_img.result
> >>>       mysql-test/suite/rpl/t/rpl_row_img.test
> >>>       sql/log_event.cc
> >>>       sql/log_event.h
> >>>       sql/rpl_record.cc
> >>> === modified file 'mysql-test/suite/rpl/r/rpl_row_img.result'
> >>> --- a/mysql-test/suite/rpl/r/rpl_row_img.result	2010-03-30 16:06:36
> +0000
> >>> +++ b/mysql-test/suite/rpl/r/rpl_row_img.result	2010-03-30 23:19:25
> +0000
> >>> @@ -24179,3 +24179,259 @@ FLUSH TABLES;
> >>>  SHOW VARIABLES LIKE 'binlog_row_image';
> >>>  Variable_name	Value
> >>>  binlog_row_image	FULL
> >>> +include/stop_slave.inc
> > 
> > [snip]
> > 
> >>> @@ -8695,7 +8706,11 @@ Rows_log_event::write_row(const Relay_lo
> >>>      /* this is the first row to be inserted, we estimate the rows with
> >>>         the size of the first row and use that value to initialize
> >>>         storage engine for bulk insertion */
> >>> -    ulong estimated_rows= (m_rows_end - m_curr_row) / (m_curr_row_end -
> m_curr_row);
> >>> +    DBUG_ASSERT(!(m_curr_row > m_curr_row_end));
> >>> +    ulong estimated_rows= 1;
> >>> +    if (m_curr_row < m_curr_row_end)
> >>> +      estimated_rows= (m_rows_end - m_curr_row) / (m_curr_row_end -
> m_curr_row);
> >>> +
> >>
> >> Please, rewrite it as follows:
> >>
> >>
> >> +    if (m_curr_row < m_curr_row_end)
> >> +      estimated_rows= (m_rows_end - m_curr_row) / (m_curr_row_end -
> m_curr_row);
> >> +    else if (m_curr_row = m_curr_row_end)
> >> +      ulong estimated_rows= 1;
> >>
> >>
> >> It think it would be easier to read.
> > 
> > OK, will do, but will change 
> >   ulong estimated_rows= 1; 
> > to 
> >   estimated_rows= 1;
> > in the last line. ;)
> 
> yeap.
> 
> > 
> >>>      m_table->file->ha_start_bulk_insert(estimated_rows);
> >>>    }
> >>>    
> >>> @@ -9665,8 +9680,34 @@ int 
> >>>  Update_rows_log_event::do_exec_row(const Relay_log_info *const rli)
> >>>  {
> >>>    DBUG_ASSERT(m_table != NULL);
> >>> +  int error= 0;
> >>> +
> >>> +  /* check if update contains only values (for both BI and AI)
> >>> +     for columns that do not exist on the slave. If they do, we
> >>> +     can just unpack the rows and return (do nothing on the local 
> >>> +     table).
> >>> +
> >>> +     We can optimize and check only if there are usable values on 
> >>> +     the AI. However, if no good values on AI but we have values 
> >>> +     on BI, we will not be letting the slave thread to search the
> >>> +     SE for the row. This means that we may be loosing a bit on 
> >>> +     the safety side!
> >>> +     
> >>> +     After all, find_row is a consistency check as well.
> >>> +  */
> >> These comments are difficult to read. Please, improve them.
> >> Does this handle S3 and S4?
> > 
> > S4. I think you refer to the second paragraph as being hard 
> > to read...
> > 
> > Will improve it.
> > 
> >>> +  if (/*!is_any_column_signaled_for_table(m_table, &m_cols)
> &&*/ /* BI */
> >>> +      !is_any_column_signaled_for_table(m_table, &m_cols_ai) /* AI
> */)
> >>> +  {
> >>> +    /* read and discard images, because they don't contain any 
> >>> +       useful information for this server table */
> >>> +    error = unpack_current_row(rli, &m_cols);
> >>> +    m_curr_row= m_curr_row_end;
> >>> +    error = error | unpack_current_row(rli, &m_cols_ai);
> >>> +
> >>> +    return error;
> >>> +  }
> >> See above.
> > 
> > What should I see? Is it hard to read? Do you want me to remove
> > the comment in the 'if' ?
> > 
> > I'll assume that you want me to remove the comment on the 'if'.
> > I only added for reviewing purposes anyway...
> > 
> 
> ok.
> 
> 
> >>>  
> >>> -  int error= find_row(rli); 
> >>> +  error= find_row(rli); 
> >>>    if (error)
> >>>    {
> >>>      /*
> >> ok.
> >>
> >>> === modified file 'sql/log_event.h'
> >>> --- a/sql/log_event.h	2010-03-04 17:00:32 +0000
> >>> +++ b/sql/log_event.h	2010-03-30 23:19:25 +0000
> >>> @@ -3628,7 +3628,7 @@ protected:
> >>>      DBUG_ASSERT(m_table);
> >>>  
> >>>      bool first_row= (m_curr_row == m_rows_buf);
> >>> -    ASSERT_OR_RETURN_ERROR(m_curr_row < m_rows_end,
> HA_ERR_CORRUPT_EVENT);
> >>> +    ASSERT_OR_RETURN_ERROR(m_curr_row <= m_rows_end,
> HA_ERR_CORRUPT_EVENT);
> >>>      int const result= ::unpack_row(rli, m_table, m_width, m_curr_row,
> cols,
> >>>                                     &m_curr_row_end,
> &m_master_reclength,
> >>>                                     abort_on_warning, first_row);
> >> ok.
> >>
> >>> === modified file 'sql/rpl_record.cc'
> >>> --- a/sql/rpl_record.cc	2010-03-04 17:00:32 +0000
> >>> +++ b/sql/rpl_record.cc	2010-03-30 23:19:25 +0000
> >>> @@ -208,6 +208,20 @@ unpack_row(Relay_log_info const *rli,
> >>>    uchar const *null_ptr= row_data;
> >>>    uchar const *pack_ptr= row_data + master_null_byte_count;
> >>>  
> >>> +  if (bitmap_is_clear_all(cols))
> >>> +  {
> >>> +    /**
> >>> +       There was no data sent from the master, so there is 
> >>> +       nothing to unpack. In case we are unpacking an AI (for
> >>> +       update or insert purposes), we just use the default 
> >> Is this comment right? Default value for an update?
> > 
> > On an update one unpacks the row on top of the record that 
> > the engine has returned in find_row. See:
> > 
> > Update_rows_log_event::do_exec_row
> > 
> > So this comment is misleading...
> > 
> >>> +       values from the slave to do the insert.
> >> Updates ---> Inserts?
> >>
> >> Please, improve these comments.
> > 
> > Again, these comments are misleading... Perhaps the reference 
> > to AI is probably not worth the trouble... I'll remove it.
> > 
> > As I take it code-wise (apart from the refactoring above in one 'if')
> > the patch is ok, you just need me to improve some comments.
> > 
> > Will rewrite the comments do the small change and recommit.
> 
> 
> ok.
> 
> > 
> > Thanks,
> > Luís
> > 
> >>> +    */
> >>> +    *row_end= pack_ptr;
> >>> +    *master_reclength= 0;
> >>> +    DBUG_RETURN(error);
> >>> +  }
> >>> +
> >>> +
> >>>    Field **const begin_ptr = table->field;
> >>>    Field **field_ptr;
> >>>    Field **const end_ptr= begin_ptr + colcnt;
> >>>
> >>>
> >>>
> ------------------------------------------------------------------------
> >>>
> >>> This body part will be downloaded on demand.
> >> -- 
> >> MySQL Code Commits Mailing List
> >> For list archives: http://lists.mysql.com/commits
> >> To unsubscribe:    http://lists.mysql.com/commits?unsub=1
> >>
> > 
> > 


Thread
bzr commit into mysql-5.1-rpl-wl5092 branch (luis.soares:3186)Bug#49100Luis Soares31 Mar
  • Re: bzr commit into mysql-5.1-rpl-wl5092 branch (luis.soares:3186)Bug#49100Alfranio Correia20 Apr
    • Re: bzr commit into mysql-5.1-rpl-wl5092 branch (luis.soares:3186)Bug#49100Luís Soares21 Apr
      • Re: bzr commit into mysql-5.1-rpl-wl5092 branch (luis.soares:3186)Bug#49100Luís Soares21 Apr
      • Re: bzr commit into mysql-5.1-rpl-wl5092 branch (luis.soares:3186)Bug#49100Alfranio Correia23 Apr
        • Re: bzr commit into mysql-5.1-rpl-wl5092 branch (luis.soares:3186)Bug#49100Luís Soares23 Apr