Hi Jason!
The patch is formally correct, but since this code is called for each event, I
think it is more efficient and compact to add a member variable m_estimate, set
it to 0 when constructing the event, and then add the following code just after
unpack_current_row() in find_row() and write_row().
if (m_estimate == 0) {
m_estimate= (m_rows_end - m_curr_row) / (m_curr_row_end - m_curr_row);
m_table->file->ha_start_bulk_insert(m_estimate);
}
Find_row() covers both delete row and update row, while write_row() covers,
well, write row. For a better measure for the update event, you could modify the
estimate after the call to find_row() inside
Update_rows_log_event::do_exec_row() since you there have two records for each
row, but I don't think this is necessary.
That way we avoid unpacking the first row two times. This does not make a big
difference for big events with many small rows, but for small events with big
rows unpacking the row twice is a significant impact.
For example, unpacking a row with just a single insertion of a large BLOB (e.g.,
a document or a video) should not require unpacking the row twice since it will
affect the performance of the system significantly.
Just my few cents,
Mats Kindahl
He Zhenxing wrote:
> #At file:///media/sda3/work/mysql/bzrwork/b35843/5.1-rpl/
>
> 2666 He Zhenxing 2008-09-09
> BUG#35843 Slow replication slave when using partitioned myisam table
>
> In order to improve the performance when replicating to partitioned
> myisam tables with row-based format, a new function 'estimate_rows'
> is added for Rows_log_event to estimate the number of rows of current
> rows log event and use the estimated number to allocate cache memory.
> modified:
> sql/log_event.cc
> sql/log_event.h
>
> === modified file 'sql/log_event.cc'
> --- a/sql/log_event.cc 2008-08-06 10:41:27 +0000
> +++ b/sql/log_event.cc 2008-09-09 09:20:46 +0000
> @@ -7439,8 +7439,28 @@ Write_rows_log_event::Write_rows_log_eve
> #endif
>
> #if !defined(MYSQL_CLIENT) && defined(HAVE_REPLICATION)
> +/*
> + Estimate the number of rows of current event.
> +*/
> +ulong
> +Rows_log_event::estimate_rows(Relay_log_info const *rli)
> +{
> + const uchar *curr_row_end;
> + MY_BITMAP cols;
> + int const result= ::unpack_row(rli, m_table, m_width, m_curr_row,
> + &cols, &curr_row_end, 0);
> + if (curr_row_end > m_rows_end)
> + {
> + my_error(ER_SLAVE_CORRUPT_EVENT, MYF(0));
> + return 0;
> + }
> +
> + ulong rows= (m_rows_end - m_curr_row) / (curr_row_end - m_curr_row);
> + return rows;
> +}
> +
> int
> -Write_rows_log_event::do_before_row_operations(const Slave_reporting_capability
> *const)
> +Write_rows_log_event::do_before_row_operations(const Slave_reporting_capability
> *const log)
> {
> int error= 0;
>
> @@ -7488,7 +7508,10 @@ Write_rows_log_event::do_before_row_oper
> */
> }
>
> - m_table->file->ha_start_bulk_insert(0);
> + ulong rows= estimate_rows((Relay_log_info const*)log);
> + if (!rows)
> + return 1;
> + m_table->file->ha_start_bulk_insert(rows);
> /*
> We need TIMESTAMP_NO_AUTO_SET otherwise ha_write_row() will not use fill
> any TIMESTAMP column with data from the row but instead will use
>
> === modified file 'sql/log_event.h'
> --- a/sql/log_event.h 2008-07-31 06:24:27 +0000
> +++ b/sql/log_event.h 2008-09-09 09:20:46 +0000
> @@ -3428,6 +3428,16 @@ public:
>
> uint m_row_count; /* The number of rows added to the event */
>
> + /**
> + Estimate number of rows included in this Rows_log_event.
> +
> + @param rli Relay log info structure
> +
> + @retval >0 the number of rows in the event
> + @retval 0 the event is corrupted
> + */
> + ulong estimate_rows(Relay_log_info const *rli);
> +
> protected:
> /*
> The constructors are protected since you're supposed to inherit
>
>
--
Mats Kindahl
Lead Software Developer
Replication Team
MySQL AB, www.mysql.com