Hi Rafal!
I've made the changes as you suggested except for changing the signature of
unpack_row() as we discussed. Here is the latest patch for your review. I've
made some comments below as well.
http://lists.mysql.com/commits/35939
Let me know if you have any questions.
Thanks,
Chuck
> -----Original Message-----
> From: Rafal Somla [mailto:rsomla@stripped]
> Sent: Tuesday, October 16, 2007 5:20 AM
> To: cbell@stripped
> Cc: commits@stripped
> Subject: Re: bk commit into 5.2 tree (cbell:1.2606) BUG#31538
>
> Chuck,
>
> I have few comments below. I think it would be also good to
> change the signature of unpack_row() as I've discussed with
> Mats in my previous emails:
>
> Rafal wrote:
> > I have a proposition to remove the rli parameter and
> replace it by (pointer to) table_def structure:
> >
> > unpack_row(table_def const *tabledef,
> > TABLE *table, uint const colcnt,
> > uchar const *const row_data, MY_BITMAP const *cols,
> > uchar const **const row_end, ulong *const
> > master_reclength)
> >
> > With documentation along these lines:
> >
> > /**
> > ...
> > @param tabledef Definition of the table from which the
> packed row comes
> > @param table Table to unpack into
> > ...
> >
> > @note table should be compatibile with tabledef. If
> tabledef is NULL
> > then the definition of table (stored in the TABLE
> structure) is used.
> > ...
> > */
> >
>
> (maybe second sentence in the note will be better to
> formulate like this:
> "If tabledef is NULL then it is assumed that the packed row
> comes from the
> table to which it is unpacked")
Note: I've changed the documentation to make a bit more sense.
>
> Mats has agreed with these changes so let's use that occasion
> to improve the interface.
>
> cbell@stripped wrote:
> > Below is the list of changes that have just been committed into a
> > local
> > 5.2 repository of cbell. When cbell does a push these
> changes will be
> > propagated to the main repository and, within 24 hours
> after the push,
> > to the public repository.
> > For information on how to access the public repository see
> > http://dev.mysql.com/doc/mysql/en/installing-source-tree.html
> >
> > ChangeSet@stripped, 2007-10-11 17:37:01-04:00,
> cbell@mysql_cab_desk. +5 -0
> > BUG#31538 Backup file is too large.
> >
> > This patch alters the internal row format of the default
> and consistent snapshot driver to use
> > the same format as that of the binary log (RBR).
> >
> > mysql-test/r/backup.result@stripped, 2007-10-11
> 17:36:48-04:00, cbell@mysql_cab_desk. +8 -8
> > BUG#31538 Backup file is too large.
> >
> > New result file needed because size of backup file changed.
> >
> > mysql-test/r/backup_snapshot.result@stripped, 2007-10-11
> 17:36:49-04:00, cbell@mysql_cab_desk. +4 -4
> > BUG#31538 Backup file is too large.
> >
> > New result file needed because size of backup file changed.
> >
> > sql/backup/be_default.cc@stripped, 2007-10-11 17:36:50-04:00,
> cbell@mysql_cab_desk. +80 -8
> > BUG#31538 Backup file is too large.
> >
> > This patch adds a pack method for backup and an unpack
> method for restore. These methods
> > encapsulate the code needed to access and use the
> pack_row and unpack_row replication methods
> > that generate the binary log format.
> >
> > sql/backup/be_default.h@stripped, 2007-10-11 17:36:51-04:00,
> cbell@mysql_cab_desk. +4 -0
> > BUG#31538 Backup file is too large.
> >
> > Added method declarations for new pack and unpack methods.
> >
> > sql/rpl_record.cc@stripped, 2007-10-11 17:36:50-04:00,
> cbell@mysql_cab_desk. +14 -3
> > BUG#31538 Backup file is too large.
> >
> > This patch changes the unpack_row() method to allow it
> to be used when there is no tabledef or
> > relay log info structure. In that case, the extra cols
> on master additions are ignored.
> >
>
> > diff -Nrup a/sql/backup/be_default.cc b/sql/backup/be_default.cc
> > --- a/sql/backup/be_default.cc 2007-07-02 13:42:58 -04:00
> > +++ b/sql/backup/be_default.cc 2007-10-11 17:36:50 -04:00
> > @@ -67,6 +67,7 @@
> > #include "backup_engine.h"
> > #include "be_default.h"
> > #include "backup_aux.h"
> > +#include "rpl_record.h"
> >
> > namespace default_backup {
> >
> > @@ -200,6 +201,38 @@ int Backup::next_table()
> > DBUG_RETURN(0);
> > }
> >
> > +#define BITMAP_STACKBUF_SIZE (128/8)
>
> Please document the constant.
>
> > +
> > +/**
> > + * @brief Pack the data for a row in the table.
> > + *
> > + * This method uses the binary log methods to pack a row from the
> > + * internal row format to the binary log format.
> > + *
> > + * @returns Size of packed row.
> > + */
> > +uint Backup::pack(byte *rcd, byte *packed_row) {
> > + uint size= 0;
> > + int error= 0;
> > +
> > + DBUG_ENTER("Default_backup::pack_row(byte *rcd, byte
> > +*packed_row)");
> > + if (cur_table)
> > + {
> > + MY_BITMAP cols;
> > + /* Potential buffer on the stack for the bitmap */
> > + uint32 bitbuf[BITMAP_STACKBUF_SIZE/sizeof(uint32)];
> > + uint n_fields= cur_table->s->fields;
> > + my_bool use_bitbuf= n_fields <= sizeof(bitbuf) * 8;
> > + error= bitmap_init(&cols, use_bitbuf ? bitbuf : NULL,
> (n_fields + 7) & ~7UL, FALSE);
> > + bitmap_set_all(&cols);
> > + size= pack_row(cur_table, &cols, packed_row, rcd);
> > + if (!use_bitbuf)
> > + bitmap_free(&cols);
> > + }
> > + DBUG_RETURN(size);
> > +}
> > +
> > /**
> > * @brief Get the data for a row in the table.
> > * This method is the main method used in the backup
> operation. It
> > is @@ -316,16 +349,21 @@ result_t Backup::get_data(Buffer &buf)
> > if ((size + META_SIZE) <= buf.size)
> > {
> > *buf.data= RCD_ONCE; //only part 1 of 1
> > - memcpy((byte *)buf.data + META_SIZE,
> cur_table->record[0], size);
> > - buf.size = size + META_SIZE;
> > + int packed_size= 0;
> > + packed_size= pack(cur_table->record[0], buf.data +
> META_SIZE);
> > + buf.size = packed_size + META_SIZE;
>
> I'm concerned about buffer overrun. I think pack just assumes
> that there is enough space to fit the row, but are you
> checking that you have enough free space in buf? As it is not
> known how long the packed row will be (I think) one probably
> needs a safe estimete like table->rec_length + some margin.
Ok. But size = reclength in a previous statement. So the condition (size +
META_SIZE) <= buf.size checks for overrun in the buffer. The pack_row will
create a packed row that is <= the original size so this check is sufficient
I think.
>
> > mode= CHECK_BLOBS;
> > }
> > else
> > {
> > size_t rec_size= 0;
> > byte *rec_ptr= 0;
> > + byte *packed_ptr= 0;
> > + int packed_size= 0;
> >
> > - rec_buffer.initialize(cur_table->record[0], size);
> > + rec_buffer.initialize(size);
> > + packed_ptr= rec_buffer.get_base_ptr();
> > + packed_size= pack(cur_table->record[0], packed_ptr);
> > rec_size= rec_buffer.get_next((byte **)&rec_ptr,
> > (buf.size - META_SIZE));
> > *buf.data= RCD_FIRST; // first part @@ -570,6
> +608,40 @@ int
> > Restore::next_table() }
> >
> > /**
> > + * @brief Unpack the data for a row in the table.
> > + *
> > + * This method uses the binary log methods to unpack a
> row from the
> > + * binary log format to the internal row format.
> > + *
> > + * @retval 0 no errors.
> > + * @retval !0 errors during unpack_row().
> > + */
> > +uint Restore::unpack(byte *packed_row) {
> > + uint size= 0;
> > + int error= 0;
> > + const uchar *cur_row_end;
> > +
> > + DBUG_ENTER("Default_backup::unpack(byte *packed_row,
> byte *rcd)");
> > + if (cur_table) {
> > + MY_BITMAP cols;
> > + /* Potential buffer on the stack for the bitmap */
> > + uint32 bitbuf[BITMAP_STACKBUF_SIZE/sizeof(uint32)];
> > + uint n_fields= cur_table->s->fields;
> > + my_bool use_bitbuf= n_fields <= sizeof(bitbuf) * 8;
> > + error= bitmap_init(&cols, use_bitbuf ? bitbuf : NULL,
> (n_fields + 7) & ~7UL, FALSE);
> > + bitmap_set_all(&cols);
> > + ulong length;
> > + error= unpack_row(NULL, cur_table, n_fields,
> packed_row, &cols, &cur_row_end, &length);
> > + size= (uint)length;
>
> I understand that you need 'length' variable to call
> unpack_row, but what is 'size' for?
>
> > + if (!use_bitbuf)
> > + bitmap_free(&cols);
> > + }
> > + DBUG_RETURN(error);
> > +}
> > +
> > +/**
> > * @brief Restore the data for a row in the table.
> > *
> > * This method is the main method used in the restore
> operation. It
> > is @@ -692,14 +764,14 @@ result_t Restore::send_data(Buffer &buf)
> > */
> > case RCD_ONCE:
> > {
> > - if (size == cur_table->s->reclength)
> > + uint error= unpack((byte *)buf.data + META_SIZE);
> > + if (error)
> > + DBUG_RETURN(ERROR);
> > + else
> > {
> > - memcpy(cur_table->record[0], (byte *)buf.data +
> META_SIZE, size);
> > mode= CHECK_BLOBS;
> > DBUG_RETURN(PROCESSING);
> > }
> > - else
> > - DBUG_RETURN(ERROR);
> > }
> >
> > /*
> > @@ -730,7 +802,7 @@ result_t Restore::send_data(Buffer &buf)
> > {
> > rec_buffer.put_next((byte *)buf.data + META_SIZE, size);
> > ptr= (byte *)rec_buffer.get_base_ptr();
> > - memcpy(cur_table->record[0], ptr, cur_table->s->reclength);
> > + unpack(ptr);
> > rec_buffer.reset();
> > mode= CHECK_BLOBS;
> > }
>
> > diff -Nrup a/sql/backup/be_default.h b/sql/backup/be_default.h
> > --- a/sql/backup/be_default.h 2007-07-02 13:42:58 -04:00
> > +++ b/sql/backup/be_default.h 2007-10-11 17:36:51 -04:00
> > @@ -136,6 +136,8 @@ class Backup: public Backup_driver
> > byte *ptr; ///< Pointer to blob
> data from record.
> > TABLE_LIST *all_tables; ///< Reference to list
> of tables used.
> > TABLE_LIST *tables_in_backup; ///< List of tables
> used in backup.
> > +
> > + uint pack(byte *rcd, byte *packed_row);
> > };
> >
> > /**
> > @@ -193,6 +195,8 @@ class Restore: public Restore_driver
> > int blob_ptr_index; ///< Position in blob
> pointer list
> > THD *m_thd; ///< Pointer to current
> thread struct.
> > TABLE_LIST *all_tables; ///< Reference to list
> of tables used.
> > +
> > + uint unpack(byte *packed_row);
> > };
> > } // default_backup namespace
> >
>
> > diff -Nrup a/sql/rpl_record.cc b/sql/rpl_record.cc
> > --- a/sql/rpl_record.cc 2007-08-29 17:28:34 -04:00
> > +++ b/sql/rpl_record.cc 2007-10-11 17:36:50 -04:00
> > @@ -208,7 +208,14 @@ unpack_row(Relay_log_info const *rli,
> > // The "current" null bits
> > unsigned int null_bits= *null_ptr++;
> > uint i= 0;
> > - table_def *tabledef= ((Relay_log_info*)rli)->get_tabledef(table);
> > +
> > + /*
> > + Allow the use of this function without a tabledef.
> This will allow calls to
> > + unpack a row for an existing table, e.g. in online
> backup drivers.
> > + */
>
> If we change signature of unpack_row, the comment should be
> updated to be consistent.
>
> > + table_def *tabledef= 0;
> > + if (rli)
> > + tabledef= ((Relay_log_info*)rli)->get_tabledef(table);
>
> Why not like this:
>
> table_def *tabledef= rli? ... : NULL;
>
> then you don't need if statement.
>
> > for (field_ptr= begin_ptr ; field_ptr < end_ptr &&
> *field_ptr ; ++field_ptr)
> > {
> > Field *const f= *field_ptr;
> > @@ -246,7 +253,9 @@ unpack_row(Relay_log_info const *rli,
> > bool save= table->s->db_low_byte_first;
> > table->s->db_low_byte_first= TRUE; #endif
> > - uint16 const metadata= tabledef->field_metadata(i);
> > + uint16 metadata= 0;
> > + if (tabledef) // if there is a
> tabledef passed in, use it.
> > + metadata= tabledef->field_metadata(i);
>
> Using this construct
>
> uint16 const metadata= tabledef? 0 : .... ;
>
> allows to preserve the const declaration.
>
> > if (tabledef && metadata)
> > pack_ptr= f->unpack(f->ptr, pack_ptr, metadata);
> > else
> > @@ -264,7 +273,9 @@ unpack_row(Relay_log_info const *rli,
> > /*
> > throw away master's extra fields
> > */
> > - uint max_cols= min(tabledef->size(), cols->n_bits);
> > + uint max_cols= 0;
> > + if (tabledef)
> > + max_cols= min(tabledef->size(), cols->n_bits);
> > for (; i < max_cols; i++)
> > {
> > if (bitmap_is_set(cols, i))
> >
> >
>
> Rafal
>