List:Commits« Previous MessageNext Message »
From:Rafal Somla Date:November 14 2007 10:47am
Subject:Re: bk commit into 5.2 tree (rafal:1.2611) WL#4063
View as plain text  
Hi Lars,

Thanks for the comments. I already implemented some of the Chuck's review 
remarks and few of yours. After finishing with this I need to merge the code 
with the current 6.0-backup tree (it should be trivial as it consists of new 
files mainly) and the patch will be ready to push!

Lars Thalmann wrote:
> Hi Rafal,
> 
> Wow, this was a great patch.  And big too. :)
> 

Thanks!

> It took longer than I expected to review it, but these are some
> preliminary comments.  I'll continue tomorrow.
> 
> Since this a big and important module in the backup framework, I had
> quite some comments.  We can discuss them tomorrow, but probably good
> if you start fixing the ones you agree on right away so that we don't
> loose time. 
> 
> I think you need to fix these
> -----------------------------
> - Remove use of assert() and use DBUG_ASSERT instead.
> 

This and similar requests boil down to the following question (which we should 
perhaps discuss during backup call):

   Do we want to make the backup stream library code independent from the rest of
   the mysql server tree or not?

By "independent" I mean that one can download the source of the library and 
compile it without a need to have any of the mysql libraries or source code 
installed on his computer.

I think we should try to do this, because the library is supposed to be used by 
external programs for reading and analysing the streams produced by our online 
backup system. If possible to write the library so that it is stand-alone, why 
not doing it? It has a clear advantage over the other alternative.

Now, using DBUG_ASSERT, mysys memory allocating functions or any similar things 
will create a dependecy on the mysql tree and make it impossible to compile and 
use the library on its own. I definitely think that this would be worse 
situation than the current one.

> - Make all functions like "bs_rd_int2" static, so that no *user*
>   of the stream library can use them.  Remove them from .h file.
> 

I removed bs*_wr_*() functions but left the functions for reading. Rationale:

   While format in which stream is written should be strict, it is safe and
   potentially useful to let clients read various parts of the stream themselves.
   If doing so, the clients will take responsibility for reading things in
   correct order but they still can't produce wrongly formatted stream. If a
   client doesn't know internal details of the format, it should read stream
   using the high-level functions which know it.

> - In places like "struct st_bs_time" do not use types with non-fixed
>   size (different compilers define types differently), such as
>   "unsigned int".  We need the image format to have defined size of
>   each data.
> 
>   There is actually a choice if you want to fix type size in the
>   kernel-stream interface, or inside the stream module.  You need to
>   store things endian-independent anyway inside the stream module.

I went for the second option. That is, on the level of C interface type sizes 
may vary from platform to platform but the code writes all the values always in 
the same format and translates accordingly during read.

> 
> - Verify that code such as "buf[0]= (time->year>>4) & 0xFF; buf[1]=
>   ((time->year<<4) & 0xF0) | (time->mon &0x0F);" is
>   endian-independent.  The image format needs to be stored in
>   network-independent format, compare with log_event.cc and
>   int2store() functions etc.
>

I was aware of this requirement when writing the code. Do you see any concrete 
problems in that respect?

> - I think that summary could be 1) in preamble, 2) in postamble, or 3)
>   in both.
> 

Good idea. How about always appending it at the end of the stream and optionally 
store also in the preamble. Current code supports reading of summary stored in 
the preamble, but it never writes it there (always at the end). Given this 
limitation, the code already supports (in the sense of being able to read) 
streams which, as you suggest, have summary both in the preamble and at the end.

> - Why is there both binlog pos and binlog group?  With the commit
>   blocker, there should only be group pos.
> 

Are we sure of that? If yes I can remove the binlog pos. Let's discuss it during 
backup call.

> - What is the point of snap_count?  And what is the definition of a
>   table data snapshot, in this context?

Table data snapshot is the thing produced by backup driver and read by restore 
driver. I don't call it image because this name is reserved for the whole backup 
data. The point of snap_count: to know how many snapshots are included in the 
image. E.g., when reading the image, we need to create a restore driver for each 
snapshot present there.

> 
> - I think we can easily have more than 256 item/object types, so
>   perhaps use #define for this and allocate 2 bytes (perhaps even 3 or
>   4 bytes, one never know if a some flags are needed later).
> 

I'm not so sure - I think it is safe to assume that mysql server doesn't have 
more than 256 types of objects (such as tables, databases, views, users etc.). 
However, it is possible that in the future we could use some internal object 
types not present in the server. I would leave extending this for the time when 
we really need it (then we will step the stream format version). But if you 
insist I can change it to 2 bytes now - this will affect size of the preamble 
and this should not affect too much the overall size of the image.


> - In enums with values that are stored on disk, use "NAME = 47"
>   syntax and don't let the computer pick the number.  The enum values
>   should be the same on different machines, but I've seen several
>   times bugs when people are maintaining the code.  Also, auto-merges
>   can screw up the values, so one should always explicitly set the 
>   value to be on the safe side (if value is sent on net/stored on
>   disk).  But, I'm not even sure we should use the enum at all,
>   but instead use #define.
> 

I never write enum values to the stream. There are functions which explicitly 
translate internal enum values into concrete numbers written to the stream 
(e.g., bstream_wr_item_type()).

> - Let an item either belong to a "category" or have a "type".  No need
>   for both types of concepts (it is complex enough anyway :).
> 

Can't be done: for example category of "global items" contains items of 
different types (databases, users, privileges). So item like a database has type 
"database" *and* belongs to category "global items" - skipping one of these will 
be loosing information which is needed in the code.

> - Split header file into a catalog service interface and a backup
>   stream interface.  As it is now, it is confusing since functions
>   "go" in two directions (see comment below on dependencies).
> 

Done: the catalogue (and other) services are declared in stream_v1_services.h.

> - Move abstract stream interface and low-lowel i/o interface away from
>   interface header file.  I think the mysqlbackup client program
>   should only have to include the interface header file and that this
>   header file should be "minimal".
> 

Yes, I removed low-level functions from the header.

> - Use mysys for memory allocation.  The overall memory consumption
>   should be controlled in one place.
> 

See my first comment.

> - Is it ensured that a bad block does not ruin the whole backup?
>   (Unless it is a a header block).  I.e. that the corrept block can
>   be ignored but the rest can still be restored e.g without some 
>   engine subimage?
> 

The format is designed to make it possible to detect and survive such errors. 
But current code doesn't do any kind of error recovery. As soon as an error is 
detected, functions will report it and stream will enter "error state" - no 
further reading/writing will be possible.

> Other comments/suggestions
> --------------------------

I'll deal with these in next iteration.

Rafal

> - s/suhc/such/
> 
> - "These functions are used for memory (de)allocation."
>   s/Module using this library should define them/
>     Any program using this module, need to have these defined/
> 
>   The point is that the kernel is dependent on stream which is
>   dependent on memory allocation.  There is no need that the *kernel*
>   provide memory, it could be some other module that is linked with 
>   the kernel and stream module .  I think this way of seeing things:
>   1) Is easier to understand
>   2) Is more flexible
> 
>   And, in the same way: "backup kernel" depend on "stream" which
>   depend on "catalog service interface".  The latter is usually
>   provided by the kernel, but need not to be (e.g. for exotic
>   testing, I think you did something like this already?).
> 
> - Default block size can probably be at least 32 kb.
> 
> - s/BASE TYPES/BASIC TYPES/
> 
> - s/main parts/main chunks/   (Lets try to minimize number of new
>   concepts and only use "chunk")
> 
> - I wonder if the concept set { part, chunk, fragment, block,
>   preamble, catalog, snapshot, header, fragment header } can be
>   simplified.
> 
> - Why is the name of the storage engine a blob and not a string? 
>   Same question for version "extra".
> 
> - Some file like "stream_v1.h" should be the definition of the API
>   between kernel and stream.  (If we change the image format, to new
>   version, we might still use the same api).
> 
> - In the header file, move the simpler type to start of file to make
>   it easier to read it.
> 
> - The code is prepared so that we can have different error codes in
>   the future.  I think this is were we need to go mid-term, so this is
>   excellent.
> 
> - I like to call the atomic parts in the backup "objects".  I've asked
>   Robin to comment on what he thinks we should call them.  Items are
>   probably fine though.
> 
> - I don't think there is a point in having "v1" in the file names.  We
>   don't know how the image format will evolve and it is easy to rename
>   files later if needed.
> 
> Best wishes,
> Lars
> 
> 
> On Fri, Oct 26, 2007 at 02:02:11PM +0200, rsomla@stripped wrote:
>> Below is the list of changes that have just been committed into a local
>> 5.2 repository of rafal. When rafal does a push these changes will
>> be propagated to the main repository and, within 24 hours after the
>> push, to the public repository.
>> For information on how to access the public repository
>> see http://dev.mysql.com/doc/mysql/en/installing-source-tree.html
>>
>> ChangeSet@stripped, 2007-10-26 14:02:05+02:00, rafal@quant.(none) +5 -0
>>   WL#4063 (Online Backup: Finalize backup stream format)
>>   
>>   This patch introduces the backup stream library for reading/writing backup 
>>   image using version 1 of the format. The format is documented in WL#4063 and
>>   in source comments.
>>
>>   sql/backup/CMakeLists.txt@stripped, 2007-10-26 14:02:02+02:00, rafal@quant.(none) +3
> -0
>>     Added definition of backupstream library.
>>
>>   sql/backup/Makefile.am@stripped, 2007-10-26 14:02:02+02:00, rafal@quant.(none) +43
> -36
>>     Fixed spacing and added definition of backupstream library.
>>
>>   sql/backup/stream_v1.c@stripped, 2007-10-26 14:02:02+02:00, rafal@quant.(none) +1831
> -0
>>     Implementation of backup stream API - high level functions.
>>
>>   sql/backup/stream_v1.c@stripped, 2007-10-26 14:02:02+02:00, rafal@quant.(none) +0
> -0
>>
>>   sql/backup/stream_v1.h@stripped, 2007-10-26 14:02:02+02:00, rafal@quant.(none) +644
> -0
>>     Backup stream API (types and functions).
>>
>>   sql/backup/stream_v1.h@stripped, 2007-10-26 14:02:02+02:00, rafal@quant.(none) +0
> -0
>>
>>   sql/backup/stream_v1_transport.c@stripped, 2007-10-26 14:02:02+02:00,
> rafal@quant.(none) +1228 -0
>>     Implementation of backup stream API - transport layer.
>>
>>   sql/backup/stream_v1_transport.c@stripped, 2007-10-26 14:02:02+02:00,
> rafal@quant.(none) +0 -0
> 
Thread
Re: bk commit into 5.2 tree (rafal:1.2611) WL#4063Rafal Somla14 Nov
  • Re: bk commit into 5.2 tree (rafal:1.2611) WL#4063Rafal Somla14 Nov