List:Commits« Previous MessageNext Message »
From:Øystein Grøvlen Date:December 11 2008 2:24pm
Subject:Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)
WL#4630
View as plain text  
Ingo,

Thanks, for this thorough and well tested patch.  I have not been
through all of it, yet, but here are the comments and questions I have
so far:

  * The name: With a client that is called mysqlbackup, I would expect
    to be able to perform backup.  Has another name been considered?

  * Some questions from playing a bit with the client:

      - What does "extra data length" mean?  And why is it always 0?
      - Finish time is always 0.  Why?  It is set correctly in the
        backup_history table for the same backup.
      - If I use data-chunks option, there are several entries for each
        chunk:
          Data chunks:
            Chunk 1 has 29 bytes for table 'test'.'t'
            Chunk 1 has 1025 bytes for table 'test'.'t'
        This seems a bit strange.  Is it correct?
      - What is --exact supposed to do?

  * All this error injection seems a bit of an overkill for a client
    since in the event of failure all you need is to be able to stop
    and report the error.  For a server that need to keep on going, I
    see that it is necessary to test that errors does not take it down.

  * I am a bit puzzled as to why we need interface code on both side
    the fence here.  stream_backup.c contains several functions that in
    effect does nothing, but call functions in stream_v1.c.  (E.g.,
    backup_read_summary is one single method call plus 20 lines of
    initialization, debug code, and error testing.)  Why could not
    mysqlbackup.cc use stream_v1.c directly?  Then error injection and
    checking could be put there so that also the server could benefit
    from it.  (This is maybe not my business since I am not the
    architectural reviewer.)

  * Took me some time to understand the role of the bcat_ functions.  I
    think this should be explained a bit better.  Do the implementation
    have to be this separate for the client and the server, or would it
    be possible to share some more code?

  * Is it safe to extend the structs like st_bstream_image_header?  Are
    you guaranteed that the stream library will never make any
    assumptions about the size of the struct?

  * I would prefer if the addition of error injection to mysys routines
    was done as a separate patch.  This is a general mechanism and I
    feel that is not related to backup.  I also think other people than
    members of the backup team should review this.

  * The code contains a lot of hard-coded English messages.  Will there
    not be a requirement to internationlize this?


Some more detailed questions/comments inline:

...

 > === modified file '.bzrignore'
 > --- a/.bzrignore	2008-10-29 14:20:12 +0000
 > +++ b/.bzrignore	2008-11-25 19:33:34 +0000
 > @@ -314,6 +314,10 @@ client/mysql_upgrade
 >  client/mysqladmin
 >  client/mysqladmin.c
 >  client/mysqladmin.cpp
 > +client/mysqlbackup
 > +client/stream_v1.h
 > +client/stream_v1_services.h
 > +mysql-test/mysqlbackup-test.bak

Why are the three last entries added?

...

 > +/*
 > +  Error injection. Built on DBUG.
 > +  This is similar to error injection in the server.
 > +  In case that one becomes globally available, we undefine it first.
 > +*/
 > +#ifdef ERROR_INJECT
 > +#undef ERROR_INJECT
 > +#endif
 > +#define ERROR_INJECT(_keyword_, _action_) \
 > +  DBUG_EXECUTE_IF((_keyword_), errm("ERROR_INJECT(\"%s\")\n", 
(_keyword_)); \
 > +                  DBUG_PRINT("bupstrm", ("ERROR_INJECT(\"%s\")\n", 
         \
 > +                                         (_keyword_))); _action_)

Could not this be a general macro and not particular to this file?

 > +
 > +/* For easier error tracking we do not pre-allocate in DBUG mode. */
 > +#if !defined(DBUG_OFF)
 > +#define DYN_ALLOC_INIT 0
 > +#define DYN_ALLOC_INCR 1
 > +#else
 > +#define DYN_ALLOC_INIT 1000
 > +#define DYN_ALLOC_INCR 1000
 > +#endif

Why is this done?

 > +
 > +/*
 > +  ====================================================
 > +  Convenience functions for access to st_blob objects.
 > +  ====================================================
 > +*/
 > +
 > +/**
 > +  Backup blob length.
 > +
 > +  This is a function, not a macro, for better type checking.
 > +
 > +  @param[in]    blob            blob
 > +
 > +  @return       length
 > +*/
 > +
 > +ulong BBL(struct st_blob *blob)

Why name it like a macro, when it is a function?

 > +{
 > +  DBUG_ASSERT(blob);
 > +  DBUG_ASSERT(blob->end >= blob->begin);
 > +  return ((ulong) (blob->end - blob->begin));
 > +}
 > +
 > +
 > +/**
 > +  Backup blob string.
 > +
 > +  This is a function, not a macro, for better type checking.
 > +
 > +  @param[in]    blob            blob
 > +
 > +  @return       string
 > +*/
 > +
 > +char *BBS(struct st_blob *blob)

Ditto.

 > +{
 > +  DBUG_ASSERT(blob);
 > +  DBUG_ASSERT(blob->end >= blob->begin);
 > +  return ((char*) blob->begin);
 > +}
 > +
 > +
 > +/*
 > +  ====================================================
 > +  Memory allocation.
 > +  These are callback functions for the stream library.
 > +  ====================================================
 > +*/
 > +
 > +/**
 > +  Allocate given amount of memory and return pointer to it.
 > +
 > +  @param[in]    size            amount of memory to allocate
 > +
 > +  @return       pointer to allocated memory
 > +*/
 > +
 > +bstream_byte* bstream_alloc(unsigned long int size)
 > +{
 > +  return my_malloc(size, MYF(MY_WME));
 > +}
 > +
 > +
 > +/**
 > +  Free previously allocated memory.
 > +
 > +  @param[in]    ptr             pointer to allocated memory
 > +*/
 > +
 > +void bstream_free(bstream_byte *ptr)
 > +{
 > +  my_free(ptr, MYF(MY_ALLOW_ZERO_PTR));
 > +}
 > +

It seems a bit strange that while I can't find that bstream_alloc is
used anywhere, bstream_free is used in several places.  Should not
these come in pairs?  There is little backup specific about the these
functions.  Are they actually needed, and if yes, why not make the
functionality generally available?


 > +
 > +/*
 > +  ====================================================
 > +  Helper functions for the stream library.
 > +  These are callback functions for the stream library.
 > +  Low-level stream access.
 > +  ====================================================
 > +*/
 > +
 > +/*
 > +  Helper struct for stream access functions.
 > +
 > +  The stream access functions do not read the backup image file
 > +  directly. They call back functions provided in st_backup_stream by the
 > +  application.
 > +
 > +  The struct st_stream ties together the st_backup_stream
 > +  with the data required by the I/O functions provided by this
 > +  application. Note that 'st_backup_stream' must be the first part of
 > +  the helper struct.
 > +
 > +  'stream_pos' is used to describe which section/chunk of a backup
 > +  image we have read last.
 > +*/
 > +struct st_stream
 > +{
 > +  struct st_backup_stream       bupstrm;        /* Must be first in 
st_stream */
 > +  File                          fd;             /* File descriptor */
 > +  my_off_t                      pos;            /* Byte position in 
file */
 > +  my_off_t                      size;           /* File size */
 > +  const char                    *path;          /* File name */
 > +  const char                    *stream_pos;    /* Verbal stream 
position */

It seems a bit strange to me to use a string to record position.  Why
not an enum?


 > +  const char                    *zalgo;         /* Compression 
algorithm */
 > +#ifdef HAVE_COMPRESS
 > +  uchar                         *zbuf;          /* Buffer for 
compressed data */
 > +  z_stream                      zstream;        /* zlib helper struct */
 > +#endif
 > +};
 > +
 > +
 > +/**
 > +  Read from the stream/image.
 > +
 > +  @param[in,out]    strm        stream handle, updating position
 > +  @param[in,out]    data        data container, updating contents 
and ptrs

I do not think it is easy to understand from this comment what is
producer and comsumer here.  And what "updating contents and ptrs"
implies.

 > +  @param            envelope    not used

What is the intention of envelope?

 > +
 > +  @return       status
 > +    @retval     BSTREAM_OK      ok
 > +    @retval     BSTREAM_EOS     end of stream
 > +    @retval     otherwise       error
 > +
 > +  @note The return value is specified as 'int' in stream_v1.h
 > +  though only values from enum_bstream_ret_codes are expected.
 > +*/
 > +
 > +static int
 > +str_read(struct st_stream *strm, struct st_blob *data,
 > +         struct st_blob envelope __attribute__((unused)))
 > +{
 > +  size_t        lgt;
 > +  int           rc= BSTREAM_OK;
 > +  DBUG_ENTER("str_read");
 > +  DBUG_ASSERT(strm && strm->path);
 > +  DBUG_ASSERT(data && data->begin && data->end);
 > +
 > +  /*
 > +    Compute wanted length.
 > +  */
 > +  lgt= BBL(data);
 > +  DBUG_PRINT("bupstrm", ("want bytes: %lu", (ulong) lgt));
 > +
 > +  /*
 > +    Read.
 > +  */

I suggest to put the compression related code in separate functions.
That would ease the reading of this code.

 > +#ifdef HAVE_COMPRESS
 > +  if (strm->zbuf)
 > +  {
 > +    int zerr;
 > +    z_stream *zstream= &strm->zstream;
 > +
 > +    /* Set output buffer pointer and size. */
 > +    zstream->next_out= data->begin;

Why not use BBS here?

 > +    /*
 > +      Zstream can process in one go a block whose size fits into uInt
 > +      type. If we have more space available in the buffer, we ignore the
 > +      extra bytes.
 > +    */
 > +    if (lgt > ~((uInt)0))
 > +      lgt= ~((uInt)0); /* purecov: inspected */
 > +    zstream->avail_out= (uInt) lgt;
 > +
 > +    lgt= 0; /* In case zbuf is empty and we are at EOF. */
 > +    do
 > +    {
 > +      /* If input buffer is empty, load data from compressed image. */
 > +      if (!zstream->avail_in)
 > +      {
 > +        zstream->avail_in= (uInt) my_read(strm->fd, strm->zbuf, 
ZBUF_SIZE,
 > +                                           MYF(0));
 > +        ERROR_INJECT("str_read_z_io_error",
 > +                     zstream->avail_in= (uInt) MY_FILE_ERROR;
 > +                     my_errno= EIO;);
 > +        ERROR_INJECT("str_read_z_eof",
 > +                     zstream->avail_in= 0;);
 > +        if (zstream->avail_in == (uInt) MY_FILE_ERROR)
 > +        {
 > +          errm("cannot read compressed image '%s': %s\n",
 > +               strm->path, strerror(my_errno));
 > +          rc= BSTREAM_ERROR;
 > +          goto end;
 > +        }
 > +        else if (!zstream->avail_in)
 > +        {
 > +          /* EOF. If first read, we leave with lgt == 0 here. */
 > +          break;
 > +        }
 > +        zstream->next_in= strm->zbuf;
 > +      }
 > +      /* Decompress, */
 > +      zerr= inflate(zstream, Z_NO_FLUSH);
 > +      ERROR_INJECT("str_read_zlib_error",
 > +                   zerr= Z_STREAM_ERROR;
 > +                   zstream->msg= (char*) "error injected";);
 > +      /* Set output length. */
 > +      lgt= zstream->next_out - data->begin;
 > +      if (zerr == Z_STREAM_END)
 > +      {
 > +        /* EOF. If first decompress, we leave with lgt == 0 here. */
 > +        break;
 > +      }
 > +      else if (zerr != Z_OK)
 > +      {
 > +        errm("cannot decompress image '%s': %d: %s\n",
 > +             strm->path, zerr, zstream->msg);
 > +        rc= BSTREAM_ERROR;
 > +        goto end;
 > +      }
 > +    } while (zstream->avail_out);
 > +    /* lgt contains the number of bytes decompressed. */
 > +  }
 > +  else
 > +#endif

...

 > +  {
 > +    lgt= my_read(strm->fd, data->begin, lgt, MYF(0));
 > +    ERROR_INJECT("str_read_io_error",
 > +                 lgt= MY_FILE_ERROR; my_errno= EIO;);
 > +    if (lgt == MY_FILE_ERROR)
 > +    {
 > +      errm("cannot read image '%s': %s\n", strm->path, 
strerror(my_errno));
 > +      rc= BSTREAM_ERROR;
 > +      goto end;
 > +    }
 > +  }
 > +  DBUG_PRINT("bupstrm", ("read bytes: %lu", (ulong) lgt));
 > +
 > +  /*
 > +    Check for end of stream.
 > +  */
 > +  if (lgt == 0)
 > +  {
 > +    rc= BSTREAM_EOS;
 > +    goto end;
 > +  }
 > +
 > +  /*
 > +    Update stream handle and data container.
 > +  */
 > +  strm->pos+= lgt;
 > +  data->begin+= lgt;
 > +
 > + end:
 > +  DBUG_RETURN(rc);
 > +}
 > +
 > +
 > +#ifdef NOTUSED
 > +/**
 > +  Skip part of the stream/image.
 > +
 > +  @param[in,out]    strm        stream handle, updating position
 > +  @param[in,out]    len         number of bytes to skip, skipped
 > +
 > +  @return       status
 > +    @retval     BSTREAM_OK      ok
 > +    @retval     otherwise       error
 > +
 > +  @note The return value is specified as 'int' in stream_v1.h
 > +  though only values from enum_bstream_ret_codes are expected.
 > +*/
 > +
 > +static int
 > +str_forward(struct st_stream *strm, size_t *len)

Is there a plan for using this function?  Or can it be removed?

...

...

 > +
 > +static int
 > +str_open_rd(struct st_stream *strm, const char *path, uint *version_p)
 > +{
 > +  MY_STAT               stat_area;
 > +  size_t                lgt;
 > +  int                   rc= BSTREAM_OK;
 > +  int                   errpos= 0;
 > +  uint16                version;
 > +  const unsigned char   backup_magic_bytes[8]=
 > +    {
 > +      0xE0, // ###.....
 > +      0xF8, // #####...
 > +      0x7F, // .#######
 > +      0x7E, // .######.
 > +      0x7E, // .######.
 > +      0x5F, // .#.#####
 > +      0x0F, // ....####
 > +      0x03  // ......##
 > +    };

I suggest that these magic bytes are defined a place where the same
definition could be used for both server and client.

 > +#ifdef HAVE_COMPRESS
 > +  const unsigned char   gzip_magic_bytes[3]=
 > +    {
 > +      0x1f,
 > +      0x8b,
 > +      0x08
 > +    };
 > +#endif
 > +  uchar  prefix[sizeof(backup_magic_bytes) + sizeof(version)];
 > +  DBUG_ENTER("str_open_rd");
 > +  DBUG_ASSERT(strm);
 > +  DBUG_ASSERT(path);
 > +  DBUG_ASSERT(version_p);
 > +
 > +  /*
 > +    Initialize stream struct.
 > +  */
 > +  bzero(strm, sizeof(*strm));
 > +
 > +  /*
 > +    Set image path name.
 > +  */
 > +  strm->path= path;
 > +  DBUG_PRINT("bupstrm", ("opening image file: '%s'", strm->path));
 > +
 > +  /*
 > +    Open the image file.
 > +  */
 > +  strm->fd= my_open(strm->path, O_RDONLY, MYF(0));
 > +  if (strm->fd < 0)
 > +  {
 > +    errm("cannot open backup image '%s': %s\n", strm->path, 
strerror(my_errno));

Why generate your own error here?  Could you not let my_open generate
the error?

 > +    goto err;
 > +  }
 > +  errpos= 10;
 > +  DBUG_PRINT("bupstrm", ("opened  image file: '%s'  fd: %d",
 > +                         strm->path, strm->fd));
 > +
 > +  ERROR_INJECT("str_open_rd_stat_error", my_close(strm->fd, MYF(0)););
 > +  if (my_fstat(strm->fd, &stat_area, MYF(0)))
 > +  {
 > +    errm("cannot fstat open backup image '%s': %s\n",
 > +         strm->path, strerror(my_errno= errno));
 > +    goto err;
 > +  }
 > +  strm->size= stat_area.st_size;
 > +
 > +  /*
 > +    Read prefix with magic number and image version.
 > +  */
 > +  lgt= my_read(strm->fd, prefix, sizeof(prefix), MYF(0));
 > +  ERROR_INJECT("str_open_rd_read_error", lgt= MY_FILE_ERROR; 
my_errno= EIO;);
 > +  ERROR_INJECT("str_open_rd_read_length", lgt= sizeof(prefix) - 1;);
 > +  if (lgt != sizeof(prefix))
 > +  {
 > +    if (lgt == MY_FILE_ERROR)
 > +      errm("cannot read image '%s': %s\n", strm->path, 
strerror(my_errno));
 > +    else
 > +      errm("image '%s' has only %lu bytes of at least %lu required\n",
 > +           strm->path, (ulong) lgt, (ulong) sizeof(prefix));
 > +    goto err;
 > +  }
 > +  DBUG_PRINT("bupstrm", ("read magic number and version"));
 > +

Here, too, I suggest to put compress code in separate method.

 > +#ifdef HAVE_COMPRESS
 > +  /*
 > +    Check for compression.
 > +  */
 > +  if (!memcmp(prefix, gzip_magic_bytes, sizeof(gzip_magic_bytes)))
 > +  {
 > +    int                 zerr;
 > +    bstream_blob        blob;
 > +
 > +    ERROR_INJECT("str_open_rd_zbuf_malloc", my_malloc_error_inject= 1;);
 > +    strm->zbuf= (uchar*) my_malloc(ZBUF_SIZE, MYF(MY_WME));
 > +    if (!strm->zbuf)
 > +    {
 > +      /* Error message reported by mysys. */
 > +      goto err;
 > +    }
 > +    errpos= 20;
 > +    strm->zstream.zalloc= 0;
 > +    strm->zstream.zfree= 0;
 > +    strm->zstream.opaque= 0;
 > +    strm->zstream.msg= 0;
 > +    strm->zstream.next_in= strm->zbuf;
 > +    strm->zstream.avail_in= 10;
 > +    memcpy(strm->zbuf, prefix, 10);
 > +
 > +    zerr= inflateInit2(&strm->zstream, MAX_WBITS + 16);
 > +    ERROR_INJECT("str_open_rd_zlib_init",
 > +                 zerr= Z_STREAM_ERROR;
 > +                 strm->zstream.msg= (char*) "error injected";);
 > +    if (zerr != Z_OK)
 > +    {
 > +      errm("cannot init zlib on image '%s': %d: %s\n",
 > +           strm->path, zerr, strm->zstream.msg);
 > +      goto err;
 > +    }
 > +    errpos= 30;
 > +
 > +    blob.begin= prefix;
 > +    blob.end= prefix + sizeof(prefix);
 > +    zerr= str_read(strm, &blob, blob);
 > +    if (zerr == BSTREAM_EOS)
 > +    {
 > +      errm("end of stream within header on image '%s'\n", strm->path);
 > +      goto err;
 > +    }
 > +    if (zerr != BSTREAM_OK || (blob.begin != blob.end))
 > +    {
 > +      /* Error message reported by function. */
 > +      goto err;
 > +    }
 > +    strm->zalgo= "gzip";
 > +  }
 > +  else
 > +#endif

...

 > +    strm->zalgo= "none";
 > +
 > +  /*
 > +    Check magic number and image version.
 > +  */
 > +  if (memcmp(prefix, backup_magic_bytes, sizeof(backup_magic_bytes)))
 > +  {
 > +    errm("not a backup image file: '%s'. Magic number mismatch.\n", 
strm->path);
 > +    goto err;
 > +  }
 > +  ERROR_INJECT("str_open_rd_version", prefix[8]= '\0';);
 > +  version= uint2korr(prefix+8);
 > +  if (version != 1)
 > +  {
 > +    errm("backup image version %u is not supported\n", version);
 > +    goto err;
 > +  }
 > +  *version_p= version;
 > +
 > +  /*
 > +    Set callback functions in struct st_backup_stream.
 > +  */
 > +  strm->bupstrm.stream.write=   (as_write_m) NULL;
 > +  strm->bupstrm.stream.read=    (as_read_m) str_read;
 > +  strm->bupstrm.stream.forward= (as_forward_m) NULL; /* str_forward; */
 > +
 > +  /*
 > +    Open the stream.
 > +  */
 > +  DBUG_PRINT("bupstrm", ("opening stream: '%s'  fd: %d", strm->path, 
strm->fd));
 > +  rc= bstream_open_rd(&strm->bupstrm, sizeof(prefix));
 > +  if (rc != BSTREAM_OK)
 > +  {
 > +    errm("cannot open stream library on '%s'.\n", strm->path);

library?

 > +    goto err;
 > +  }
 > +  DBUG_PRINT("bupstrm", ("opened  stream: '%s'  fd: %d  rc: %d",
 > +                         strm->path, strm->fd, rc));
 > +
 > +  goto end;

Why not just return here?

...

 > +int
 > +bcat_add_item(struct st_bstream_image_header *hdr,
 > +              struct st_bstream_item_info *item)
 > +{
 > +  struct st_backup_catalog      *bup_catalog= (struct 
st_backup_catalog*) hdr;
 > +  enum enum_bstream_ret_codes   brc= BSTREAM_OK;
 > +  int                           err;
 > +  DBUG_ENTER("bcat_add_item");
 > +  DBUG_ASSERT(hdr);
 > +  DBUG_ASSERT(item);
 > +  DBUG_PRINT("bupstrm", ("adding item: 0x%lx  pos: %lu  type: %d 
name: '%.*s'",
 > +                         (long) item, item->pos, item->type,
 > +                         BBLS(&item->name)));
 > +
 > +  switch (item->type) {
 > +
 > +  case BSTREAM_IT_CHARSET:
 > +  case BSTREAM_IT_USER:
 > +  case BSTREAM_IT_TABLESPACE:
 > +  {
 > +    struct st_backup_global     *bup_global;
 > +    DYNAMIC_ARRAY               *array;
 > +
 > +    /* Allocate memory for the item. */
 > +    ERROR_INJECT("bcat_add_item_global_malloc", 
my_malloc_error_inject= 1;);
 > +    bup_global= (struct st_backup_global*)
 > +      my_malloc(sizeof(struct st_backup_global), MYF(MY_WME));
 > +    if (!bup_global)
 > +    {
 > +      /* Error message reported by mysys. */
 > +      brc= BSTREAM_ERROR;
 > +      goto end;
 > +    }
 > +    /* Copy the item struct into the allocated struct. */
 > +    bup_global->glb_item= *item;
 > +    /* Initialize other elements. */
 > +    bzero(&bup_global->glb_metadata, sizeof(bup_global->glb_metadata));

The above allocation is very similar for case of this switch
statement.  I suggest to parameterize it in a helper function.

...

 > +/**
 > +  Create global iterator of a given type.
 > +
 > +  Possible iterator types.
 > +
 > +  - BSTREAM_IT_CHARSET:    all charsets
 > +  - BSTREAM_IT_USER:       all users
 > +  - BSTREAM_IT_TABLESPACE: all tablespaces
 > +  - BSTREAM_IT_DB:         all databases
 > +
 > +  The following types of iterators iterate only over items for which
 > +  some meta-data should be saved in the image.
 > +
 > +  - BSTREAM_IT_GLOBAL:     all global items in create-dependency order
 > +  - BSTREAM_IT_PERDB:      all per-db items except tables which are 
enumerated
 > +                           by a table iterator (see below)
 > +  - BSTREAM_IT_PERTABLE:   all per-table items in create-dependency 
orders.
 > +

I do not quite understand this comment.  Are the three latter no
possible iterator types since they are in another list?

 > +  @param[in]    hdr             catalog reference
 > +  @param[in]    it_type         iterator type
 > +
 > +  @return       pointer to the iterator
 > +    @retval     NULL            error
 > +*/
 > +
 > +void*
 > +bcat_iterator_get(struct st_bstream_image_header *hdr, unsigned int 
it_type)
 > +{
 > +  struct st_backup_catalog      *bup_catalog= (struct 
st_backup_catalog*) hdr;
 > +  struct st_backup_iterator     *iter;
 > +  DBUG_ENTER("bcat_iterator_get");
 > +  DBUG_ASSERT(hdr);
 > +  DBUG_PRINT("bupstrm", ("getting iterator for type: %u", it_type));
 > +
 > +  ERROR_INJECT("bcat_iterator_get_malloc", my_malloc_error_inject= 1;);
 > +  iter= my_malloc(sizeof(struct st_backup_iterator), MYF(MY_WME));
 > +  if (!iter)
 > +  {
 > +    /* Error message reported by mysys. */
 > +    goto end;
 > +  }
 > +  iter->it_index= -1;
 > +  iter->it_type= it_type;
 > +
 > +  ERROR_INJECT("bcat_iterator_get_type", it_type= BSTREAM_IT_PERTABLE;);
 > +  switch (it_type) {
 > +
 > +    /* purecov: begin inspected */
 > +  case BSTREAM_IT_CHARSET:
 > +    iter->it_array= &bup_catalog->cat_charsets;
 > +    DBUG_PRINT("bupstrm", ("charset"));
 > +    break;
 > +
 > +  case BSTREAM_IT_USER:
 > +    iter->it_array= &bup_catalog->cat_users;
 > +    DBUG_PRINT("bupstrm", ("user"));
 > +    goto err;
 > +    /* purecov: end */

Is it not possible to test the above, yet?

 > +
 > +  case BSTREAM_IT_DB:
 > +    iter->it_array= &bup_catalog->cat_databases;
 > +    DBUG_PRINT("bupstrm", ("database"));
 > +    break;
 > +
 > +    /* purecov: begin inspected */
 > +  case BSTREAM_IT_TABLESPACE:
 > +    iter->it_array= &bup_catalog->cat_tablespaces;
 > +    DBUG_PRINT("bupstrm", ("tablespace"));
 > +    goto err;
 > +
 > +  case BSTREAM_IT_GLOBAL:
 > +    errm("iterator type not yet implemented: global\n");
 > +    DBUG_PRINT("bupstrm", ("NOTYET implemented: global"));
 > +    goto err;
 > +
 > +  case BSTREAM_IT_PERDB:
 > +    errm("iterator type not yet implemented: perdb\n");
 > +    DBUG_PRINT("bupstrm", ("NOTYET implemented: perdb"));
 > +    goto err;
 > +    /* purecov: end */

Ditto.

...

 > +
 > +/**
 > +  Create database object from its meta-data.
 > +
 > +  When the meta-data section of backup image is read, items can be 
created
 > +  as their meta-data is read (so that there is no need to store these
 > +  meta-data). This functions stores them in the catalog instead of
 > +  creating database objects. So the application can make different use
 > +  of the data.

I did not quite understand this comment.  Are you saying that items
CAN be created as the meta-data is read, but that this is not the way
it is done?

 > +
 > +  @param[in]    hdr             catalog reference
 > +  @param[in]    item            item reference
 > +  @param[in]    query           query string
 > +  @param[in]    data            data blob
 > +
 > +  @note The item has set the 'type' element only. No item name nor
 > +  a catalog position is provided. Let alone a reference to a database.
 > +
 > +  @note Either query or data or both can be empty, depending
 > +  on what was stored in the image.
 > +
 > +  @note The blob provided by query and/or data is not guaranteed to
 > +  exist after the call. It must be copied to become part of the catalog.
 > +
 > +  @return       status
 > +    @retval     BSTREAM_OK      ok
 > +    @retval     otherwise       error
 > +
 > +  @note The return value is specified as 'int' in stream_v1.h
 > +  though only values from enum_bstream_ret_codes are expected.
 > +*/
 > +
 > +int
 > +bcat_create_item(struct st_bstream_image_header *hdr,
 > +                 struct st_bstream_item_info *item,
 > +                 struct st_blob query,
 > +                 struct st_blob data)
 > +{
 > +  struct st_backup_catalog      *bup_catalog= (struct 
st_backup_catalog*) hdr;
 > +  struct st_backup_metadata     mdata;
 > +  enum enum_bstream_ret_codes   brc= BSTREAM_OK;
 > +  DBUG_ENTER("bcat_create_item");
 > +  DBUG_ASSERT(hdr);
 > +  DBUG_ASSERT(item);
 > +  DBUG_PRINT("bupstrm", ("item: 0x%lx  pos: %lu  type: %d  name: 
'%.*s'",
 > +                         (long) item, item->pos, item->type,
 > +                         BBLS(&item->name)));
 > +  DBUG_PRINT("bupstrm", ("query: '%.*s'", BBLS(&query)));
 > +  DBUG_PRINT("bupstrm", ("data length: %lu", BBL(&data)));
 > +
 > +  /*
 > +    Create new st_blob structs. The strings from the
 > +    stream libraray have a short life time.

Spelling: "libraray"

...

 > +  /* Header, binlog goup file. */

Spelling: "goup"

...

 > +void*
 > +backup_image_open(const char *filename, struct st_backup_catalog 
*bup_catalog)
 > +{
 > +  struct st_stream      *strm;
 > +  int                   rc;
 > +  int                   errpos= 0;
 > +  uint                  idx;
 > +  uint                  version;
 > +  DBUG_ENTER("backup_image_open");
 > +  DBUG_ASSERT(filename);
 > +  DBUG_ASSERT(bup_catalog);
 > +
 > +  /* Allocate low-level stream info struct. */
 > +  ERROR_INJECT("backup_image_open_malloc", my_malloc_error_inject= 1;);
 > +  strm= my_malloc(sizeof(struct st_stream), MYF(MY_WME));
 > +  if (!strm)
 > +  {
 > +    /* Error message reported by mysys. */
 > +    goto err;
 > +  }
 > +  errpos= 10;
 > +
 > +  /*
 > +    Open stream.
 > +  */
 > +  rc= str_open_rd(strm, filename, &version);
 > +  DBUG_PRINT("bupstrm", ("Opened backup image file (rc=%d)", rc));
 > +  if (rc != BSTREAM_OK)
 > +  {
 > +    /* Error message reported by function. */
 > +    goto err;
 > +  }
 > +  strm->stream_pos= "prefix";
 > +  errpos= 20;
 > +
 > +  /* Add compression algorithm to header. */
 > +  bup_catalog->cat_zalgo= strm->zalgo;
 > +
 > +  /* Add image path to header. */
 > +  bup_catalog->cat_image_path= strm->path;
 > +
 > +  /* Add image size to header. */
 > +  bup_catalog->cat_image_size= strm->size;
 > +
 > +  /*
 > +    Read backup image stream header.
 > +  */
 > +  rc= bstream_rd_header(&strm->bupstrm, &bup_catalog->cat_header);
 > +  ERROR_INJECT("backup_image_open_header", rc= BSTREAM_ERROR;);
 > +  DBUG_PRINT("bupstrm", ("Read archive header (rc=%d)", rc));
 > +  if ((rc != BSTREAM_OK) && (rc != BSTREAM_EOC))
 > +  {
 > +    errm("error on stream library read of header.\n");
 > +    goto err;
 > +  }
 > +  strm->stream_pos= "header";
 > +
 > +  /* Add image version to header. */
 > +  bup_catalog->cat_header.version= version;
 > +  DBUG_PRINT("bupstrm", ("set header version: %u", version));
 > +
 > +  /*
 > +    Now that we know, how many snapshots we have, initialize the
 > +    snapshot pos to table index arrays.
 > +  */
 > +  DBUG_PRINT("bupstrm", ("allocating snapshot indexes: %u",
 > +                         bup_catalog->cat_header.snap_count));
 > +  for (idx= 0; idx < bup_catalog->cat_header.snap_count; idx++)
 > +  {
 > +    struct st_backup_snapshot *bup_snapshot;
 > +
 > +    /* Allocate memory for the item. */
 > +    ERROR_INJECT("backup_image_open_snapshot_malloc",
 > +                 my_malloc_error_inject= 1;);
 > +    bup_snapshot= (struct st_backup_snapshot*)
 > +      my_malloc(sizeof(struct st_backup_snapshot), MYF(MY_WME));
 > +    if (!bup_snapshot)
 > +    {
 > +      /* Error message reported by mysys. */
 > +      rc= BSTREAM_ERROR;
 > +      goto err;
 > +    }
 > +    /* Initialize a new index array. */
 > +    rc= my_init_dynamic_array(&bup_snapshot->snap_index_pos_to_table,
 > +                              sizeof(struct st_backup_table*),
 > +                              DYN_ALLOC_INIT, DYN_ALLOC_INCR);
 > +    ERROR_INJECT("backup_image_open_index", if (!rc) {rc= TRUE;
 > +        delete_dynamic(&bup_snapshot->snap_index_pos_to_table);});
 > +    ERROR_INJECT("backup_image_open_index1", if (idx && !rc) {rc= TRUE;
 > +        delete_dynamic(&bup_snapshot->snap_index_pos_to_table);});
 > +    if (rc)
 > +    {
 > +      /* Error message reported by mysys. */
 > +      my_free(bup_snapshot, MYF(0));
 > +      rc= BSTREAM_ERROR;
 > +      goto err;
 > +    }
 > +    /* Check consistency of array position. */
 > +    DBUG_ASSERT(idx == bup_catalog->cat_snapshots.elements);
 > +    /* Insert in the catalog. */
 > +    ERROR_INJECT("backup_image_open_snapshot_insert",
 > +                 my_malloc_error_inject= 1;);
 > +    rc= insert_dynamic(&bup_catalog->cat_snapshots, (uchar*) 
&bup_snapshot);
 > +    if (rc)
 > +    {
 > +      /* Error message reported by mysys. */
 > +      delete_dynamic(&bup_snapshot->snap_index_pos_to_table);
 > +      my_free(bup_snapshot, MYF(0));
 > +      rc= BSTREAM_ERROR;
 > +      goto err;
 > +    }

The error handling is a bit inconsistent in this function.  For the
first part you set errpos a do the cleanup at the end.  For the latter
part you do clean-up before going to the end.  Why not move most of
the content of the above error handling to 'err:'?  Calling my_free
(providing proper initialization and allowing for null) and setting
rc, could be done unconditionally.  No reason to repeat that code for
each function call.

 > +  }
 > +
 > +  goto end;
 > +
 > + err:
 > +  switch (errpos) {
 > +  case 20:
 > +    (void) str_close(strm);
 > +  case 10:
 > +    my_free(strm, MYF(MY_ALLOW_ZERO_PTR));
 > +    strm= NULL;
 > +  }
 > +
 > + end:
 > +  DBUG_RETURN(strm);
 > +}
 > +
 > +
 > +/**
 > +  Close a backup image.
 > +
 > +  @param[in]    image_handle            image handle reference
 > +*/
 > +
 > +enum enum_bstream_ret_codes
 > +backup_image_close(void* image_handle)
 > +{
 > +  struct st_stream *strm= (struct st_stream*) image_handle;
 > +  enum enum_bstream_ret_codes brc;
 > +  DBUG_ENTER("backup_image_close");
 > +  DBUG_ASSERT(strm);
 > +
 > +  brc= str_close(strm);
 > +  my_free(strm, MYF(0));
 > +  DBUG_RETURN(brc);
 > +}
 > +
 > +
 > +/**
 > +  Read backup image catalog.
 > +
 > +  @param[in]    image_handle            image handle reference
 > +  @param[in]    bup_catalog             catalog reference
 > +
 > +  @return       status
 > +    @retval     BSTREAM_OK              ok
 > +    @retval     != BSTREAM_OK           error
 > +*/
 > +
 > +enum enum_bstream_ret_codes
 > +backup_read_catalog(void* image_handle, struct st_backup_catalog 
*bup_catalog)
 > +{
 > +  struct st_stream *strm= (struct st_stream*) image_handle;
 > +  enum enum_bstream_ret_codes brc;
 > +  DBUG_ENTER("backup_read_catalog");
 > +  DBUG_ASSERT(strm);
 > +  DBUG_ASSERT(bup_catalog);

Would it not be appropriate here and in the other similar _read_
functions to assert that stream_pos is as expected?

 > +
 > +  /*
 > +    Read catalog requires search of next chunk start.
 > +  */
 > +  brc= bstream_next_chunk(&strm->bupstrm);
 > +  ERROR_INJECT("backup_read_catalog_next_eos", brc= BSTREAM_EOS;);
 > +  ERROR_INJECT("backup_read_catalog_next_err", brc= BSTREAM_ERROR;);
 > +  DBUG_PRINT("bupstrm", ("Find next chunk (brc=%d)", brc));
 > +  if (brc != BSTREAM_OK)
 > +  {
 > +    if (brc == BSTREAM_EOS)
 > +    {
 > +      errm("end of stream after %s.\n", strm->stream_pos);
 > +      goto end;
 > +    }
 > +    errm("cannot find catalog after %s.\n", strm->stream_pos);
 > +    goto end;
 > +  }
 > +
 > +  /*
 > +    Read catalog.
 > +  */
 > +  brc= bstream_rd_catalogue(&strm->bupstrm,
&bup_catalog->cat_header);
 > +  ERROR_INJECT("backup_read_catalog_rd_eos", brc= BSTREAM_EOS;);
 > +  ERROR_INJECT("backup_read_catalog_rd_err", brc= BSTREAM_ERROR;);
 > +  DBUG_PRINT("bupstrm", ("Read archive catalogue (brc=%d)", brc));
 > +  if ((brc != BSTREAM_OK) && (brc != BSTREAM_EOC))
 > +  {
 > +    if (brc == BSTREAM_EOS)
 > +    {
 > +      errm("end of stream within catalog.\n");
 > +      goto end;
 > +    }
 > +    errm("error on stream library read of catalog.\n");
 > +    goto end;
 > +  }
 > +  strm->stream_pos= "catalog";
 > +  brc= BSTREAM_OK;
 > +
 > + end:
 > +  DBUG_RETURN(brc);
 > +}
 > +
 > +
 > +/**
 > +  Read backup image meta data.
 > +
 > +  @param[in]    image_handle            image handle reference
 > +  @param[in]    bup_catalog             catalog reference
 > +
 > +  @return       status
 > +    @retval     BSTREAM_OK              ok
 > +    @retval     != BSTREAM_OK           error
 > +*/
 > +
 > +enum enum_bstream_ret_codes
 > +backup_read_metadata(void *image_handle, struct st_backup_catalog 
*bup_catalog)
 > +{
 > +  struct st_stream *strm= (struct st_stream*) image_handle;
 > +  enum enum_bstream_ret_codes brc;
 > +  DBUG_ENTER("backup_read_metadata");
 > +  DBUG_ASSERT(strm);
 > +  DBUG_ASSERT(bup_catalog);
 > +
 > +  /*
 > +    Read meta data requires search of next chunk start.
 > +  */
 > +  brc= bstream_next_chunk(&strm->bupstrm);
 > +  ERROR_INJECT("backup_read_metadata_next_eos", brc= BSTREAM_EOS;);
 > +  ERROR_INJECT("backup_read_metadata_next_err", brc= BSTREAM_ERROR;);
 > +  DBUG_PRINT("bupstrm", ("Find next chunk (brc=%d)", brc));
 > +  if (brc != BSTREAM_OK)
 > +  {
 > +    if (brc == BSTREAM_EOS)
 > +    {
 > +      errm("end of stream after %s.\n", strm->stream_pos);
 > +      goto end;
 > +    }
 > +    errm("cannot find meta data after %s.\n", strm->stream_pos);
 > +    goto end;
 > +  }
 > +
 > +  /*
 > +    Read meta data.
 > +  */
 > +  brc= bstream_rd_meta_data(&strm->bupstrm,
&bup_catalog->cat_header);
 > +  ERROR_INJECT("backup_read_metadata_rd_eos", brc= BSTREAM_EOS;);
 > +  ERROR_INJECT("backup_read_metadata_rd_err", brc= BSTREAM_ERROR;);
 > +  DBUG_PRINT("bupstrm", ("Read meta data (brc=%d)", brc));
 > +  if ((brc != BSTREAM_OK) && (brc != BSTREAM_EOC))
 > +  {
 > +    if (brc == BSTREAM_EOS)
 > +    {
 > +      errm("end of stream within meta data.\n");
 > +      goto end;
 > +    }
 > +    errm("error on stream library read of meta data.\n");
 > +    goto end;
 > +  }
 > +  strm->stream_pos= "meta data";
 > +  brc= BSTREAM_OK;
 > +
 > + end:
 > +  DBUG_RETURN(brc);
 > +}
 > +
 > +
 > +/**
 > +  Read backup image table data.
 > +
 > +  @param[in]    image_handle            image handle reference
 > +  @param[in]    bup_catalog             catalog reference
 > +
 > +  @return       status
 > +    @retval     BSTREAM_OK              ok
 > +    @retval     != BSTREAM_OK           error
 > +*/
 > +
 > +enum enum_bstream_ret_codes
 > +backup_read_snapshot(void *image_handle,
 > +                     struct st_backup_catalog *bup_catalog
 > +                      __attribute__((unused)),
 > +                     struct st_bstream_data_chunk *snapshot)
 > +{
 > +  struct st_stream *strm= (struct st_stream*) image_handle;
 > +  enum enum_bstream_ret_codes brc;
 > +  DBUG_ENTER("backup_read_snapshot");
 > +  DBUG_ASSERT(strm);
 > +  DBUG_ASSERT(snapshot);
 > +
 > +  /*
 > +    First read data chunk requires search of next chunk start.
 > +  */

I do not quite understand this comment.  I think you mean that before
reading the first data chunk, you must search for it while this is not
necessary for the succeeding chunks.

...

 > +
 > +
 > +/*
 > +  ===================
 > +  Catalog navigation.
 > +  ===================
 > +*/
 > +
 > +/*
 > +  Catalog items are located by "catalog coordinates". The format of the
 > +  catalog coordinates depends on the type of item. It is specified as
 > +  follows:
 > +
 > +  [item position (global)]= [db no.]
 > +  [item position (table)]= [ snap no. ! pos in snapshot's table list ]
 > +  [item position (other per-db item)]= [ pos in db item list ! db no. ]
 > +  [item position (per-table item)] = [ pos in table's item list !
 > +                                       db no. ! table pos ]

What does '!' mean.  If it is just a separator, please, choose
another character so that people do not believe it has something to do
with negation.

...

 > +
 > +/**
 > +  Locate a perdb object by catalog coordinates.
 > +
 > +  Catalog coordinates for perdb items are:
 > +
 > +      db_pos        database position in catalog
 > +      pos           perdb item position in database
 > +
 > +  @param[in]    bup_catalog     catalog reference
 > +  @param[in]    db_pos          position in catalog's databse array

Spelling: "databse"

...

 > === added file 'client/backup_stream.h'
 > --- a/client/backup_stream.h	1970-01-01 00:00:00 +0000
 > +++ b/client/backup_stream.h	2008-11-25 19:33:34 +0000

...
 > +/*
 > +  =================================
 > +  Convenience functions and macros.
 > +  =================================
 > +*/
 > +
 > +/* Backup blob length. */
 > +ulong BBL(struct st_blob *blob);
 > +
 > +/* Backup blob string. */
 > +char *BBS(struct st_blob *blob);
 > +
 > +/*
 > +  Backup blob length and string.
 > +  This is for use with printf format '%.*s'.
 > +  This must be a macro because it produces two comma separated values!
 > +*/
 > +#define BBLS(_b_)       ((uint) BBL(_b_)), BBS(_b_)
 > +
 > +/* The above shall also be usable by C++. */
 > +C_MODE_END

I am not sure I like this mix of macros and functions spread across
several files.  Would it not be better to define everything in the
same place?  More self-explanatory names would also help.

That's how far I have got for now.  More to come ...

--
Øystein
Thread
bzr commit into mysql-6.0-backup branch (ingo.struewing:2745) WL#4630Ingo Struewing25 Nov
  • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Øystein Grøvlen11 Dec
    • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Ingo Strüwing12 Dec
      • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Rafal Somla19 Dec
        • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Ingo Strüwing10 Feb 2009
          • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Rafal Somla10 Feb 2009
  • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Rafal Somla22 Dec
    • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Ingo Strüwing12 Feb 2009
      • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Rafal Somla17 Feb 2009
        • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Ingo Strüwing17 Feb 2009
  • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Øystein Grøvlen17 Feb 2009
    • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Ingo Strüwing18 Feb 2009
      • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Øystein Grøvlen19 Feb 2009
        • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Ingo Strüwing21 Feb 2009