List:Commits« Previous MessageNext Message »
From:Rafal Somla Date:December 22 2008 1:14pm
Subject:Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)
WL#4630
View as plain text  
Hi Ingo!

I'm half way through your impressively thoroughly written patch! I have reviewed 
the code part and will continue with tests. Here I send comments which I have so 
far.

I use format proposed by Chuck some time ago. The main idea is that all comments 
are divided into requests and suggestions. Only requests are bounding and must 
be resolved somehow (we can discuss how) before I can accept the patch.

As for suggestions, you don't have to apply them or even argue against them. 
They are things which came to my mind when reading the code and I include them 
here in hope that sometimes they might be useful. If you are inspired by some of 
them you can change the code. But if you are not, feel free to ignore them. By 
putting them as suggestions I confirm that the patch is correct without applying 
them.

The numbers in squares refer to comments inlined in your patch.

REQUESTS
========
1. Remove TODO from comment in str_open_rd() - compressed images are handled by 
the current code [3].
2. Improve descriptions of some functions [8,19]
3. Fix invalid errpos value when jumping to err: [9]
4. The doxygen documentation of backup image format contains errors. There is 
BUG#40587 about fixing it which I'm working on but are not finished yet. In the 
meantime, could you fix an incorrect quote from this documentation [14].
5. Explain if it is safe (against future code changes) to use fixed constant 
1024 for client option codes [17].
6. Fix wrong return type (or specification) of search_object() [20].
7. Update code and comment about missing binlog coordinates - the issue has been 
fixed now [24].


SUGGESTIONS
===========

1. Make BBL and friends macros, not functions [1].
2. Consider not committing code which is not used [2].
3. Use explicit prefix length (10 bytes) [4].
4. Change struct initialization [5].
5. Safer handling of variable holding return code [6].
6. Comment on memory allocated inside stream library [7,13].
7. Fill cat_header->version member as soon as version is known [10].
8. Use explicit types for function parameters (not void*) [11,18].
9. Instead of doing bstream_next_chunk() in backup_read_snapshot(), do it in 
functions reading preceding sections of the image [12].
10. Use assertions to verify assumptions you make (you do it in most cases but 
few cases are not covered) [15,26,29].
11. In function backup_locate_global() you make consistency checks for the 
position stored inside catalogue item. I think everything should work ok even if 
the position is wrong. Therefore I suggest to not error here but just warn and 
continue [16].
12. Avoid repetition of similar code in search_object() [21,22].
13. Change code layout violating our guidelines [23].
14. Try to simplify if codition [25].
15. Print options in hex format [27].
16. When listing tables belonging to a snapshot, use snap_index_pos_to_table 
array [28].
17. List object metadata in the order in which it is stored in the image [29].


COMMENTS
========
- I think it is a shame that you can not use the libbackupstream library 
directly, but you symlink the sources and compile them again. I think this 
should be fixed but not as part of this WL. Therefore I reported Bug#41668 about 
the issue. Perhaps you could add some comments to the Makefiles to inform about 
the issue and suggested solution.

- Current documentation of backup image format (which is a work in progress) 
specifies that the only global objects for which metadata can be stored in a 
backup image are tablespaces and databases. Your code is written under that 
assumption. However, I think this is unnecessary restriction of the image format 
and I will discuss removing it. If it happens, then your client would have to 
handle metadata for other global objects too. But this change, if needed, could 
perhaps be done as a separate WL.


DETAILS
=======
>  2745 Ingo Struewing	2008-11-25
>       WL#4630 - ST: MySQL Backup Client Program - Milestone 1
>       
>       The MySQL Backup Client Program is a command line client
>       program, that examines and displays the contents of a
>       backup image.
>       
>       By command line options you can select different types
>       of information to view.
>       
>       It does also have a search capability for finding
>       specific database objects in the backup image.
>       
>       For a detailed explanation of the options, run
>       
>           mysqlbackup --help
> added:
>   client/backup_stream.c
>   client/backup_stream.h
>   client/mysqlbackup.cc
>   mysql-test/suite/backup/r/backup_client.result
>   mysql-test/suite/backup/r/backup_client_binlog.result
>   mysql-test/suite/backup/r/backup_client_coverage.result
>   mysql-test/suite/backup/t/backup_client.test
>   mysql-test/suite/backup/t/backup_client_binlog.test
>   mysql-test/suite/backup/t/backup_client_coverage.test
> modified:
>   .bzrignore
>   client/CMakeLists.txt
>   client/Makefile.am
>   include/my_sys.h
>   mysql-test/mysql-test-run.pl
>   mysys/my_malloc.c
>   mysys/my_static.c
>   mysys/safemalloc.c
> 
> per-file messages:
>   .bzrignore
>     WL#4534 - Backup client program
>     Added mysqlbackup related files.
>   client/CMakeLists.txt
>     WL#4534 - Backup client program
>     Added mysqlbackup client build requirements.
>   client/Makefile.am
>     WL#4534 - Backup client program
>     Added mysqlbackup client build requirements.
>   client/backup_stream.c
>     WL#4534 - Backup client program
>     Backup stream reader.
>   client/backup_stream.h
>     WL#4534 - Backup client program
>     Backup stream reader declarations.
>   client/mysqlbackup.cc
>     WL#4534 - Backup client program
>     The new mysqlbackup program.
>   include/my_sys.h
>     WL#4534 - Backup client program
>     Added support for my_malloc() error injection.
>   mysql-test/mysql-test-run.pl
>     WL#4534 - Backup client program
>     Added mysqlbackup client to the list of environment variables.
>   mysql-test/suite/backup/r/backup_client.result
>     WL#4534 - Backup client program
>     Test result.
>   mysql-test/suite/backup/r/backup_client_binlog.result
>     WL#4534 - Backup client program
>     Test result.
>   mysql-test/suite/backup/r/backup_client_coverage.result
>     WL#4534 - Backup client program
>     Test result.
>   mysql-test/suite/backup/t/backup_client.test
>     WL#4534 - Backup client program
>     New test case.
>   mysql-test/suite/backup/t/backup_client_binlog.test
>     WL#4534 - Backup client program
>     New test case.
>   mysql-test/suite/backup/t/backup_client_coverage.test
>     WL#4534 - Backup client program
>     New test case.
>   mysys/my_malloc.c
>     WL#4534 - Backup client program
>     Added support for my_malloc() error injection.
>   mysys/my_static.c
>     WL#4534 - Backup client program
>     Added support for my_malloc() error injection.
>   mysys/safemalloc.c
>     WL#4534 - Backup client program
>     Added support for my_malloc() error injection.
> === modified file '.bzrignore'
> --- a/.bzrignore	2008-10-29 14:20:12 +0000
> +++ b/.bzrignore	2008-11-25 19:33:34 +0000
> @@ -314,6 +314,10 @@ client/mysql_upgrade
>  client/mysqladmin
>  client/mysqladmin.c
>  client/mysqladmin.cpp
> +client/mysqlbackup
> +client/stream_v1.h
> +client/stream_v1_services.h
> +mysql-test/mysqlbackup-test.bak
>  client/mysqlbinlog
>  client/mysqlbinlog.cpp
>  client/mysqlcheck
> 
...
> === added file 'client/backup_stream.c'
> --- a/client/backup_stream.c	1970-01-01 00:00:00 +0000
> +++ b/client/backup_stream.c	2008-11-25 19:33:34 +0000
...
> +/*
> +  ====================================================
> +  Convenience functions for access to st_blob objects.
> +  ====================================================
> +*/
> +
> +/**
> +  Backup blob length.
> +
> +  This is a function, not a macro, for better type checking.
> +
> +  @param[in]    blob            blob
> +
> +  @return       length
> +*/
> +
> +ulong BBL(struct st_blob *blob)
> +{
> +  DBUG_ASSERT(blob);
> +  DBUG_ASSERT(blob->end >= blob->begin);
> +  return ((ulong) (blob->end - blob->begin));
> +}
> +

[1] Suggestion: Use a macro here (perhaps only in production build).

...
> +
> +
> +#ifdef NOTUSED

[2] Hmm, should it be commited in?

> +/**
> +  Skip part of the stream/image.
> +
> +  @param[in,out]    strm        stream handle, updating position
> +  @param[in,out]    len         number of bytes to skip, skipped
> +
> +  @return       status
> +    @retval     BSTREAM_OK      ok
> +    @retval     otherwise       error
> +
> +  @note The return value is specified as 'int' in stream_v1.h
> +  though only values from enum_bstream_ret_codes are expected.
> +*/
> +
> +static int
> +str_forward(struct st_stream *strm, size_t *len)
> +{
...
> +
> +
> +/**
> +  Open the stream/image for reading.
> +
> +  A backup stream has a prefix, consisting of 8 bytes magic number
> +  and 2 bytes version number.
> +
> +  TODO A compressed image has a gzip magic number.

[3] This is done already, or?

> +  TODO An encrypted image has yet another magic number.
> +
> +  @param[in,out]    strm        stream handle, updating path, fd, bupstrm
> +  @param[in]        path        image path name
> +  @param[out]       version_p   detected image version
> +
> +  @return       status
> +    @retval     BSTREAM_OK      ok
> +    @retval     otherwise       error
> +
> +  @note The return value is specified as 'int' in stream_v1.h
> +  though only values from enum_bstream_ret_codes are expected.
> +*/
> +
> +static int
> +str_open_rd(struct st_stream *strm, const char *path, uint *version_p)
> +{
> +  MY_STAT               stat_area;
> +  size_t                lgt;
> +  int                   rc= BSTREAM_OK;
> +  int                   errpos= 0;
> +  uint16                version;
> +  const unsigned char   backup_magic_bytes[8]=
> +    {
> +      0xE0, // ###.....
> +      0xF8, // #####...
> +      0x7F, // .#######
> +      0x7E, // .######.
> +      0x7E, // .######.
> +      0x5F, // .#.#####
> +      0x0F, // ....####
> +      0x03  // ......##
> +    };
> +#ifdef HAVE_COMPRESS
> +  const unsigned char   gzip_magic_bytes[3]=
> +    {
> +      0x1f,
> +      0x8b,
> +      0x08
> +    };
> +#endif
> +  uchar  prefix[sizeof(backup_magic_bytes) + sizeof(version)];

[4] I think it is better not to use sizeof() here. The prefix must be 10 bytes 
long - this is specified by backup stream format. Why to depend on compiler 
having expected basic type sizes if we know the exact number. However, it is ok 
to leave the code as is.

> +  DBUG_ENTER("str_open_rd");
> +  DBUG_ASSERT(strm);
> +  DBUG_ASSERT(path);
> +  DBUG_ASSERT(version_p);
...
> +  switch (item->type) {
> +
> +  case BSTREAM_IT_CHARSET:
> +  case BSTREAM_IT_USER:
> +  case BSTREAM_IT_TABLESPACE:
> +  {
> +    struct st_backup_global     *bup_global;
> +    DYNAMIC_ARRAY               *array;
> +
> +    /* Allocate memory for the item. */
> +    ERROR_INJECT("bcat_add_item_global_malloc", my_malloc_error_inject= 1;);
> +    bup_global= (struct st_backup_global*)
> +      my_malloc(sizeof(struct st_backup_global), MYF(MY_WME));
> +    if (!bup_global)
> +    {
> +      /* Error message reported by mysys. */
> +      brc= BSTREAM_ERROR;
> +      goto end;
> +    }
> +    /* Copy the item struct into the allocated struct. */
> +    bup_global->glb_item= *item;
> +    /* Initialize other elements. */
> +    bzero(&bup_global->glb_metadata, sizeof(bup_global->glb_metadata));

[5] Perhaps:

   bzero(bup_global, sizeof(*bup_global));
   bup->global->glb_item= *item;

so that when structure layout changes it will still be properly initialized.

...
> +int
> +bcat_create_item(struct st_bstream_image_header *hdr,
> +                 struct st_bstream_item_info *item,
> +                 struct st_blob query,
> +                 struct st_blob data)
> +{
> +  struct st_backup_catalog      *bup_catalog= (struct st_backup_catalog*) hdr;
> +  struct st_backup_metadata     mdata;
> +  enum enum_bstream_ret_codes   brc= BSTREAM_OK;

[6] Suggestion: change initial value of brc to BSTREAM_ERROR and set it to 
BSTREAM_OK when you know that all is ok. This can save few lines of code and 
make it safer to change the code in the future (correct de-allocation at end: 
label).

...

[7] Perhaps add a comment explaining that the code below frees memory allocated 
by backup stream library.

> +  /* Header, snapshots. */
> +  DBUG_PRINT("bupstrm", ("freeing header, snapshots."));
> +  for (idx= 0; idx < hdr->snap_count; ++idx)
> +    bstream_free(hdr->snapshot[idx].engine.name.begin);
> +
> +  /* Header, binlog goup file. */
> +  DBUG_PRINT("bupstrm", ("freeing header, binlog goup file."));
> +  if (hdr->binlog_group.file)
> +    bstream_free(hdr->binlog_group.file); /* purecov: inspected */
> +
> +  /* Header, binlog file. */
> +  DBUG_PRINT("bupstrm", ("freeing header, binlog file."));
> +  if (hdr->binlog_pos.file)
> +    bstream_free(hdr->binlog_pos.file);
> +
> +  /* Header, server version. */
> +  DBUG_PRINT("bupstrm", ("freeing header, server version."));
> +  if (hdr->server_version.extra.begin)
> +    bstream_free(hdr->server_version.extra.begin);
> +
> +  /* Catalog. */
> +  DBUG_PRINT("bupstrm", ("freeing catalog."));
> +  my_free(bup_catalog, MYF(0));
> +
> +  DBUG_VOID_RETURN;
> +}
> +
> +
> +/**
> +  Open a backup image.

[8] For reading, right?

> +
> +  @param[in]    filename                file name
> +  @param[in]    bup_catalog             catalog reference
> +
> +  @return       image handle reference
> +    @retval     NULL                    error
> +*/
> +
> +void*
> +backup_image_open(const char *filename, struct st_backup_catalog *bup_catalog)
> +{
> +  struct st_stream      *strm;
> +  int                   rc;
> +  int                   errpos= 0;
> +  uint                  idx;
> +  uint                  version;
> +  DBUG_ENTER("backup_image_open");
> +  DBUG_ASSERT(filename);
> +  DBUG_ASSERT(bup_catalog);
> +
> +  /* Allocate low-level stream info struct. */
> +  ERROR_INJECT("backup_image_open_malloc", my_malloc_error_inject= 1;);
> +  strm= my_malloc(sizeof(struct st_stream), MYF(MY_WME));
> +  if (!strm)
> +  {
> +    /* Error message reported by mysys. */
> +    goto err;

[9] Here you jump to err when errpos is 0, but the swich doesn't cover this case
(probably adding a default case is  what is needed).

> +  }
> +  errpos= 10;
> +
> +  /*
> +    Open stream.
> +  */
> +  rc= str_open_rd(strm, filename, &version);
> +  DBUG_PRINT("bupstrm", ("Opened backup image file (rc=%d)", rc));
> +  if (rc != BSTREAM_OK)
> +  {
> +    /* Error message reported by function. */
> +    goto err;
> +  }
> +  strm->stream_pos= "prefix";
> +  errpos= 20;
> +
> +  /* Add compression algorithm to header. */
> +  bup_catalog->cat_zalgo= strm->zalgo;
> +
> +  /* Add image path to header. */
> +  bup_catalog->cat_image_path= strm->path;
> +
> +  /* Add image size to header. */
> +  bup_catalog->cat_image_size= strm->size;
> +
> +  /*
> +    Read backup image stream header.
> +  */
> +  rc= bstream_rd_header(&strm->bupstrm, &bup_catalog->cat_header);
> +  ERROR_INJECT("backup_image_open_header", rc= BSTREAM_ERROR;);
> +  DBUG_PRINT("bupstrm", ("Read archive header (rc=%d)", rc));
> +  if ((rc != BSTREAM_OK) && (rc != BSTREAM_EOC))
> +  {
> +    errm("error on stream library read of header.\n");
> +    goto err;
> +  }
> +  strm->stream_pos= "header";
> +
> +  /* Add image version to header. */
> +  bup_catalog->cat_header.version= version;
> +  DBUG_PRINT("bupstrm", ("set header version: %u", version));

[10] Suggestion: I'd move the above assignment before the call to 
bastream_rd_header(), just after the version has been detected in str_open_rd().

...
> +/**
> +  Read backup image catalog.
> +
> +  @param[in]    image_handle            image handle reference
> +  @param[in]    bup_catalog             catalog reference
> +
> +  @return       status
> +    @retval     BSTREAM_OK              ok
> +    @retval     != BSTREAM_OK           error
> +*/
> +
> +enum enum_bstream_ret_codes
> +backup_read_catalog(void* image_handle, struct st_backup_catalog *bup_catalog)

[11] Suggestion: declare image_handle as st_stream* and then cast it to void* if
needed (but should not be needed I think).

> +{
> +  struct st_stream *strm= (struct st_stream*) image_handle;
> +  enum enum_bstream_ret_codes brc;
> +  DBUG_ENTER("backup_read_catalog");
> +  DBUG_ASSERT(strm);
> +  DBUG_ASSERT(bup_catalog);
> +
...
> +enum enum_bstream_ret_codes
> +backup_read_snapshot(void *image_handle,
> +                     struct st_backup_catalog *bup_catalog
> +                      __attribute__((unused)),
> +                     struct st_bstream_data_chunk *snapshot)
> +{
> +  struct st_stream *strm= (struct st_stream*) image_handle;
> +  enum enum_bstream_ret_codes brc;
> +  DBUG_ENTER("backup_read_snapshot");
> +  DBUG_ASSERT(strm);
> +  DBUG_ASSERT(snapshot);
> +
> +  /*
> +    First read data chunk requires search of next chunk start.
> +  */

[12] Suggestion: move this to the end of backup_read_metadata(). So that this
function can be written under pre-condition that the stream is already
positioned at a start of next chunk.


> +  if (!strcmp(strm->stream_pos, "meta data"))
> +  {
> +    brc= bstream_next_chunk(&strm->bupstrm);
> +    ERROR_INJECT("backup_read_snapshot_next_eos", brc= BSTREAM_EOS;);
> +    ERROR_INJECT("backup_read_snapshot_next_err", brc= BSTREAM_ERROR;);
> +    DBUG_PRINT("bupstrm", ("Find next chunk (brc=%d)", brc));
> +    if (brc != BSTREAM_OK)
> +    {
> +      if (brc == BSTREAM_EOS)
> +      {
> +        errm("end of stream after %s.\n", strm->stream_pos);
> +        goto end;
> +      }
> +      errm("cannot find next chunk after %s.\n", strm->stream_pos);
> +      goto end;
> +    }
> +  }
> +
> +  /*
> +    Read data chunk.
> +  */

[13] Perhaps add a note explaining that memory for storing the data is allocated 
and freed by backup stream library so that future readers of this code will not 
ask the question which I asked myself: "where the memory for storing the data 
comes from and what happens to it when it is not needed any more?".

> +  snapshot->data.begin= NULL;
> +  snapshot->data.end= NULL;
> +  brc= bstream_rd_data_chunk(&strm->bupstrm, snapshot);
...
> +/*
> +  ===================
> +  Catalog navigation.
> +  ===================
> +*/
> +
> +/*
> +  Catalog items are located by "catalog coordinates". The format of the
> +  catalog coordinates depends on the type of item. It is specified as
> +  follows:
> +
> +  [item position (global)]= [db no.]
> +  [item position (table)]= [ snap no. ! pos in snapshot's table list ]

[14] I think it is the other way around - first position of the table in the
snapshot, then the snapshot number.

> +  [item position (other per-db item)]= [ pos in db item list ! db no. ]
> +  [item position (per-table item)] = [ pos in table's item list !
> +                                       db no. ! table pos ]
> +*/
> +
> +/**
> +  Locate a global object by catalog coordinates.
> +
> +  Catalog coordinates for global objects are:
> +
> +      array         object-type specific array in catalog
> +      pos           position in array
> +
> +  @param[in]    typnam          object type name
> +  @param[in]    array           array reference
> +  @param[in]    pos             position in array
> +
> +  @return       object reference
> +*/
> +
> +struct st_backup_global*
> +backup_locate_global(const char *typnam, DYNAMIC_ARRAY *array, ulong pos)
> +{
> +  struct st_backup_global *bup_global;
> +  DBUG_ENTER("backup_locate_global");
> +  DBUG_ASSERT(typnam);
> +  DBUG_ASSERT(array);
> +  DBUG_PRINT("bupstrm", ("typename: %s  pos: %lu", typnam, pos));
> +
> +  /* Check plausibility of array position. */
> +  ERROR_INJECT("backup_locate_global_pos", pos= array->elements;);
> +  if (pos >= array->elements)
> +  {
> +    errm("non-existent %s position: %lu in catalog.\n", typnam, pos);
> +    bup_global= NULL;
> +    goto end;
> +  }
> +
> +  /*
> +    Get a reference to the global object.
> +    It does not fail because we checked that pos is in range.
> +    Note that the array contains pointers only.
> +  */
> +  bup_global= *((struct st_backup_global**) dynamic_array_ptr(array, pos));

[15] Suggestion: Here and in other places, check your assumption with assert:

   DBUG_ASSERT(bup_global);

> +  DBUG_PRINT("bupstrm",
> +             ("located %s: '%.*s'  item: 0x%lx  pos: %lu", typnam,
> +              BBLS(&bup_global->glb_item.name), (long) bup_global,
> +              bup_global->glb_item.pos));
> +
> +  /* Check consistency of array position. */
> +  if (bup_global->glb_item.pos != pos)
> +  {
> +    /* purecov: begin inspected */
> +    errm("position of stored %s item: %lu  "
> +         "does not match position in catalog array: %lu\n",
> +         typnam, bup_global->glb_item.pos, pos);

[16] Consider: warn here and continue.

> +    bup_global= NULL;
> +    /* purecov: end */
> +  }
> +
> + end:
> +  DBUG_RETURN(bup_global);
> +}
...
> === added file 'client/backup_stream.h'
> --- a/client/backup_stream.h	1970-01-01 00:00:00 +0000
> +++ b/client/backup_stream.h	2008-11-25 19:33:34 +0000
...
> +static const char       *opt_mysqlbackup_search;
> +enum options_mysqlbackup
> +{
> +  OPT_MYSQLBACKUP_CATALOG_SUMMARY=1024,

[17] Why 1024? What does it have to allign with?

> +  OPT_MYSQLBACKUP_CATALOG_DETAILS,
> +  OPT_MYSQLBACKUP_METADATA_STATEMENTS,
> +  OPT_MYSQLBACKUP_METADATA_EXTRA,
> +  OPT_MYSQLBACKUP_SNAPSHOTS,
> +  OPT_MYSQLBACKUP_DATA_CHUNKS,
> +  OPT_MYSQLBACKUP_DATA_TOTALS,
> +  OPT_MYSQLBACKUP_SUMMARY,
> +  OPT_MYSQLBACKUP_ALL,
> +  OPT_MYSQLBACKUP_EXACT,
> +  OPT_MYSQLBACKUP_SEARCH,
> +
> +  /* End of options terminator. */
> +  OPT_MYSQLBACKUP_LAST
> +};
> +/**
...
> +  Print item information.
> +
> +  @parameter[in]        indent          indentation spaces
> +  @parameter[in]        item_arg        reference to some item
> +  @parameter[in]        what            flags describing what to print
> +                                        PRI_NAME        item name
> +                                        PRI_META        meta data
> +*/
> +
> +void print_item(uint indent, void *item_arg, uint what)

[18] Suggestion: declare item_arg as st_bstream_item_info* type.

...
> +/**
> +  Search backup object.

[19] ... In the given catalog. What is the result of the search and how is it
communicated to the caller?

> +
> +  @param[in]    search_name             search name
> +  @param[out]   search_database_name    database name, may be NULL
> +  @param[out]   search_object_name      object name
> +
> +  @return       status
> +    @retval     0                       ok
> +    @retval     != 0                    error
> +*/
> +
> +void

[20] Bad return type for a function which is supposed to return status...

> +search_object(struct st_backup_catalog *bup_catalog,
> +              char *search_database_name,
> +              char *search_object_name)
> +{
> +  struct st_backup_global       *bup_global;
> +  struct st_backup_database     *bup_database;
> +  struct st_blob                *name;
> +  uint                          search_database_name_len;
> +  uint                          search_object_name_len;
> +  uint                          idx;
> +  DBUG_ENTER("search_object");
> +  DBUG_ASSERT(bup_catalog);
> +  /* search_database_name may be NULL. */
> +  DBUG_ASSERT(search_object_name);
> +
> +  printf("\n");
> +  if (search_database_name)
> +    printf("Searching '%s'.'%s'\n", search_database_name, search_object_name);
> +  else
> +    printf("Searching '%s'\n", search_object_name);
> +
> +  if (search_database_name)
> +    search_database_name_len= strlen(search_database_name);
> +  search_object_name_len= strlen(search_object_name);
> +
> +  if (!search_database_name)
> +  {
> +    DYNAMIC_ARRAY **search_array_p;
> +    DYNAMIC_ARRAY *search_array_array[]= {
> +      &bup_catalog->cat_charsets,
> +      &bup_catalog->cat_users,
> +      &bup_catalog->cat_tablespaces,
> +      NULL
> +    };
> +
> +    for (search_array_p= search_array_array; *search_array_p; search_array_p++)
> +    {
> +      for (idx= 0; idx < (*search_array_p)->elements; idx++)
> +      {
> +        /* Note that the array contains pointers only. */
> +        bup_global= *((struct st_backup_global**)
> +                      dynamic_array_ptr(*search_array_p, idx));
> +        name= &bup_global->glb_item.name;
> +        if (!my_wildcmp(&my_charset_latin1, (char*) name->begin,
> +                        (char*) name->end, (char*) search_object_name,
> +                        (char*) search_object_name + search_object_name_len,
> +                        '\\', '_', '%'))
> +        {
> +          print_item(2, bup_global, PRI_NAME | PRI_META);
> +        }
> +      }
> +    }

[21] Why a separate search loop for database? Looks like it could be handled by
search_array_array loop.

> +
> +    /* Search database. */
> +    for (idx= 0; idx < bup_catalog->cat_databases.elements; idx++)
> +    {
> +      /* Note that the array contains pointers only. */
> +      bup_database= *((struct st_backup_database**)
> +                      dynamic_array_ptr(&bup_catalog->cat_databases, idx));
> +      name= &bup_database->db_item.base.name;
> +      if (!my_wildcmp(&my_charset_latin1, (char*) name->begin,
> +                      (char*) name->end, (char*) search_object_name,
> +                      (char*) search_object_name + search_object_name_len,
> +                      '\\', '_', '%'))
> +      {
> +        print_item(2, bup_database, PRI_NAME | PRI_META);
> +      }
> +    }
> +    goto end;
> +  }
> +
> +  /* Search object within databases. */
> +  for (idx= 0; idx < bup_catalog->cat_databases.elements; idx++)
> +  {
> +    struct st_backup_database   *bup_database;
> +
> +    /* Note that the array contains pointers only. */
> +    bup_database= *((struct st_backup_database**)
> +                    dynamic_array_ptr(&bup_catalog->cat_databases, idx));
> +    name= &bup_database->db_item.base.name;
> +    if (!my_wildcmp(&my_charset_latin1, (char*) name->begin,
> +                    (char*) name->end, (char*) search_database_name,
> +                    (char*) search_database_name + search_database_name_len,
> +                    '\\', '_', '%'))
> +    {
> +      DYNAMIC_ARRAY **search_array_p;
> +      DYNAMIC_ARRAY *search_array_array[]= {
> +        &bup_database->db_tables,
> +        &bup_database->db_perdbs,
> +        NULL
> +      };
> +      struct st_backup_perdb  *bup_perdb;
> +      uint                    jdx;
> +
> +      for (search_array_p= search_array_array;
> +           *search_array_p;
> +           search_array_p++)
> +      {
> +        for (jdx= 0; jdx < (*search_array_p)->elements; jdx++)
> +        {
> +          /* Note that the array contains pointers only. */
> +          bup_perdb= *((struct st_backup_perdb**)
> +                       dynamic_array_ptr(*search_array_p, jdx));
> +          name= &bup_perdb->perdb_item.base.name;
> +          if (!my_wildcmp(&my_charset_latin1, (char*) name->begin,
> +                          (char*) name->end, (char*) search_object_name,
> +                          (char*) search_object_name + search_object_name_len,
> +                          '\\', '_', '%'))
> +          {
> +            print_item(4, bup_perdb, PRI_NAME | PRI_META);
> +          }
> +        }
> +      }
> +    }
> +  }

[22] This code repeats again - perhaps a helper function should be defined.

> +
> + end:
> +  DBUG_VOID_RETURN;
> +}
...
> +enum enum_bstream_ret_codes
> +read_and_print_summary(void *image_handle,
> +                       struct st_backup_catalog *bup_catalog)
> +{
> +  struct st_bstream_image_header        *hdr= &bup_catalog->cat_header;
> +  enum enum_bstream_ret_codes           brc;
> +  DBUG_ENTER("read_and_print_summary");
> +  DBUG_ASSERT(image_handle);
> +  DBUG_ASSERT(bup_catalog);
> +
> +  /*
> +    Read summary.
> +  */
> +  brc= backup_read_summary(image_handle, bup_catalog);
> +  if (brc != BSTREAM_OK)
> +  {
> +    goto end;
> +  }

[23] Our coding guidelines do not allow braces in such situation... :P

> +
> +  /*
> +    Print summary.
> +  */
> +  if (opt_mysqlbackup_summary)
> +  {
> +    printf("\n");
> +    printf("Summary:\n");
> +    printf("\n");
> +    printf("Creation time:       ");
> +    print_time(&hdr->start_time);
> +    printf("\n");
> +    printf("Validity time:       ");
> +    print_time(&hdr->vp_time);
> +    printf("\n");
> +    printf("Finish   time:       ");
> +    print_time(&hdr->end_time);
> +    printf("\n");
> +
> +    /*
> +      Binlog coordinates are present, if the BSTREAM_FLAG_BINLOG flag is
> +      set in the header *and* table data are in the image.
> +      DBUG_ASSERT(!(hdr->flags & BSTREAM_FLAG_BINLOG) ||
> +                  hdr->binlog_pos.file || hdr->binlog_group.file);
> +    */

[24] This was a bug which is fixed now. Now binlog coordinates are always 
present if BSTREAM_FLAG_BINLOG is set. The code should be changed to reflect this.

> +    if ((hdr->flags & BSTREAM_FLAG_BINLOG) &&
> +        (hdr->binlog_pos.file || hdr->binlog_group.file))
> +    {
> +      printf("Binlog coordinates:  %s:%lu\n",
> +             hdr->binlog_pos.file ? hdr->binlog_pos.file : "[NULL]",
> +             hdr->binlog_pos.pos);
> +      printf("Binlog group coords: %s:%lu\n",
> +             hdr->binlog_group.file ? hdr->binlog_group.file : "[NULL]",
> +             hdr->binlog_group.pos);
> +    }
> +    else
> +    {
> +      printf("No binlog information\n");
> +    }
> +  }
> +
> + end:
> +  DBUG_RETURN(brc);
> +}
> +
> +
> +/*
> +  =============
> +  Main program.
> +  =============
> +*/
> +
...
> +  /*
> +    Save image reading time if the rest is not wanted.
> +    Sorry for the spaghetti here. The reason is that it has beed
> +    decided not to display the first two character sets from the
> +    catalog with the other catalog items:
> +    The first two character sets are special:
> +    1. Character set to use for interpreting the backup file.
> +    2. Server character set.
> +    These do not count as catalog items. They are to display in
> +    the header section. To make the output somewhat user-friendly,
> +    we display them before the optional snapshots section.
> +    So if we have to display the snapshot section, but nothing
> +    from the catalog, we can skip catalog reading here.
> +  */
> +  if (opt_mysqlbackup_search ||
> +      opt_mysqlbackup_snapshots ||
> +      opt_mysqlbackup_catalog_summary ||
> +      opt_mysqlbackup_catalog_details ||
> +      opt_mysqlbackup_metadata_statements ||
> +      opt_mysqlbackup_metadata_extra ||
> +      opt_mysqlbackup_data_chunks ||
> +      opt_mysqlbackup_data_totals ||
> +      (opt_mysqlbackup_summary && !(hdr->flags &
> BSTREAM_FLAG_INLINE_SUMMARY)))

[25] Suggestion: wouldn't it be more clear to formulate condition when we *do 
not* want to read the catalog?

> +  {
> +    /*
> +      Read catalog.
> +      This can take some time. So flush what we have so far.
> +    */
> +    fflush(stdout);
> +    brc= backup_read_catalog(image_handle, bup_catalog);
> +    if (brc != BSTREAM_OK)
> +    {
> +      rc= 2;
> +      goto end;
> +    }
> +  }
> +
> +  /*
> +    The first two character sets are special:
> +    1. Character set to use for interpreting the backup file.
> +    2. Server character set.
> +    These do not count as catalog items.
> +    List them separately.
> +  */
> +  if (bup_catalog->cat_charsets.elements >= 2)
> +  {
> +    struct st_backup_global *bup_global;
> +
> +    /* Note that the array contains pointers only. */
> +    bup_global= *((struct st_backup_global**)
> +                  dynamic_array_ptr(&bup_catalog->cat_charsets, 1));
> +    printf("Server charset:      '%.*s'\n",
> BBLS(&bup_global->glb_item.name));
> +  }
> +  if (opt_verbose && (bup_catalog->cat_charsets.elements >= 1))
> +  {
> +    struct st_backup_global *bup_global;
> +
> +    /* Note that the array contains pointers only. */
> +    bup_global= *((struct st_backup_global**)
> +                  dynamic_array_ptr(&bup_catalog->cat_charsets, 0));
> +    printf("Backup image chrset: '%.*s'\n",
> BBLS(&bup_global->glb_item.name));
> +  }
> +
> +  if (opt_mysqlbackup_snapshots)
> +  {
> +    printf("Snapshot count:      %u\n", hdr->snap_count);
> +    printf("\n");
> +    printf("Snapshots:\n");
> +    printf("\n");
> +    for (idx= 0; idx < hdr->snap_count; idx++)

[26] Here and in other places - assert that snap_count < 256 which is the size 
of snapshot[] array (i.e., that you don't exceed array sizes).

> +    {
> +      struct st_bstream_snapshot_info     *snapshot= hdr->snapshot + idx;
> +      const char                          *snap_type;
> +
> +      switch (snapshot->type) {
> +      case BI_NATIVE:     snap_type= "native from";
> +        break;
> +      case BI_DEFAULT:    snap_type= "logical from locked tables";
> +        break;
> +      case BI_CS:         snap_type= "logical from consistent snapshot";
> +        break;
> +      /* purecov: begin inspected */
> +      case BI_NODATA:     snap_type= "nodata";
> +        break;
> +      default:            snap_type= "unknown/illegal";
> +        break;
> +      /* purecov: end */
> +      }
> +      if (snapshot->type == BI_NATIVE)
> +        printf("  Snapshot %u type    %s '%.*s'  version %u.%u\n",
> +               idx, snap_type,
> +               BBLS(&snapshot->engine.name),
> +               snapshot->engine.major, snapshot->engine.minor);
> +      else
> +        printf("  Snapshot %u type    %s\n", idx, snap_type);
> +      printf("  Snapshot %u version %u\n", idx, snapshot->version);
> +      if (snapshot->options)
> +        printf("  Snapshot %u options %u\n",
> +               idx, snapshot->options); /* purecov: inspected */

[27] I think the options should be printed as hex number (so that user can see 
the bits).

> +      printf("  Snapshot %u tables  %lu\n", idx, snapshot->table_count);
> +      /*
> +        List tables included in this snapshot.
> +        If there is a table, there must also be a database.
> +      */
> +      DBUG_ASSERT(!snapshot->table_count ||
> +                  bup_catalog->cat_databases.elements);
> +      if (snapshot->table_count)
> +      {
> +        uint kdx;
> +

[28] Suggestion: Below you iterate over databases and tables in each database to 
collect tables belonging to a given snapshot. But I think it would be simpler 
and more direct to iterate over snap_index_pos_to_table array of the given snapshot.

> +        for (kdx= 0; kdx < bup_catalog->cat_databases.elements; kdx++)
> +        {
> +          struct st_backup_database   *bup_database;
> +          struct st_blob *db_name;
> +
> +          /* Note that the array contains pointers only. */
> +          bup_database= *((struct st_backup_database**)
> +                          dynamic_array_ptr(&bup_catalog->cat_databases,
> kdx));
> +          db_name= &bup_database->db_item.base.name;
> +          if (bup_database->db_tables.elements)
> +          {
> +            struct st_backup_table  *bup_table;
> +            uint                    jdx;
> +
> +            for (jdx= 0; jdx < bup_database->db_tables.elements; jdx++)
> +            {
> +              struct st_blob *name;
> +
> +              /* Note that the array contains pointers only. */
> +              bup_table= *((struct st_backup_table**)
> +                           dynamic_array_ptr(&bup_database->db_tables,
> jdx));
> +              if (bup_table->tbl_item.snap_num == idx)
> +              {
> +                name= &bup_table->tbl_item.base.base.name;
> +                printf("  Snapshot %u table   '%.*s'.'%.*s'\n",
> +                       idx, BBLS(db_name), BBLS(name));
> +              }
> +            }
> +          }
> +        }
> +      }
> +    }
> +  }
...
> +    /* Charsets. */
> +    if (opt_verbose)
> +    {
> +      /*
> +        The first two character sets are special:
> +        1. Character set to use for interpreting the backup file.
> +        2. Server character set.
> +        These do not count as catalog items.
> +      */
> +      for (idx= 2; idx < bup_catalog->cat_charsets.elements; idx++)
> +      {
> +        /* purecov: begin inspected */
> +        /* Note that the array contains pointers only. */
> +        bup_global= *((struct st_backup_global**)
> +                      dynamic_array_ptr(&bup_catalog->cat_charsets, idx));

[29] Suggestion: Check with assertion that bup_global is not NULL.

> +        print_item(2, bup_global, PRI_NAME);
> +        /* purecov: end */
> +      }
> +    }
...
> +  /*
> +    Print meta data.
> +  */

[29] Suggestion: print metadata info as it is stored in the image.
Here you show metadata in a logical order but I think it would add certain value 
to show it in the order in which it is stored in the image. This would give a 
closer insight into the structure of the image and the purpose of this client is 
to show this structure. Note that the order in which metadata is stored is 
important because it is supposed to obey dependencies between objects. Being 
able to see the actual order will help to diagnose potential problems.

However, I do realize that presenting this information as I suggest would 
require big changes to the structure of this code, so I leave it for your 
judgement if it is needed and worth the effort.

In the backup image, metadata for various objects from the catalogue is stored 
in following order:

- metadata for global items;
- for each database, metadata for tables from that database;
- metadata for all the other per-database objects.

Implementing my suggestion would require printing metadata from 
bcat_create_item() callback function which is called when backup catalogue is 
read by bstream_rd_catalogue().

> +  if (opt_mysqlbackup_metadata_statements || opt_mysqlbackup_metadata_extra)
> +  {
> +    printf("\n");
> +    printf("Meta data:\n");
> +
> +    /* Charsets don't have meta data. */
> +    /* Users don't have meta data. */

[30] Considering changing image specification to allow metadata for all types of 
objects in which case the code should handle it.

...
> === added file 'mysql-test/suite/backup/r/backup_client.result'
> === added file 'mysql-test/suite/backup/r/backup_client_binlog.result'
> === added file 'mysql-test/suite/backup/r/backup_client_coverage.result'
> === added file 'mysql-test/suite/backup/t/backup_client.test'
> === added file 'mysql-test/suite/backup/t/backup_client_binlog.test'
> === added file 'mysql-test/suite/backup/t/backup_client_coverage.test'

I have not reviewed these yet. Will send a separate email with my comments.

Rafal
Thread
bzr commit into mysql-6.0-backup branch (ingo.struewing:2745) WL#4630Ingo Struewing25 Nov
  • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Øystein Grøvlen11 Dec
    • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Ingo Strüwing12 Dec
      • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Rafal Somla19 Dec
        • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Ingo Strüwing10 Feb 2009
          • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Rafal Somla10 Feb 2009
  • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Rafal Somla22 Dec
    • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Ingo Strüwing12 Feb 2009
      • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Rafal Somla17 Feb 2009
        • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Ingo Strüwing17 Feb 2009
  • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Øystein Grøvlen17 Feb 2009
    • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Ingo Strüwing18 Feb 2009
      • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Øystein Grøvlen19 Feb 2009
        • Re: bzr commit into mysql-6.0-backup branch (ingo.struewing:2745)WL#4630Ingo Strüwing21 Feb 2009