List:Commits« Previous MessageNext Message »
From:Ingo Strüwing Date:November 21 2007 9:21pm
Subject:Review II of MyISAM backup [Re: bk commit into 5.2 tree (guilhem:1.2611)]
View as plain text  
Hi Guilhem,

Thank you for adopting some of my ideas!

I do now agree with the design of the patch.

Please see below for some small issues. Most of them are comments about
comments.

Guilhem Bichot, 06.11.2007 20:56:
...
> ChangeSet@stripped, 2007-11-06 20:56:05+01:00, guilhem@stripped +59 -0
>   WL#866 - Online backup driver for MyISAM.
...
>   include/my_global.h@stripped, 2007-11-06 20:55:59+01:00, guilhem@stripped +5 -3
>     When testing a C++ file against "gcc -Wold-style-cast", these casts
>     surfaced. static_cast_C_or_CPP means "a static cast which works in
>     C and in C++, without warnings".

Or "... works in C and C PreProcessor". ;-)
Seriously. Why a long name for such a commonly needed macro?
IMHO, CAST(type) would do the trick. Upper case suggest it's a macro.
Hence one can guess it's for compatibility as many macros are. I have
never seen a dynamic_cast in MySQL, so I would also guess it's static.

...
>   sql/sql_bitmap.h@stripped, 2007-11-06 20:56:00+01:00, guilhem@stripped +2 -2
>     When mysqld starts I get:
>     ==25833== Conditional jump or move depends on uninitialised value(s)
>     ==25833==    at 0x8185A2C: Bitmap<64>::is_set(unsigned) const
> (sql_bitmap.h:189)
...
>     this difference does not look normal; I add here an initialization in
>     Bitmap<64>'s default constructor, which removes the problem. This
>     problem is specific of 5.2 as ha_myisam::keys_with_parts appeared in 5.2.

As a reviewer I am thankful for the reasoning, but a file comment like
this should not be in the final push.

IMHO a file comment should say _what_ was changed. The reason, _why_ the
new code is required, is better explained in a code comment. A plain
initialization doesn't require much reasoning.

There are more comments affected.

...
>   storage/myisam/mi_log.c@stripped, 2007-11-06 20:56:01+01:00, guilhem@stripped
> +768 -80
...
>     If _myisam_log_command() uses small numbers (file descriptor,
>     file offset buffer length) it stores them in few bytes; if they are big
>     it stores them in more bytes; how many bytes are used is denoted
>     by the highest bit of the 'command' (MI_LOG_BIG_NUMBERS).

This is something we could tune. And if we do, we must do it before the
first push.

You have:
  +  if (info->dfile >= UINT_MAX16 || filepos >= UINT_MAX32 ||
  +      length >= UINT_MAX16)
  +  {
  +    buff[0]= ((uchar) command) | MI_LOG_BIG_NUMBERS;

I think dfile >= UINT_MAX16 applies quite seldom. But how do we estimate
the probability of the other two conditions? Do they have about the same
probability? It is just a feeling, but I feel that 'filepos' might be
above 4G much more often that reclen > 64K. Our default data pointer
size is 6 bytes, because the former default of 4 bytes was too small way
too often. When writes happen to a table, I believe it will go to the
highest positions with higher probability.

OTOH we do not have much time to tune this value well. But a quick poll
of knowlageable people might be better than nothing.

Just to exclude any doubts what I mean: I want to tune the number of
bytes to write for 'filepos' and/or 'length' in a "small" log header to
keep the avarage header length as small as possible.

My vote is 5 bytes for filepos.

...
>   storage/myisam/myisamdef.h@stripped, 2007-11-06 20:56:01+01:00, guilhem@stripped
> +111 -18
...
>     It is read and set with atomic operations: this provides the needed
>     synchronization between the table-writer thread (if the table is being
>     written now) (which is the reader of "physical_logging") and the thread
>     turning physical logging on or off (which is the writer of
>     "physical_logging").

This is very good information. It's a pity that the code comment
contains just a small summary of it. Perhaps I can learn something
again. How do you proceed when trying to understand foreign code? Do you
search for the changeset that introduced it and hope it's explained
there? And what helps you most when you brood about a complicated merge
of foreign code?

What I did up to now was to hope for good code comments and, if
available, read worklog specifications (what the community cannot always
do).

During merges I am always thankful when I find comments like "I added",
"I removed", ... Excessive explanations in file comments don't help much
in this situation.

When searching for a changeset relating to certain places in a file, a
few buzzwords like function or variable names are sufficient too.

In summary, from my way of work it would be better to have terse
comments in the file comments and exhaustive comments in the code. But
perhaps there is a better way of work?

...
> -typedef int (*IO_CACHE_CALLBACK)(struct st_io_cache*);
> +typedef int (*IO_CACHE_CALLBACK)(struct st_io_cache *, const uchar *, uint,
> +                                 my_off_t);

From the patch that extends this type, I can guess what the new
parameters are for. But if I would look at the io_cache independently
from the patch some time later, I probably would have a a hard time to
guess the same.

Can you please give parameter names and perhaps a comment?

...
> +      int ret= my_pwrite(info->file, Buffer, Count, pos,
> +                         info->myflags | MY_NABP);
> +      if (unlikely(ret))
> +        set_hard_write_error(info);

Just a hint. At this place I remember that I often had to add comments
at such places after doing coverage testing. Do you have coverage
testing on your todo list?

...
>        if (!append_cache)
>        {
> +        /*
> +          This post_write is really POST-write; callers depend on this! So
> +          always call it after writing to the file, not before.
> +        */

The comment is correct. However you call post_write() at other places
too, without commenting it there. I suggest either to comment the same
there too, or wipe it out completely. The name "post_write" is clear
enough, I think. Or did you want to say something that I missed (the
word "really" makes me suspect I could have missed something here).

...
> -/*
> -  These functions handle keyblock cacheing for ISAM and MyISAM tables.
...
> -  as such gives the maximum number of in-use blocks at any time.
> +/**
> +   @file
> +   These functions handle keyblock cacheing for ISAM and MyISAM tables.
...
> +   as such gives the maximum number of in-use blocks at any time.
> +*/

You changed the indentation from 2 to 3. This violates our coding
guidelines. This does also apply to many function comments, but I don't
repeat this complaint ans more.

...
> +/**
> +   Does a my_pwrite() to the file and then calls callback. Arguments are those
...
> +     @retval !=0    write or callback failed
> +*/
> +static inline int key_cache_pwrite(int Filedes, const uchar *Buffer,

The coding guidelines say that there shall be an empty line between
function comment and its definition.

...
> +  int ret, physical_logging= mi_get_physical_logging_state(share);
> +  DBUG_ASSERT(!physical_logging ||
> +              my_seek(share->kfile, 0L, MY_SEEK_END, MYF(0)) == new_length);
> +  ret= my_chsize(share->kfile, new_length, 0, MYF(0));
> +  if (physical_logging && mi_log_index_pages_physical)
> +    myisam_log_chsize_kfile_physical(share, new_length);

Shouldn't you have an atomic read here? Doesn't storing the value in a
local variable invalidate any attempt to assure that you can never read
the wrong value at the right moment?

...
> +   Function to display and apply a MyISAM logical or physical log to tables.
> +
> +   Prints what is in a MyISAM (logical or physical) log, optionally applies
> +   the changes to tables (all tables or only a set specified on the command
> +   line). Is used both by the standalone program myisamlog and by the restore
> +   code of the MyISAM online backup driver.

Er? The command line? The function reads it from 'param', I think. For
the myisamlog program it comes from the commandline. But for the server?

...
> +    command=(uint) head_ptr[0];
> +    command-= (big_numbers= (command & MI_LOG_BIG_NUMBERS));
> +    if (my_b_read(&cache, head, head_len[command][big_numbers] - 1))

Did you test with big records? I would assume that 'big_numbers' becomes
128 from the expression "command & MI_LOG_BIG_NUMBERS". And you subtract
this number from 'command' which is even more an evidence for this fact.

I suggest to use head_len[command][big_numbers ? 1 : 0]. At all other
places, the exact value of 'big_numbers' doesn't matter.

...
> +   Both logs:
> +   - are idempotent

Hm. You are using this term quite frequently. Unfortunately I don't
exactly get the meaning. A vague guess might be that applying a log
multiple times reproduces the same table contents again. But there might
be more to it.

...
> +/** the log_type global variable is probably legacy, it's always 0 now */

I would use the word 'obsolete' here, but I'm not native English either.

>  static int log_type=0;

...
> +#ifndef HAVE_MYISAM_PHYSICAL_LOGGING
> +  DBUG_ASSERT(0);
> +  DBUG_RETURN(1);
> +#endif

Wouldn't it be better to put the rest of the function in the #else
branch? It might avoid a compiler warning about unused code.

> +
> +  /* starting/stopping are complex operations so split in functions */
> +  switch (action)
>    {
> -    error=my_close(myisam_log_file,MYF(0)) ? my_errno : 0 ;
> -    myisam_log_file= -1;
> +  case MI_LOG_ACTION_OPEN:
> +    error= mi_log_start_physical(log_filename, tables);
> +    break;
> +  case MI_LOG_ACTION_CLOSE_CONSISTENT:
> +  case MI_LOG_ACTION_CLOSE_INCONSISTENT:
> +    error= mi_log_stop_physical(action);
> +    break;
> +  default:
> +    DBUG_ASSERT(0);
>    }
>    DBUG_RETURN(error);
>  }

...
> +    log         = &myisam_logical_log;
> +  else
> +  {
> +    DBUG_ASSERT(type == MI_LOG_PHYSICAL);
> +    log         = &myisam_physical_log;

Leftspace at assignment operator, twice.

...
> +  if (likely(my_b_inited(log) != NULL))
> +  {
> +    if (!logical)
> +    {
...
> +    }
> +    /*
> +      Don't be fooled: physical logging does not work with --external-locking
> +      (the physical log is private to the one mysqld). The my_lock() is
> +      relevant only for the logical log.
> +    */

... and thus we could place an 'else' here? Save the overhead of a
function call which isn't needed anyway?

> +    error=my_lock(log->file,F_WRLCK,0L,F_TO_EOF,MYF(MY_SEEK_NOT_DONE));

...
> +    VOID(my_b_write(log, buff, bufflen));
> +    if (buffert)
> +      VOID(my_b_write(log, buffert, length));

Sigh. If 'buff' would be called 'header', it would have saved me ~ 10
seconds. ;-) But it's from old code, so feel free to ignore this sigh.

...
> +   All writes (my_write, my_pwrite, memcpy to mmap'ed area, my_chsize) to the
> +   data or index file are done this way:
> +   @code
> +   {
> +     write_to_data_or_index_file;
> +     if ((atomic read of MYISAM_SHARE::physical_logging)==1)

It is just a comment. But wouldn't != 0 be more correct anyway?

> +       write log record to physical log;

...
> +  if (action == MI_LOG_ACTION_CLOSE_CONSISTENT)
> +  {
> +    for (list_item= myisam_open_list; list_item; list_item= list_item->next)
> +    {
> +      MI_INFO *info= (MI_INFO*)list_item->data;
> +      MYISAM_SHARE *share= info->s;
> +      /*
> +        Setting of the variable below always happens under THR_LOCK_myisam,
> +        which we have here, so we don't need atomic operations to read here.
> +      */
> +      if (!info->s->physical_logging)

You could use 'share' here.

> +        continue;
> +      /*
> +        Must take intern_lock, at least because key cache isn't safe if two
> +        calls to flush_key_blocks_int() run concurrently on the same file.
> +      */
...
> +      pthread_mutex_unlock(&share->intern_lock);
> +    } /* ... for (list_item=...) */
> +  } /* ... if (action == MI_LOG_ACTION_CLOSE_CONSISTENT) */

This loop is correct and necessary IMHO. We should not change the patch
here for the first version. But fo a later optimization we might
consider to repeat the same loop before taking THR_LOCK_myisam. This
could flush most of the data without blocking all tables. The above loop
could then run faster and thus shorten the big lock. Of course this
depends on how we estimate the "traffic" on the tables during the backup.

...
>        if (info->opt_flag & READ_CACHE_USED)
>        {
> +        /* QQ Why do we flush a READ_CACHE? it's a no-op */

I don't know either.

>  	if (flush_io_cache(&info->rec_cache))

...
> +   Here is how the MyISAM online backup works.
> +   It is online because we dirtily copy the data and index files,
> +   and the tables maintain a physical idempotent log of changes done to them
> +   during the copy process, applying this log to the dirty copy yields a clean
> +   table corresponding to how the original table was when logging ended.
> +
> +   A condition for this to work is that any update done to a table after the
> +   copy process started must be present in the log. So if an update is
> +   proceeding, we must wait for it to go to the log before starting the copy
> +   process.

Tha last sentence doesn't make much sense to me. Things that happen to
the table "before starting the copy process" do _not_ need to be in the
log if I understand correctly.

Only after finishing with an update we must check if the copy process
started already. Then it could be that not all of the update is in the
copy. Hence we must log it then.

Or did you mean "if an update is  proceeding, we must wait for it to go
to the log before ending the backup process" ?

...
> +retry:
> +  pthread_mutex_lock(&THR_LOCK_myisam);
> +  if (lock_state != LOCK_ERROR) // thread not already dead, kill it
> +  {
> +    /*
> +      If the locking thread had no time to create THD (very unlikely), wait
> +      for it.
> +    */
> +    if (unlikely(!lock_thd))
> +    {
> +      pthread_mutex_unlock(&THR_LOCK_myisam);
> +      sleep(1);
> +      goto retry;

So either the thread must be created or lock_state become LOCK_ERROR. If
there is any chance that both do not happen, then it could be a last
resort to escape the loop with thd->killed.

...
> +#if 0
> +    /*
> +      Of course driver shouldn't pollute thd's state like this, it's just for
> +      debugging now.
> +    */
> +    thd->proc_info="MyISAM dump going to start";
> +#endif

Could you please replace '0' by something?

...
> +/**
> +   Launches a separate thread ("locking thread") which will lock
> +   tables. Locking in a separate thread is needed to have a non-blocking
> +   prelock() (given that thr_lock() is blocking).

Here I would appreciate to see a brief explanation why we do not need to
wait here or a reference to another place like in your other email:

It is explained in the beginning of myisam_backup_driver.cc:

   Now the Backup::prelock() request comes.
   To finish the backup, we need to synchronize (=read-lock) all tables
   ...
   thread, and when finally that thread has managed to get its locks, we
stop
   logging and reply backup::READY.
   ...

...
> +  /*
> +    Note: we are copying an index file of a table, which may have instances in
> +    the MySQL table cache, so after restore it will show up as
> +    "warning: 1 client is using or hasn't closed the table properly".
> +    Maybe do a quick index update on the table at the end of restore to
> +    remove this warning. But how to know if the problem pre-dates backup ?

@todo If this is a problem, one could compare TABLE_SHARE::ref_count and
MYISAM_SHARE::state.open_count at backup begin and log the difference.
This may be more complicated though. Not all cached TABLE objects need
to have open handlers.

But fixing this after restore seems more appropriate. When rebuilding
indexes, we repair anyway. So we could do a quick repair even when
copying indexes.

...
> +result_t File_backup::get_data(Buffer &buf)
> +{
> +  int       res;
> +  uint      howmuch= buf.size;
> +  result_t  ret= backup::OK;
> +
> +  DBUG_ENTER("myisam_backup::File_backup::get_data");
> +
> +  buf.size= 1;
> +  *buf.data= static_cast<uchar>(file_code);
> +  howmuch--;
> +  DBUG_ASSERT(howmuch > 0);

Too late? If buf.size was 0, it is now quite > 0.

...
> +  // for when we have "end of stream" notifications:
> +#if 0

Please replace '0' by something. E.g. TODO_HAVE_END_OF_STREAM

...
> +  DBUG_ENTER("myisam_backup::Table_restore::post_restore");
> +
> +  if (!rebuild_index)
> +    DBUG_RETURN(backup::OK); // nothing to do

@todo Perhaps a quick repair to get rid of the "open_count"?

...
> +  check_opt.init();
> +  check_opt.flags|= T_VERY_SILENT | T_CALC_CHECKSUM | T_QUICK;

I suggest to consider T_REP_BY_SORT. And perhaps even
thd->variables.myisam_repair_threads.

...

Regards
Ingo
-- 
Ingo Strüwing, Senior Software Developer
MySQL GmbH, Dachauer Str. 37, D-80335 München
Geschäftsführer: Kaj Arnö - HRB München 162140


Thread
bk commit into 5.2 tree (guilhem:1.2611)Guilhem Bichot6 Nov
  • Review II of MyISAM backup [Re: bk commit into 5.2 tree (guilhem:1.2611)]Ingo Strüwing21 Nov
    • Re: Review II of MyISAM backup [Re: bk commit into 5.2 tree(guilhem:1.2611)]Guilhem Bichot27 Nov
      • Re: Review II of MyISAM backup [Re: bk commit into 5.2 tree (guilhem:1.2611)]Ingo Strüwing27 Nov
  • Re: bk commit into 5.2 tree (guilhem:1.2611) WL#866Guilhem Bichot22 Nov