MySQL Lists are EOL. Please join:

List:Commits« Previous MessageNext Message »
From:Rafal Somla Date:September 2 2008 2:28pm
Subject:Re: bzr commit into mysql-6.0-backup branch (oystein.grovlen:2688)
Bug#39089 WL#4384
View as plain text  
STATUS
------
Changes requested

REQUESTS
--------
  1. Fix the problem with errors only partially logged by internal server 
functions (see below).
  2. Change the way errors in Global_iterator constructor are indicated. Also, 
make Backup_info::get_global() return NULL if the iterator could not be 
correctly constructed, instead of checking this outside.
  3. Change the error message for binlog position read operation since it can 
happen only during BACKUP operation.
  4. Try to consistently use "// Never errors" comment.
  5. Add information in Backup_restore_ctx::lock_tables_for_restore() that 
mk_table_list() logs errors (so we don't have to do it there).
  6. In Backup_restore_ctx::close() make sure that m_error is set as in the 
original code.
  7. In Inpit_stream::close() document return values.

COMMENTARY
----------

In few places you rely on internal functions (such as memory allocators) logging 
errors by pushing them on the error stack. This is not enough for our purposes. 
In online backup the primary destination of error messages are the online_backup 
log tables in mysql database. We should assume that something is logged only if 
backup::Logger::report_error() has been called. Only this ensures that the 
message has reached all the required destinations (including the error stack).

So, if a memory allocation function has pushed an error on the error stack, this 
is not yet reported from the perspective of online backup. The code which called 
the allocation function should detect the error and log it using the 
backup::Logger class.


Another issue: You wrote that init_alloc_root() never errors. This is strange - 
what if there  is not enough memory in the system?

SUGGESTIONS
-----------

1. Inside Backup_restore_ctx::restore_triggers_and_events() do validity checks 
after DBUG_ENTER().

2. Don not interrupt if close() fails inside Stream::open() methods.

3. Add new error messages at the end of errmsg.txt

DETAILS
-------

>       BUG#39089/WL#4384 Fix missing error handling in backup code.
>       Make sure errors returned from functions are reports are handled and/or
> logged.
> modified:
>   sql/backup/backup_aux.h
>   sql/backup/backup_info.cc
>   sql/backup/backup_kernel.h
>   sql/backup/data_backup.cc
>   sql/backup/image_info.cc
>   sql/backup/image_info.h
>   sql/backup/kernel.cc
>   sql/backup/stream.cc
>   sql/backup/stream.h
>   sql/mdl.cc
>   sql/share/errmsg.txt
> 
> per-file messages:
>   sql/backup/backup_aux.h
>     Report failures to allocate locks.
>   sql/backup/backup_info.cc
>     Report&handle allocation errors.
>     Require Global_iterator constructor take Backup_info parameter instead of
>     Image_info in order to be able to log errors.
>   sql/backup/backup_kernel.h
>     Made unlock_tables() void since it never returns errors.
>   sql/backup/data_backup.cc
>     Detect&Report error.
>     Make Block:writer:drop_buf() void since it never errors.
>     Add comments to explain where error handling is not necessary.
>   sql/backup/image_info.cc
>     Add comments to explain where error handling is not needed.
>   sql/backup/image_info.h
>     Make functions that never return errors void.
>     Add comments on error handling.
>   sql/backup/kernel.cc
>     Detect and handle previously undetected errors.
>     Add comments on error handling.
>   sql/backup/stream.cc
>     Add error handling for close methods.
>   sql/backup/stream.h
>      Close methods should return bool in order to be able to report errors.
>   sql/mdl.cc
>     More specific return value documentation for mdl_alloc_lock().
>   sql/share/errmsg.txt
>     New error codes for backup error handling.


> === modified file 'sql/backup/backup_aux.h'
> --- a/sql/backup/backup_aux.h	2008-08-21 19:28:49 +0000
> +++ b/sql/backup/backup_aux.h	2008-09-01 13:02:20 +0000
> @@ -6,11 +6,6 @@
>   
>    @brief Auxiliary declarations used in online backup code.
>  
> -  @todo Fix error detection in places marked with "FIXME: detect errors...". 
> -  These are places where functions or methods are called and if they can 
> -  report errors it should be detected and appropriate action taken. If callee 
> -  never reports errors or we want to ignore errors, a comment explaining this
> -  should be added.
>  */ 
>  
>  typedef st_plugin_int* storage_engine_ref;
> @@ -140,8 +135,8 @@ class String: public ::String
>  };
>  
>  inline
> -void set_table_list(TABLE_LIST &tl, const Table_ref &tbl, 
> -                    thr_lock_type lock_type, MEM_ROOT *mem)
> +int set_table_list(TABLE_LIST &tl, const Table_ref &tbl,
> +                   thr_lock_type lock_type, MEM_ROOT *mem)
>  {
>    DBUG_ASSERT(mem);
>  
> @@ -149,8 +144,12 @@ void set_table_list(TABLE_LIST &tl, cons
>    tl.db= const_cast<char*>(tbl.db().name().ptr());
>    tl.lock_type= lock_type;
>  
> -  // FIXME: detect errors (if NULL returned).
>    tl.mdl_lock_data= mdl_alloc_lock(0, tl.db, tl.table_name, mem); 
> +  if (!tl.mdl_lock_data)                    // Failed to allocate lock
> +  {
> +    return 1;
> +  }
> +  return 0;
>  }
>  
>  inline
> @@ -165,7 +164,10 @@ TABLE_LIST* mk_table_list(const Table_re
>       return NULL;
>  
>    bzero(ptr, sizeof(TABLE_LIST));
> -  set_table_list(*ptr, tbl, lock_type, mem);
> +  if (set_table_list(*ptr, tbl, lock_type, mem)) // Failed to allocate lock
> +  {
> +    return NULL;
> +  }
>  
>    return ptr;
>  }
> 

> === modified file 'sql/backup/backup_info.cc'
> --- a/sql/backup/backup_info.cc	2008-08-20 13:23:10 +0000
> +++ b/sql/backup/backup_info.cc	2008-09-01 13:02:20 +0000
> @@ -5,16 +5,6 @@
>    implements algorithm for selecting backup engine used to backup
>    given table.
>    
> -  @todo Fix error detection in places marked with "FIXME: detect errors...". 
> -  These are places where functions or methods are called and if they can 
> -  report errors it should be detected and appropriate action taken. If callee 
> -  never reports errors or we want to ignore errors, a comment explaining this
> -  should be added.
> -
> -  @todo Fix error logging in places marked with "FIXME: error logging...". In 
> -  these places it should be decided if and how the error should be shown to the
> -  user. If an error message should be logged, it can happen either in the place
> -  where error was detected or somewhere up the call stack.
>  */
>  
>  #include "../mysql_priv.h"
> @@ -317,14 +307,15 @@ Backup_info::Backup_info(Backup_restore_
>  
>    bzero(m_snap, sizeof(m_snap));
>  
> -  // FIXME: detect errors if reported.
> -  // FIXME: error logging.
> -  hash_init(&ts_hash, &::my_charset_bin, 16, 0, 0,
> -            Ts_hash_node::get_key, Ts_hash_node::free, MYF(0));
> -  // FIXME: detect errors if reported.
> -  // FIXME: error logging.
> -  hash_init(&dep_hash, &::my_charset_bin, 16, 0, 0,
> -            Dep_node::get_key, Dep_node::free, MYF(0));
> +  if (hash_init(&ts_hash, &::my_charset_bin, 16, 0, 0,
> +                Ts_hash_node::get_key, Ts_hash_node::free, MYF(0))
> +      ||
> +      hash_init(&dep_hash, &::my_charset_bin, 16, 0, 0,
> +                Dep_node::get_key, Dep_node::free, MYF(0)))
> +  {
> +    // Allocation failed.  Error has been logged.
> +    return;
> +  }
>  
>    /* 
>      Create nodata, default, and CS snapshot objects and add them to the 
> @@ -332,35 +323,51 @@ Backup_info::Backup_info(Backup_restore_
>      element on that list, as a "catch all" entry. 
>     */
>  
> -  snap= new Nodata_snapshot(m_ctx);  // reports errors
> -
> -  // FIXME: error logging (in case snap could not be allocated).
> -  if (!snap || !snap->is_valid())
> +  snap= new Nodata_snapshot(m_ctx);             // logs errors
> +  if (!snap)
> +  {
> +    m_ctx.fatal_error(ER_OUT_OF_RESOURCES);
>      return;
> +  }
> +
> +  if (!snap->is_valid())
> +    return;    // Error has been logged by Nodata_snapshot constructor
>  
> -  // FIXME: detect errors if reported.
> -  // FIXME: error logging.
> -  snapshots.push_back(snap);
> +  if (snapshots.push_back(snap))
> +  {
> +    return;                                   // Error has been logged
>  
> -  snap= new CS_snapshot(m_ctx); // reports errors
> +  }
>  
> -  // FIXME: error logging (in case snap could not be allocated).
> -  if (!snap || !snap->is_valid())
> +  snap= new CS_snapshot(m_ctx);                 // logs errors
> +  if (!snap)
> +  {
> +    m_ctx.fatal_error(ER_OUT_OF_RESOURCES);
>      return;
> +  }
>  
> -  // FIXME: detect errors if reported.
> -  // FIXME: error logging.
> -  snapshots.push_back(snap);
> +  if (!snap->is_valid())
> +    return;        // Error has been logged by CS_snapshot constructor
>  
> -  snap= new Default_snapshot(m_ctx);  // reports errors
> +  if (snapshots.push_back(snap))
> +  {
> +    return;                                   // Error has been logged
> +  }
>  
> -  // FIXME: error logging (in case snap could not be allocated).
> -  if (!snap || !snap->is_valid())
> +  snap= new Default_snapshot(m_ctx);            // logs errors
> +  if (!snap)
> +  {
> +    m_ctx.fatal_error(ER_OUT_OF_RESOURCES);
>      return;
> +  }
> +
> +  if (!snap->is_valid())
> +    return;  // Error has been logged by Deafault_snapshot constructor
>  
> -  // FIXME: detect errors if reported.
> -  // FIXME: error logging.
> -  snapshots.push_back(snap);
> +  if (snapshots.push_back(snap))
> +  {
> +    return;                                   // Error has been logged
> +  }
>  
>    m_state= CREATED;
>  }
> @@ -1297,7 +1304,7 @@ class Backup_info::Global_iterator
>  
>   public:
>  
> -  Global_iterator(const Image_info&);
> +  Global_iterator(const Backup_info&);
>  
>   private:
>  
> @@ -1306,11 +1313,15 @@ class Backup_info::Global_iterator
>  };
>  
>  inline
> -Backup_info::Global_iterator::Global_iterator(const Image_info &info)
> +Backup_info::Global_iterator::Global_iterator(const Backup_info &info)
>   :Iterator(info), mode(TABLESPACES), m_it(NULL), m_obj(NULL)
>  {
> -  // FIXME: detect errors (if NULL returned).
>    m_it= m_info.get_tablespaces();
> +  if (!m_it)
> +  {
> +    // Logs error, caller must check that ctx is valid afterwards

I'd prefer to have this indicated without relying on the ctx object. For example 
define ERROR mode for Global_iterator.

> +    info.m_ctx.fatal_error(ER_BACKUP_CAT_ENUM);
> +  }
>    next();       // Note: next() doesn't report errors.
>  }
>  
> 
> === modified file 'sql/backup/backup_kernel.h'
> --- a/sql/backup/backup_kernel.h	2008-08-27 17:30:49 +0000
> +++ b/sql/backup/backup_kernel.h	2008-09-01 13:02:20 +0000
> @@ -127,7 +127,7 @@ class Backup_restore_ctx: public backup:
>    bool m_tables_locked; 
>  
>    int lock_tables_for_restore();
> -  int unlock_tables();
> +  void unlock_tables();
>    
>    friend class Backup_info;
>    friend class Restore_info;
> 

> === modified file 'sql/backup/data_backup.cc'
> --- a/sql/backup/data_backup.cc	2008-08-27 17:30:49 +0000
> +++ b/sql/backup/data_backup.cc	2008-09-01 13:02:20 +0000
> @@ -7,17 +7,6 @@
>    drivers and protocols to create snapshot of the data stored in the tables being
>    backed up.
>  
> -  @todo Fix error detection in places marked with "FIXME: detect errors...". 
> -  These are places where functions or methods are called and if they can 
> -  report errors it should be detected and appropriate action taken. If callee 
> -  never reports errors or we want to ignore errors, a comment explaining this
> -  should be added.
> -
> -  @todo Fix error logging in places marked with "FIXME: error logging...". In 
> -  these places it should be decided if and how the error should be shown to the
> -  user. If an error message should be logged, it can happen either in the place
> -  where error was detected or somewhere up the call stack.
> -
>    @todo Implement better scheduling strategy in Scheduler::step
>    @todo Add error reporting in the scheduler and elsewhere
>    @todo If an error from driver is ignored (and operation retried) leave trace
> @@ -104,7 +93,7 @@ class Block_writer
>  
>    result_t  get_buf(Buffer &);
>    result_t  write_buf(const Buffer&);
> -  result_t  drop_buf(Buffer&);
> +  void      drop_buf(Buffer&);
>  
>    Block_writer(byte, size_t, Output_stream&);
>    ~Block_writer();
> @@ -477,9 +466,10 @@ int write_table_data(THD* thd, Backup_in
>        if (init_size > max_init_size)
>          max_init_size= init_size;
>  
> -      // FIXME: detect errors if reported.
> -      // FIXME: error logging.
> -      inactive.push_back(p);
> +      if (inactive.push_back(p))
> +      {
> +        goto error;                           // Error has been logged
> +      }
>      }
>    }
>  
> @@ -599,9 +589,13 @@ int write_table_data(THD* thd, Backup_in
>        Save binlog information for point in time recovery on restore.
>      */
>      if (mysql_bin_log.is_open())
> -      // FIXME: detect errors if reported.
> -      // FIXME: error logging.
> -      mysql_bin_log.get_current_log(&binlog_pos);
> +      if (mysql_bin_log.get_current_log(&binlog_pos))
> +      {
> +        info.m_ctx.fatal_error(ER_BACKUP_BINLOG,
> +                               info.m_ctx.m_type == backup::Logger::BACKUP 
> +                               ? "BACKUP" : "RESTORE");

This can happen only during BACKUP operation. We never read binlog position 
during RESTORE.

> +        goto error;
> +      }
>  
>      /*
>        Save VP creation time.
> @@ -788,8 +782,7 @@ int Scheduler::step()
>        break;
>  
>      case backup_state::DONE:
> -      // FIXME: detect errors if reported.
> -      p->end();
> +      res= p->end();  // Logs errors, fall-through to error handling below
>  
>      case backup_state::ERROR:
>        remove_pump(p);   // Note: never errors.
> @@ -1014,11 +1007,14 @@ Backup_pump::Backup_pump(Snapshot_info &
>  {
>    DBUG_ASSERT(snap.m_num > 0);
>    m_buf.data= NULL;
> -  // FIXME: detect errors if reported.
> -  bitmap_init(&m_closed_streams,
> +  if (bitmap_init(&m_closed_streams,
>                NULL,
>                1 + snap.table_count(),
> -              FALSE); // not thread safe
> +              FALSE)) // not thread safe
> +  {
> +    state= backup_state::ERROR;  // Created object will be invalid
> +  }
> +
>    m_name= snap.name();
>    if (ERROR == snap.get_backup_driver(m_drv) || !m_drv)
>      state= backup_state::ERROR;
> @@ -1253,10 +1249,9 @@ int Backup_pump::pump(size_t *howmuch)
>  
>          if (m_buf.size > 0)
>            mode= WRITING;
> -        else
> -          // FIXME: detect errors (perhaps ensure that drop_buf never errors).
> -          // FIXME: error logging.
> -          m_bw.drop_buf(m_buf);
> +        else {
> +          m_bw.drop_buf(m_buf);              // Never errors
> +        }

The braces violate our coding conventions. Not that I care much about that but 
others might...

>  
>          break;
>  
> @@ -1274,9 +1269,7 @@ int Backup_pump::pump(size_t *howmuch)
>          state= backup_state::DONE;
>  
>        case BUSY:
> -        // FIXME: detect errors (perhaps ensure that drop_buf never errors).
> -        // FIXME: error logging.
> -        m_bw.drop_buf(m_buf);
> +        m_bw.drop_buf(m_buf);                   // Never errors
>          m_buf_head=NULL;  // thus a new request will be made
>        }
>  
> @@ -1531,9 +1524,7 @@ int restore_table_data(THD *thd, Restore
>            bad_drivers.append(",");
>          bad_drivers.append(info.m_snap[n]->name());
>        }
> -      // FIXME: detect errors.
> -      // FIXME: error logging.
> -      drv[n]->free();
> +      drv[n]->free();                           // Never errors
>      }
>  
>      if (!bad_drivers.is_empty())
> @@ -1551,9 +1542,7 @@ int restore_table_data(THD *thd, Restore
>      if (!drv[n])
>        continue;
>  
> -    // FIXME: detect errors.
> -    // FIXME: error logging (perhaps we don't want to report errors here).
> -    drv[n]->free();
> +    drv[n]->free();  // Does not report errors

Let's use "Never errors" consistently.

>    }
>  
>    DBUG_RETURN(backup::ERROR);
> @@ -1651,13 +1640,12 @@ Block_writer::write_buf(const Buffer &bu
>    method can return it to the buffer pool so that it can be reused for other
>    transfers.
>   */
> -Block_writer::result_t
> +void
>  Block_writer::drop_buf(Buffer &buf)
>  {
>    buf.data= NULL;
>    buf.size= 0;
>    taken= FALSE;
> -  return OK;
>  }
>  
>  } // backup namespace
> 

> === modified file 'sql/backup/image_info.cc'
> --- a/sql/backup/image_info.cc	2008-08-20 13:23:10 +0000
> +++ b/sql/backup/image_info.cc	2008-09-01 13:02:20 +0000
> @@ -8,16 +8,6 @@
>  
>    @brief Implements @c Image_info class and friends.
>  
> -  @todo Fix error detection in places marked with "FIXME: detect errors...". 
> -  These are places where functions or methods are called and if they can 
> -  report errors it should be detected and appropriate action taken. If callee 
> -  never reports errors or we want to ignore errors, a comment explaining this
> -  should be added.
> -
> -  @todo Fix error logging in places marked with "FIXME: error logging...". In 
> -  these places it should be decided if and how the error should be shown to the
> -  user. If an error message should be logged, it can happen either in the place
> -  where error was detected or somewhere up the call stack.
>  */
>  
>  namespace backup {
> @@ -25,8 +15,7 @@ namespace backup {
>  Image_info::Image_info()
>    :data_size(0), m_table_count(0), m_dbs(16, 16), m_ts_map(16,16)
>  {
> -  // FIXME: detect errors if reported.
> -  init_alloc_root(&mem_root, 4 * 1024, 0);
> +  init_alloc_root(&mem_root, 4 * 1024, 0);      // Never errors

This is strange. What if there is not enough memory in the system?

>  
>    /* initialize st_bstream_image_header members */
>  
> @@ -308,10 +297,8 @@ Image_info::add_table(Db &db, const ::St
>  
>    if (snap.add_table(*t, pos))  // reports errors
>      return NULL;
> -  
> -  // FIXME: error logging.
> -  if (db.add_table(*t))
> -    return NULL;
> +
> +  db.add_table(*t);                             // Never errors
>  
>    if (!snap.m_num)
>      snap.m_num= add_snapshot(snap); // reports errors
> 


> === modified file 'sql/backup/image_info.h'
> --- a/sql/backup/image_info.h	2008-08-20 13:23:10 +0000
> +++ b/sql/backup/image_info.h	2008-09-01 13:02:20 +0000
> @@ -4,10 +4,6 @@
>  /**
>    @file
>    
> -  @todo Fix error logging in places marked with "FIXME: error logging...". In 
> -  these places it should be decided if and how the error should be shown to the
> -  user. If an error message should be logged, it can happen either in the place
> -  where error was detected or somewhere up the call stack.
>  */ 
>  
>  #include <si_objects.h>
> @@ -422,7 +418,7 @@ class Image_info::Db
>    obs::Obj* materialize(uint ver, const ::String &sdata);
>    result_t add_obj(Dbobj&, ulong pos);
>    Dbobj*   get_obj(ulong pos) const;
> -  result_t add_table(Table&);
> +  void add_table(Table&);
>    const char* describe(describe_buf&) const;
>  
>   private:
> @@ -811,24 +807,21 @@ void Image_info::save_binlog_pos(const :
>  inline
>  Image_info::Db_iterator* Image_info::get_dbs() const
>  {
> -  // FIXME: error logging (in case allocation fails).
> -  return new Db_iterator(*this);
> +  return new Db_iterator(*this);  // Let caller handle allocation failures

And do the callers handle these failures? This is perhaps a question to check 
when you have another scan of the code.

>  }
>  
>  /// Returns an iterator enumerating all tablespaces stored in backup catalogue.
>  inline
>  Image_info::Ts_iterator* Image_info::get_tablespaces() const
>  {
> -  // FIXME: error logging (in case allocation fails).
> -  return new Ts_iterator(*this);
> +  return new Ts_iterator(*this);  // Let caller handle allocation failures
>  }
>  
>  /// Returns an iterator enumerating all objects in a given database.
>  inline
>  Image_info::Dbobj_iterator* Image_info::get_db_objects(const Db &db) const
>  {
> -  // FIXME: error logging (in case allocation fails).
> -  return new Dbobj_iterator(*this, db);
> +  return new Dbobj_iterator(*this, db); // Let caller handle allocation failures
>  }
>  
>  /********************************************************************
> @@ -1036,7 +1029,7 @@ obs::Obj* Image_info::Dbobj::materialize
>    The table is appended to database's table list.
>   */
>  inline
> -result_t Image_info::Db::add_table(Table &tbl)
> +void Image_info::Db::add_table(Table &tbl)
>  {
>    tbl.next_table= NULL;
>    
> @@ -1047,8 +1040,6 @@ result_t Image_info::Db::add_table(Table
>      last_table->next_table= &tbl;
>      last_table= &tbl;
>    }
> -  
> -  return OK;
>  }
>  
>  /**
> 

> === modified file 'sql/backup/kernel.cc'
> --- a/sql/backup/kernel.cc	2008-08-27 17:30:49 +0000
> +++ b/sql/backup/kernel.cc	2008-09-01 13:02:20 +0000
> @@ -52,17 +52,6 @@
>    } // if code jumps here, context destructor will do the clean-up automatically
>    @endcode
>  
> -  @todo Fix error detection in places marked with "FIXME: detect errors...". 
> -  These are places where functions or methods are called and if they can 
> -  report errors it should be detected and appropriate action taken. If callee 
> -  never reports errors or we want to ignore errors, a comment explaining this
> -  should be added.
> -
> -  @todo Fix error logging in places marked with "FIXME: error logging...". In 
> -  these places it should be decided if and how the error should be shown to the
> -  user. If an error message should be logged, it can happen either in the place
> -  where error was detected or somewhere up the call stack.
> -
>    @todo Use internal table name representation when passing tables to
>          backup/restore drivers.
>    @todo Handle other types of meta-data in Backup_info methods.
> @@ -281,8 +270,10 @@ int send_error(Backup_restore_ctx &log, 
>  /**
>    Send positive reply after a backup/restore operation.
>  
> -  Currently the id of the operation is returned. It can be used to select
> -  correct entries form the backup progress tables.
> +  Currently the id of the operation is returned to the client. It can
> +  be used to select correct entries from the backup progress tables.
> +
> +  @returns 0 on success, error code otherwise.
>  */
>  int send_reply(Backup_restore_ctx &context)
>  {
> @@ -295,34 +286,37 @@ int send_reply(Backup_restore_ctx &conte
>    /*
>      Send field list.
>    */
> -  // FIXME: detect errors if  reported.
> -  // FIXME: error logging.
> -  field_list.push_back(new Item_empty_string(STRING_WITH_LEN("backup_id")));
> -  // FIXME: detect errors if  reported.
> -  // FIXME: error logging.
> -  protocol->send_fields(&field_list, Protocol::SEND_NUM_ROWS |
> Protocol::SEND_EOF);
> +  if (field_list.push_back(new Item_empty_string(STRING_WITH_LEN("backup_id"))))
> +  {
> +    goto err;
> +  }
> +  if (protocol->send_fields(&field_list,
> +                            Protocol::SEND_NUM_ROWS | Protocol::SEND_EOF))
> +  {
> +    goto err;
> +  }
>  
>    /*
>      Send field data.
>    */
> -  // FIXME: detect errors if  reported.
> -  // FIXME: error logging.
> -  protocol->prepare_for_resend();
> -  // FIXME: detect errors if  reported.
> -  // FIXME: error logging.
> -  llstr(context.op_id(), buf);
> -  // FIXME: detect errors if  reported.
> -  // FIXME: error logging.
> -  protocol->store(buf, system_charset_info);
> -  // FIXME: detect errors if  reported.
> -  // FIXME: error logging.
> -  protocol->write();
> -
> -  // FIXME: detect errors if  reported.
> -  // FIXME: error logging.
> -  my_eof(context.thd());
> -  context.report_cleanup();
> +  protocol->prepare_for_resend();               // Never errors
> +  llstr(context.op_id(), buf);                  // Never errors
> +  if (protocol->store(buf, system_charset_info))
> +  {
> +    goto err;
> +  }
> +  if (protocol->write())
> +  {
> +    goto err;
> +  }
> +  my_eof(context.thd());                        // Never errors
> +  context.report_cleanup();                     // Never errors
>    DBUG_RETURN(0);
> +
> + err:
> +  DBUG_RETURN(context.fatal_error(ER_BACKUP_SEND_REPLY,
> +                                  context.m_type == backup::Logger::BACKUP
> +                                  ? "BACKUP" : "RESTORE"));
>  }
>  
>  
> @@ -406,13 +400,10 @@ int Backup_restore_ctx::prepare(LEX_STRI
>    
>    // Prepare error reporting context.
>    
> -  // FIXME: detect errors if  reported.
> -  // FIXME: error logging.
> -  mysql_reset_errors(m_thd, 0);
> +  mysql_reset_errors(m_thd, 0);                 // Never errors
>    m_thd->no_warnings_for_error= FALSE;
> -  // FIXME: detect errors if  reported.
> -  // FIXME: error logging.
> -  save_errors();  
> +
> +  save_errors();                                // Never errors
>  
>  
>    /*
> @@ -566,7 +557,7 @@ Backup_restore_ctx::prepare_for_backup(S
>      Create backup catalogue.
>     */
>  
> -  Backup_info *info= new Backup_info(*this); // reports errors
> +  Backup_info *info= new Backup_info(*this);    // Never errors
>  
>    if (!info)
>    {
> @@ -575,7 +566,7 @@ Backup_restore_ctx::prepare_for_backup(S
>    }
>  
>    if (!info->is_valid())
> -    return NULL;
> +    return NULL;    // Error has been logged by Backup_Info contructor
>  
>    info->save_start_time(when);
>    m_catalog= info;
> @@ -731,14 +722,14 @@ int Backup_restore_ctx::lock_tables_for_
>        backup::Image_info::Table *tbl= snap->get_table(t);
>        DBUG_ASSERT(tbl); // All tables should be present in the catalogue.
>  
> -      // FIXME: detect errors. Don't assert here but report error instead.
> -      // FIXME: error logging.
>        TABLE_LIST *ptr= backup::mk_table_list(*tbl, TL_WRITE, m_thd->mem_root);

Mark this function as "logs errors"...

> -      DBUG_ASSERT(ptr);
> +      if (!ptr)
> +      {
> +        // Failed to allocate (failure has been logged), return error

Hmm, the memory allocation functions push errors on the error stack, but this is 
not the same as backup error reporting. We should say "has been logged" if it 
has been logged via our online backup logging mechanisms.

> +        return ER_OUT_OF_RESOURCES;
> +      }
>  
> -      // FIXME: detect errors if reported.
> -      // FIXME: error logging.
> -      tables= backup::link_table_list(*ptr, tables);      
> +      tables= backup::link_table_list(*ptr, tables); // Never errors
>        tbl->m_table= ptr;
>      }
>    }
> @@ -763,18 +754,18 @@ int Backup_restore_ctx::lock_tables_for_
>  /**
>    Unlock tables which were locked by @c lock_tables_for_restore.
>   */ 
> -int Backup_restore_ctx::unlock_tables()
> +void Backup_restore_ctx::unlock_tables()
>  {
>    // Do nothing if tables are not locked.
>    if (!m_tables_locked)
> -    return 0;
> +    return;
>  
>    DBUG_PRINT("restore",("unlocking tables"));
>  
> -  close_thread_tables(m_thd);
> +  close_thread_tables(m_thd);                   // Never errors
>    m_tables_locked= FALSE;
>  
> -  return 0;
> +  return;
>  }
>  
>  /**
> @@ -791,6 +782,7 @@ int Backup_restore_ctx::unlock_tables()
>   */ 
>  int Backup_restore_ctx::close()

The original implementation of this method was setting m_error member which has 
a meaning - it indicates that the context object is in error state. Your patch 
changes this and I'm not convinced that this will have no bad effects.

>  {
> +  int error= 0;
>    if (m_state == CLOSED)
>      return 0;
>  
> @@ -810,15 +802,10 @@ int Backup_restore_ctx::close()
>    }
>  
>    // unlock tables if they are still locked
> -
> -  // FIXME: detect errors if reported.
> -  unlock_tables();
> +  unlock_tables();                              // Never errors
>  
>    // unfreeze meta-data
> -
> -  // FIXME: detect errors if reported.
> -  // FIXME: error logging.
> -  obs::ddl_blocker_disable();
> +  obs::ddl_blocker_disable();                   // Never errors
>  
>    // restore thread options
>  
> @@ -826,10 +813,11 @@ int Backup_restore_ctx::close()
>  
>    // close stream
>  
> -  if (m_stream)
> -    // FIXME: detect errors if reported.
> -    // FIXME: error logging.
> -    m_stream->close();
> +  if (m_stream && !m_stream->close())
> +  {
> +    // Note error, but complete clean-up
> +    error= ER_BACKUP_CLOSE;
> +  }
>  
>    if (m_catalog)
>      m_catalog->save_end_time(when); // Note: no errors.
> @@ -861,19 +849,27 @@ int Backup_restore_ctx::close()
>       */
>      if (res && my_errno != ENOENT)
>      {
> -      report_error(ER_CANT_DELETE_FILE, m_path, my_errno);
> -      if (!m_error)
> -        m_error= ER_CANT_DELETE_FILE;
> +      error= ER_CANT_DELETE_FILE;

Originally I was setting m_error to ER_CANT_DELETE_FILE only if no error was 
detected earlier. This is because in the situation when we have detected an 
error and then have another error while deleting the file, it is more important 
to report the first error IMO.

>      }
>    }
>  
> -  // We report completion of the operation only if no errors were detected.
> -
> -  if (!m_error)
> -    report_stop(when, TRUE);
> +  /* We report completion of the operation only if no errors were detected,
> +     and logger has been initialized.
> +  */
> +  if (!error)
> +  {
> +    if (backup::Logger::m_state == backup::Logger::RUNNING)
> +    {
> +      report_stop(when, TRUE);
> +    }
> +  }
> +  else
> +  {
> +    fatal_error(error);                         // Log error
> +  }
>  
>    m_state= CLOSED;
> -  return m_error;
> +  return error;
>  }
>  
>  /**
> @@ -901,9 +897,7 @@ int Backup_restore_ctx::do_backup()
>  
>    DEBUG_SYNC(m_thd, "before_backup_meta");
>  
> -  // FIXME: detect errors if reported.
> -  // FIXME: error logging.
> -  report_stats_pre(info);
> +  report_stats_pre(info);                       // Never errors
>  
>    DBUG_PRINT("backup",("Writing preamble"));
>    DEBUG_SYNC(m_thd, "backup_before_write_preamble");
> @@ -918,7 +912,7 @@ int Backup_restore_ctx::do_backup()
>  
>    DEBUG_SYNC(m_thd, "before_backup_data");
>  
> -  if (write_table_data(m_thd, info, s)) // reports errors
> +  if (write_table_data(m_thd, info, s)) // logs errors
>      DBUG_RETURN(send_error(*this, ER_BACKUP_BACKUP));
>  
>    DBUG_PRINT("backup",("Writing summary"));
> @@ -929,9 +923,7 @@ int Backup_restore_ctx::do_backup()
>      DBUG_RETURN(m_error);
>    }
>  
> -  // FIXME: detect errors if reported.
> -  // FIXME: error logging.
> -  report_stats_post(info);
> +  report_stats_post(info);                      // Never errors
>  
>    DBUG_PRINT("backup",("Backup done."));
>    DEBUG_SYNC(m_thd, "before_backup_done");
> @@ -959,8 +951,11 @@ int Backup_restore_ctx::restore_triggers
>  
>    DBUG_ASSERT(m_catalog);
>  
> -  // FIXME: detect errors (when dbit==NULL). Perhaps just assert.
> -  Image_info::Iterator *dbit= m_catalog->get_dbs();
> +  Image_info::Iterator *dbit=m_catalog->get_dbs();
> +  if (!dbit)
> +  {
> +    return fatal_error(ER_BACKUP_CAT_ENUM);
> +  }

Maybe better to do this check after all the variables have been declared and 
after DBUG_ENTER().

>    Image_info::Obj *obj;
>    List<Image_info::Obj> events;
>    Image_info::Obj::describe_buf buf;
> @@ -971,18 +966,20 @@ int Backup_restore_ctx::restore_triggers
>    
>    while ((obj= (*dbit)++)) 
>    {
> -    // FIXME: detect errors (when it==NULL). Perhaps just assert.
> -    Image_info::Iterator *it= 
> +    Image_info::Iterator *it=
>                     
> m_catalog->get_db_objects(*static_cast<Image_info::Db*>(obj));
> -
> +    if (!it) {
> +      DBUG_RETURN(fatal_error(ER_BACKUP_LIST_DB_TABLES));
> +    }
>      while ((obj= (*it)++))
>        switch (obj->type()) {
>        
>        case BSTREAM_IT_EVENT:
>          DBUG_ASSERT(obj->m_obj_ptr);
> -        // FIXME: detect errors if reported.
> -        // FIXME: error logging.
> -        events.push_back(obj);
> +        if (events.push_back(obj))
> +        {
> +          DBUG_RETURN(ER_OUT_OF_RESOURCES);     // Error has been logged

Not to all the online backup error destinations.

> +        }
>          break;
>        
>        case BSTREAM_IT_TRIGGER:
> @@ -1044,27 +1041,24 @@ int Backup_restore_ctx::do_restore()
>    Input_stream &s= *static_cast<Input_stream*>(m_stream);
>    Restore_info &info= *static_cast<Restore_info*>(m_catalog);
>  
> -  // FIXME: detect errors if reported.
> -  // FIXME: error logging.
> -  report_stats_pre(info);
> +  report_stats_pre(info);                       // Never errors
>  
>    DBUG_PRINT("restore", ("Restoring meta-data"));
>  
> -  // FIXME: detect errors if reported.
> -  disable_fkey_constraints();  // reports errors
> +  disable_fkey_constraints();                   // Never errors
>  
>    if (read_meta_data(info, s))
>    {
> -    // FIXME: detect errors.
> -    // FIXME: error logging.
> -    m_thd->main_da.reset_diagnostics_area();
> +    m_thd->main_da.reset_diagnostics_area();    // Never errors
>  
>      fatal_error(ER_BACKUP_READ_META);
>      DBUG_RETURN(m_error);
>    }
>  
> -  // FIXME: detect errors.
> -  s.next_chunk();
> +  if (s.next_chunk() == BSTREAM_ERROR)
> +  {
> +    DBUG_RETURN(fatal_error(ER_BACKUP_NEXT_CHUNK));
> +  }
>  
>    DBUG_PRINT("restore",("Restoring table data"));
>  
> @@ -1074,21 +1068,16 @@ int Backup_restore_ctx::do_restore()
>      It should be fixed inside object services implementation and then the
>      following line should be removed.
>     */
> -  // FIXME: detect errors.
> -  // FIXME: error logging.
> -  close_thread_tables(m_thd);
> -  // FIXME: detect errors.
> -  // FIXME: error logging.
> -  m_thd->main_da.reset_diagnostics_area();
> +  close_thread_tables(m_thd);                   // Never errors
> +  m_thd->main_da.reset_diagnostics_area();      // Never errors  
>  
> -  if (lock_tables_for_restore()) // reports errors
> +  if (lock_tables_for_restore())                // logs errors
>      DBUG_RETURN(m_error);
>  
>    // Here restore drivers are created to restore table data
>    err= restore_table_data(m_thd, info, s); // reports errors
>  
> -  // FIXME: detect errors if reported.
> -  unlock_tables();
> +  unlock_tables();                              // Never errors
>  
>    if (err)
>      DBUG_RETURN(ER_BACKUP_RESTORE);
> @@ -1117,16 +1106,10 @@ int Backup_restore_ctx::do_restore()
>      It should be fixed inside object services implementation and then the
>      following line should be removed.
>     */
> -  // FIXME: detect errors.
> -  // FIXME: error logging.
> -  close_thread_tables(m_thd);
> -  // FIXME: detect errors.
> -  // FIXME: error logging.
> -  m_thd->main_da.reset_diagnostics_area();
> -
> -  // FIXME: detect errors if reported.
> -  // FIXME: error logging.
> -  report_stats_post(info);
> +  close_thread_tables(m_thd);                   // Never errors
> +  m_thd->main_da.reset_diagnostics_area();      // Never errors
> +
> +  report_stats_post(info);                      // Never errors
>  
>    DBUG_RETURN(0);
>  }
> @@ -1530,10 +1513,12 @@ void* bcat_iterator_get(st_bstream_image
>    case BSTREAM_IT_GLOBAL:   // all global objects
>    {
>      Iterator *it= info->get_global();
> -  
>      if (!it)
>      {
>        info->m_ctx.fatal_error(ER_BACKUP_CAT_ENUM);
> +    }
> +    if (!info->m_ctx.is_valid())  // Fatal error has occurred
> +    {
>        return NULL;
>      }

I'd prefer to have get_globl() resturn NULL if there any problems with creating 
the iterator. In case Global_iterator constructor fails, please indicate it in 
the constructed instance (for example you can define new ERROR mode). Otherwise 
replace constructor with an initialization function. This is to avoid the 
situation where the Logger is used to detect errors in constructors - I think 
this is not a good practice, but convince me if you think otherwise.

Some_class instance(ctx);

/*
   Id like to know that instance is not properly constructed without
   relying on the fact that an error was reported using ctx...
*/

>  
> 
> === modified file 'sql/backup/stream.cc'
> --- a/sql/backup/stream.cc	2008-08-08 17:21:31 +0000
> +++ b/sql/backup/stream.cc	2008-09-01 13:02:20 +0000
> @@ -349,18 +349,26 @@ int Stream::prepare_path(::String *backu
>  
>  bool Stream::open()
>  {
> -  close();
> +  if (!close())
> +  {
> +    return false;
> +  }

Perhaps we should ignore error here. If we fail to close the stream, lets try to 
open it anyway (maybe open operation will do implicit close for us).

>    m_fd= my_open(m_path.c_ptr(), m_flags, MYF(0));
>    return m_fd >= 0;
>  }
>  
> -void Stream::close()
> +bool Stream::close()
>  {
> +  bool ret= true;
>    if (m_fd >= 0)
>    {
> -    my_close(m_fd, MYF(0));
> +    if (my_close(m_fd, MYF(0)))
> +    {
> +      ret= false;
> +    }
>      m_fd= -1;
>    }
> +  return ret;
>  }
>  
>  bool Stream::rewind()
> @@ -453,7 +461,10 @@ bool Output_stream::init()
>  bool Output_stream::open()
>  {
>    MY_STAT stat_info;
> -  close();
> +  if (!close())
> +  {
> +    return false;
> +  }

Perhaps we should ignore open error as above.

>  
>    /* Allow to write to existing named pipe */
>    if (my_stat(m_path.c_ptr(), &stat_info, MYF(0)) &&
> @@ -504,12 +515,17 @@ bool Output_stream::open()
>  
>    If @c destroy is TRUE, the stream object is deleted.
>  */
> -void Output_stream::close()
> +bool Output_stream::close()
>  {
> +  bool ret= true;
>    if (m_fd < 0)
> -    return;
> +    return true;
> +
> +  if (bstream_close(this) == BSTREAM_ERROR)
> +  {
> +    ret= false;
> +  }
>  
> -  bstream_close(this);
>  #ifdef HAVE_COMPRESS
>    if (m_with_compression)
>    {
> @@ -538,7 +554,8 @@ void Output_stream::close()
>      my_free(zbuf, MYF(0));
>    }
>  #endif
> -  Stream::close();
> +  ret &= Stream::close();
> +  return ret;
>  }
>  
>  /**
> @@ -634,7 +651,10 @@ bool Input_stream::init()
>  */
>  bool Input_stream::open()
>  {
> -  close();
> +  if (!close())
> +  {
> +    return false;
> +  }
>  
>    bool ret= Stream::open();
>  
> @@ -685,12 +705,17 @@ bool Input_stream::open()
>  
>    If @c destroy is TRUE, the stream object is deleted.

Please document return values (TRUE = success).

>  */
> -void Input_stream::close()
> +bool Input_stream::close()
>  {
> +  bool ret= true;
>    if (m_fd < 0)
> -    return;
> +    return true;

Not 100% sure but perhaps we should use TRUE and FALSE to make code more portable.

> +
> +  if (bstream_close(this) == BSTREAM_ERROR)
> +  {
> +    ret= false;
> +  }
>  
> -  bstream_close(this);
>  #ifdef HAVE_COMPRESS
>    if (m_with_compression)
>    {
> @@ -700,7 +725,8 @@ void Input_stream::close()
>      my_free(zbuf, (MYF(0)));
>    }
>  #endif
> -  Stream::close();
> +  ret &= Stream::close();
> +  return ret;
>  }
>  
>  /**
> 
> === modified file 'sql/backup/stream.h'
> --- a/sql/backup/stream.h	2008-08-08 17:21:31 +0000
> +++ b/sql/backup/stream.h	2008-09-01 13:02:20 +0000
> @@ -80,7 +80,7 @@ class Stream: public fd_stream
>   public:
>  
>    bool open();
> -  virtual void close();
> +  virtual bool close();
>    bool rewind();
>  
>    /// Check if stream is opened
> @@ -121,7 +121,7 @@ class Output_stream:
>    Output_stream(Logger&, ::String *, LEX_STRING, bool);
>  
>    bool open();
> -  void close();
> +  bool close();
>    bool rewind();
>  
>   private:
> @@ -139,7 +139,7 @@ class Input_stream:
>    Input_stream(Logger&, ::String *, LEX_STRING);
>  
>    bool open();
> -  void close();
> +  bool close();
>    bool rewind();
>  
>    int next_chunk();
> 
> === modified file 'sql/mdl.cc'
> --- a/sql/mdl.cc	2008-07-07 15:51:20 +0000
> +++ b/sql/mdl.cc	2008-09-01 13:02:20 +0000
> @@ -294,7 +294,7 @@ void mdl_init_lock(MDL_LOCK_DATA *lock_d
>  
>     @note The allocated lock request will have MDL_SHARED type.
>  
> -   @retval 0      Error
> +   @retval 0      Error if out of memory
>     @retval non-0  Pointer to an object representing a lock request
>  */
>  
> 
> === modified file 'sql/share/errmsg.txt'
> --- a/sql/share/errmsg.txt	2008-08-27 17:30:49 +0000
> +++ b/sql/share/errmsg.txt	2008-09-01 13:02:20 +0000
> @@ -6388,7 +6388,20 @@ ER_BACKUP_GRANT_SKIPPED
>          eng "The grant '%-.64s' was skipped because the user does not exist."
>  ER_BACKUP_GRANT_WRONG_DB
>          eng "The grant '%-.64s' failed. Database not included in the backup image."
> +ER_BACKUP_SEND_REPLY
> +        eng "Failed to send reply to client after successful %-.64s operation"
> +        nor "Mislykket sending av svar til klient etter %-.64s"
> +        norwegian-ny "Sendinga av svar til klienten etter %-.64s"
> +ER_BACKUP_CLOSE
> +        eng "Backup/Restore: Error on close of backup stream"
> +        nor "Backup/Restore: Feil ved lukning av backup-strøm"
> +        norwegian-ny "Backup/Restore:  Feil ved lukking av backup-straum"
> +ER_BACKUP_BINLOG
> +        eng "Error on accessing binlog during %-.64s"
> +        nor "Lesing av binlog feilet under %-.64s"
> +        norwegian-ny "Lesing av binlog feila under %-.64s"

Better put these at the end of the file.

>  ER_BACKUP_LOGPATHS
>    eng "The log names for backup_history and backup_progress must be unique."
>  ER_BACKUP_LOGPATH_INVALID
>    eng "The path specified for the %-.64s is invalid. ref: %-.64s"
> +
> 
Thread
bzr commit into mysql-6.0-backup branch (oystein.grovlen:2688) Bug#39089WL#4384oystein.grovlen1 Sep
  • Re: bzr commit into mysql-6.0-backup branch (oystein.grovlen:2688)Bug#39089 WL#4384Rafal Somla2 Sep
    • Re: bzr commit into mysql-6.0-backup branch (oystein.grovlen:2688)Bug#39089 WL#4384Øystein Grøvlen4 Sep
      • Re: bzr commit into mysql-6.0-backup branch (oystein.grovlen:2688)Bug#39089 WL#4384Rafal Somla4 Sep
      • Re: bzr commit into mysql-6.0-backup branch (oystein.grovlen:2688)Bug#39089 WL#4384Rafal Somla4 Sep