List:Commits« Previous MessageNext Message »
From:Chuck Bell Date:May 14 2008 1:19am
Subject:RE: bk commit into 6.0 tree (rafal:1.2614) WL#4211
View as plain text  
Rafal,

I'm ok with your responses and changes. Patch approved.

Chuck 

> -----Original Message-----
> From: Rafal Somla [mailto:rsomla@stripped] 
> Sent: Tuesday, May 13, 2008 6:55 AM
> To: Chuck Bell
> Cc: commits@stripped
> Subject: Re: bk commit into 6.0 tree (rafal:1.2614) WL#4211
> 
> Hi Chuck,
> 
> Thanks for your review! See my replies below.
> 
> Rafal
> 
> Chuck Bell wrote:
> > Rafal,
> > 
> > I have some questions and suggestions: 
> > 
> >> diff -Nrup a/mysql-test/t/backup_objects.test
> >> b/mysql-test/t/backup_objects.test
> >> --- a/mysql-test/t/backup_objects.test	2008-02-13 
> >> 12:40:24 +01:00
> >> +++ b/mysql-test/t/backup_objects.test	2008-05-07 
> >> 15:07:25 +02:00
> >> @@ -1,12 +1,36 @@
> >>  #
> >> -# This test testing backup and restore for database objects.
> >> +# This test is testing backup and restore of database objects.
> >>  #
> >> +# The following objects are tested:
> >> +# - tables
> >> +# - views
> >> +# - stored procedures and functions
> >> +# - triggers
> >> +# - events
> >> +#
> >> +# It is also tested that backup and restore works correctly
> >> when there are
> >> +# dependencies between objects such as:
> >> +# - one table depends on another table via foreign key 
> constraint, # 
> >> +- a view depends on a table or another view, # - stored 
> routine or 
> >> +event refers to another object.
> >>  
> >>  --source include/have_innodb.inc
> >>  --source include/not_embedded.inc
> >>  
> >> -connect (con1,localhost,root,,);
> >> -connection con1;
> >> +# Note: BACKUP crashes server when InnoDB engine is used and
> >> views are
> >> +# included in the image (see BUG#34758). Until this issue is
> >> resolved we
> >> +# can't use InnoDB engine here and thus define table
> >> dependencies using
> >> +# foreign key constraints.
> >> +#
> >> +# If tables use MyISAM storage engine the test works but 
> all foreign 
> >> +# key constraints are silently ignored. When BUG#34758 is
> >> fixed, the line
> >> +# below should be changed to make InnoDB the default storage
> >> engine so that
> >> +# handling of dependencies introduced by foreign keys can
> >> also be tested.
> >> +#
> >> +# Note that there is a separate test backup_fkey testing 
> foreign key 
> >> +# constraints.
> >> +
> >> +SET storage_engine=MyISAM;
> > 
> > I understand the need to remove the InnoDB from the test, but I am 
> > surprised at the radical changes to the test. I would 
> request that the 
> > test not be stripped so drastically. Case in point, I would 
> ask that 
> > the parts that are removed are put back in put disabled (like 
> > commented out) and marked with the bug report. I think it is a big 
> > risk to take functionality out of a test with expectations 
> that it would be covered by another test or added later.
> > Better to leave it in and disable it as a constant reminder.
> > 
> > If you insist the radical changes are necessary, perhaps it 
> would help 
> > to explain some more as to why it was changed. But let's not remove 
> > anything from the test even if we have to disable it until 
> a bug report is fixed.
> > 
> > Now that I've thought more about it, my suggestion is as follows:
> > 
> > * return the backup objects test to its original form
> > * comment out the InnoDB and change to MyISAM with notes to 
> reenable 
> > it when the bug is fixed (noted in the bug report perhaps?).
> 
> This is already done - see above the long comment explaining 
> why MyISAM is used now and that it should be replaced by 
> InnoDB when bugs are fixed.
> 
> > * create a new test for checking dependencies
> > 
> 
> It would be easier to create a separate test but I decided 
> otherwise based on the following reasoning.
> 
>   The goal of backup_objects test is to check that our system 
> correctly backups
>   and restores all supported kinds of objects. Checking that 
> dependencies between
>   objects are correctly taken into account is logically part 
> of this task. A
>   separate test for testing dependencies would be in fact 
> very similar to
>   backup_objects test. Thus, instead of multiplying the 
> number of tests it makes
>   sense to add object dependency checking to backup_objects test.
> 
> Doing this that way required changing the test but I disagree 
> that changes are radical. It is still the same test which 
> creates coulple of objects of different kinds, backups them 
> and then restores. What is changed is the following:
> 
> - Two databases 'db1' and 'db2' used instead of single 
> 'bup_objects'. This is needed to create inter-database dependencies.
> 
> - Functions and procedures are created first in case view or 
> table definitions refer to them (currently only views).
> 
> - View definitions are more complex, so that we have views 
> which depend on other
>   views.
> 
> - Different data is loaded into tables to ensure that views 
> are not empty.
> 
> - The "Exercise the objects" section has been removed because 
> first, it has no influence on BACKUP/RESTORE operation and 
> second, it is hard to define objects with complex 
> dependencies and usable at the same time.
> 
> - To shorten the test, SHOW CREATE statements are executed 
> only after restore. 
> In case of wrong restore these will be compared to the ones 
> stored in the result file.
> 
> - Instead of selecting from I_S tables, sql statements such 
> as SHOW EVENTS and SHOW PROCEDURE STATUS are used to see if 
> events or procedures were restored.
> 
> I don't think these changes are drastic - I claim that after 
> the updates the test is a superset of the previous test. 
> Perhaps your impression was caused by the fact that object 
> definitions are moved around (stored routines are defined at 
> the beginning) which looks in the patch like a lot of lines 
> have been deleted. But note that equivalent lines are inserted later!
> 
> 
> >>  --remove_file $MYSQLTEST_VARDIR/master-data/bup_objects.bak
> > 
> > Remove file? I don't see this file in my repository. Is 
> this needed or 
> > is this a mistake?
> >
> 
> This line comes from the original test. One line above there is
> 
> --error 0,1
> 
> which means that --remove_file is here just in case the 
> bup_objects.bak is left over from some other test (safety).
> 
> >> diff -Nrup a/sql/backup/backup_info.cc b/sql/backup/backup_info.cc
> >> --- a/sql/backup/backup_info.cc	2008-04-24 11:18:47 +02:00
> >> +++ b/sql/backup/backup_info.cc	2008-05-07 15:07:25 +02:00
> >> @@ -223,6 +223,68 @@ uchar* Backup_info::Ts_hash_node::get_ke
> >>    return (uchar*)(n->name->ptr());
> >>  }
> >>  
> >> +/**
> >> +  Represents a node in the dependency list.
> >> +  
> >> +  Such node can be an empty placeholder or store a pointer
> >> to a catalogue item
> >> +  in @c obj member. Dependency list nodes are kept in a hash
> >> and thus
> >> +  @c Dep_node contains all required infrastructure: the @c
> >> key member to store
> >> +  a key string plus @c get_key() and @c free() functions
> >> used in @c HASH
> >> +  operations.
> >> + */
> >> +struct Backup_info::Dep_node: public Sql_alloc {
> >> +  Dep_node *next;
> >> +  Dbobj *obj;
> >> +  String key;
> >> +
> >> +  Dep_node(const ::String &db_name, const ::String &name);  
> >> + Dep_node(const Dep_node&);
> >> +
> >> +  static uchar* get_key(const uchar *record, size_t
> >> *key_length, my_bool);
> >> +  static void free(void *record);
> >> +};
> >> +
> >> +/**
> >> +  Create an empty dependency list node for a given
> >> per-database object.
> >> +
> >> +  The object is identified by its name and the name of the
> >> database to which
> >> +  it belongs. 
> >> + */
> > 
> > Questions: What about global objects? Let's suppose we 
> discover some 
> > global or at least non-database level dependencies. Having 
> a db_name 
> > as part of the key is not going to work. Would you add a 
> new subclass 
> > to handle these or should this class extension be designed 
> to rely on a non-semantic key?
> >
> 
> As you know we don't backup global objects right now, with 
> exception of table spaces and databases. The design described 
> in WL contains a full list of objects that we handle and 
> explains that for most of them dependencies are handled by 
> restoring them in a correct order. E.g., first global objects 
> are restored and only then per-database objects.
> 
> Because of that, the task reduces to handling per-database 
> object dependencies only. Here also objects are restored in 
> correct order and the only non-trivial issue to resolve is 
> handling of view-view dependencies.
> 
> The order in which global object metadata is stored in the 
> image, and thus the order in which these objects are 
> restored, is determined by Backup_info::Global_iterator. This 
> iterator first lists tablespaces and then databases, as 
> specified in the worklog design.
> 
> >> +inline
> >> +Backup_info::Dep_node::Dep_node(const ::String &db_name,
> >> const ::String &name)
> >> +  :next(NULL), obj(NULL)
> >> +{
> >> +  key.length(0);
> >> +  key.append(db_name);
> >> +  key.append(".");
> >> +  key.append(name);
> >> +}
> >> +
> >> +inline
> >> +Backup_info::Dep_node::Dep_node(const Dep_node &n)  
> :Sql_alloc(n), 
> >> +next(n.next), obj(n.obj) {
> >> +  key.copy(n.key);
> >> +}
> >> +
> >> +inline
> >> +void Backup_info::Dep_node::free(void *record) {
> >> +  ((Dep_node*)record)->~Dep_node();
> >> +}
> >> +
> >> +inline
> >> +uchar*
> >> +Backup_info::Dep_node::get_key(const uchar *record,
> >> +                              size_t *key_length,
> >> +                              my_bool) // not_used
> >> __attribute__((unused)))
> >> +{
> >> +  Dep_node *n= (Dep_node*)record;
> >> +  if (key_length)
> >> +    *key_length= n->key.length();
> >> +  return (uchar*)n->key.ptr();
> >> +}
> >>  
> > 
> > What about case sensitivity issues? Will the code handle the case 
> > where on Windows 'testdb1' and 'TESTDB1' are the same? 
> Likewise, what 
> > about Linux where they are different? This code needs to 
> handle case sensitivity issues.
> > 
> 
> Good point, not easy to spot from non-Win perspective! I 
> think this issue should be taken into account in the object 
> services code. Recall that if, for example, 
>     we want to look at server's databases, then we first 
> create an iterator with obs::get_databases(). Then we get 
> obs::Obj instances from this iterator and can read database 
> name using obs::Obj::get_name(). It is up to object services 
> implementation to decide what exactly get_name() returns, but 
> it should be a "canonical" name of the object by which it can 
> be identified inside the server. 
> On systems where several forms of the same name can exists, 
> the implementation should pick a canonical form (for example 
> all uppercase).
> 
> Going back to the code, Dep_node structures are used in 
> dep_hash to store entries for objects returned by object 
> services API functions. When object obj is stored there, 
> obj->get_db_name() and obj->get_name() are used to initialize 
> the hash entry and thus form the key.
> 
> Thus everything should be fine under assumption that if obj1 
> and obj2 represent the same object then obj1->get_db_name() 
> and obj1->get_name() are the same as 
> obj2->get_db_name() and obj2->get_name(). I think it is a valid 
> obj2->assumption to
> make. If this assumption is violated, as when 
> obj1->get_name() returns "testdb" 
> and obj2->get_name() returns "TESTDB", then it should be 
> reported as a bug in si_objects, but I don't think it is the case.
> 
> >>  
> >>  /**
> >> @@ -233,7 +295,9 @@ uchar* Backup_info::Ts_hash_node::get_ke
> >>    @c find_backup_engine().
> >>   */
> >>  Backup_info::Backup_info(Backup_restore_ctx &ctx)
> >> -  :m_ctx(ctx), m_state(Backup_info::ERROR), native_snapshots(8)
> >> +  :m_ctx(ctx), m_state(Backup_info::ERROR), native_snapshots(8),
> >> +   m_dep_list(NULL), m_dep_end(NULL), 
> >> +   m_sr_end(NULL), m_v_end(NULL), m_tr_end(NULL)
> >>  {
> >>    using namespace backup;
> >>  
> >> @@ -243,6 +307,8 @@ Backup_info::Backup_info(Backup_restore_
> >>  
> >>    hash_init(&ts_hash, &::my_charset_bin, 16, 0, 0,
> >>              Ts_hash_node::get_key, Ts_hash_node::free, MYF(0));
> >> +  hash_init(&dep_hash, &::my_charset_bin, 16, 0, 0,
> >> +            Dep_node::get_key, Dep_node::free, MYF(0));
> >>  
> >>    /* 
> >>      Create default and CS snapshot objects and add them to the 
> >> snapshots list.
> >> @@ -263,7 +329,7 @@ Backup_info::Backup_info(Backup_restore_
> >>      return;
> >>  
> >>    snapshots.push_back(snap);
> >> -
> >> +
> >>    m_state= CREATED;
> >>  }
> >>  
> >> @@ -282,6 +348,7 @@ Backup_info::~Backup_info()
> >>      delete snap;
> >>  
> >>    hash_free(&ts_hash);
> >> +  hash_free(&dep_hash);
> >>  }
> >>  
> >>  /**
> >> @@ -770,6 +837,112 @@ backup::Image_info::Table* Backup_info::
> >>  }
> >>  
> >>  /**
> >> +  For all views on which the given one depends directly or
> >> indirectly, add
> >> +  placeholders to the dependency list ensuring correct order
> >> among them.
> >> +
> >> +  The main view being processed in this function should be
> >> added to the
> >> +  dependency list after all the placeholders created here. 
> >> This way if one of
> >> +  the views on which the given one depends is added to the
> >> list later, it will
> >> +  be placed at the correct position indicated by the placeholder.
> >> +
> >> +  @param[in]  obj   server object for the view to be processed
> >> +
> >> +  A recursive algorithm is used to insert placeholders in
> >> correct order.
> >> +  That is, for each base view @c bv of the given one, @c
> >> add_view_deps(bv)
> >> +  is called to insert all dependencies of @c bv before @c bv
> >> itself is appended
> >> +  to the list.
> >> +  
> >> +  Function @c get_dep_node() used to create a node to be
> >> inserted into the
> >> +  dependency list detects if the node was already created
> >> earlier. This ensures
> >> +  correct behaviour of the algorithm even if the same view
> >> is visited several
> >> +  times during the depth-first walk of the dependency graph
> >> performed by the
> >> +  recursive algorithm. If it is detected that a node for a
> >> given view already
> >> +  exists then this view is not processed for the second time.  
> >> +
> >> +  This also ensures termination of the algorithm even 
> when there are  
> >> + circular dependencies. Suppose that view @c v has itself
> >> as an (indirect)
> >> +  dependency. When processing @c v, a node will be created
> >> for it first
> >> +  and then its dependencies will be processed. When
> >> add_view_deps() comes across
> >> +  @c v for the second time it will see that a corresponding
> >> mode already
> >> +  exists and thus will break the recursion.
> >> +
> >> +  @return Non-zero value if an error happened.
> >> + */
> > 
> > The above is very good documentation. I am impressed! :) Can we put 
> > some of this in the worklog so that it gets into the docs 
> internal or otherwise?
> > 
> 
> Thanks! :) I have added more details to de description of the 
> algorithm in the worklog. Now it reads:
> 
> Before inserting a view into the catalogue, slots will be 
> created for all the views on which it depends using the 
> following recursive procedure.
> 
> add_view_deps(v)=
>    for every base view v' of v do
>      add_view_deps(v')
>    append slot for v to the dependency list.
> 
> This ensures a correct order between slots for all the views 
> on which a given one depends.
> 
> Actually the algorithm is a bit more complicated to ensure 
> correct termination even when there are cyclic dependencies. 
> To that end, set A of processed views is maintained. If a 
> view is in this set then it is not processed again and 
> recursion terminates:
> 
> add_view_deps(v)=
>    if A contains v then return else add v to A.
>    for every base view v' of v do
>      add_view_deps(v')
>    append slot for v to the dependency list.
> 
> Thus even if view v has itself as an (indirect) dependency 
> the algorithm will terminate. When processing v for the first 
> time it will be added to A and then, when add_view_deps() 
> comes across v for the second time it will return immediately.
> 
> 
> > 
> > ...
> > 
> >> +/*
> >> +  Find or create a dependency list node for an object with a
> >> given name.
> >> +  
> >> +  @param[in]  db_name  name of the database to which this
> >> object belongs
> >> +  @param[in]  name     name of the object
> >> +  @param[out] node     pointer to the created or located node
> >> +  
> >> +  All nodes created using that function are keept inside @c
> >> dep_node HASH
> > 
> > What function? What does 'that' refer to? 
> > /s/keept/kept
> > 
> > 
> 
> The function being documented here. I think I should use 
> 'this' instead.
> 
> > ...
> > 
> >> +    Set end to point at m_sr_end, m_et_end or m_v_end
> >> depending on the
> > 
> > What does _et_ _sr_ and _v_ mean? Can you please spend a few 
> > characters and use something more descriptive like m_view_end? It 
> > would make the code easier to maintain and 'cost' nothing.
> > 
> 
> Ok, changed to m_srout_end, m_view_end and m_trigger_end.
> 
> >> +    type of the item.
> >> +    
> >> +    If the corresponding section is empty, the *end pointer
> >> is set up to point
> >> +    at the node after which this section should start, or to
> >> NULL if section
> >> +    should be at the beginning of the dependency list.
> >> +    
> >> +    After inserting the new node it becomes the last node in
> >> the section so
> >> +    the pointer is updated to point at it.
> >> +   */
> >> +
> > 
> > ...
> > 
> >> +  enum { TABLESPACES, DATABASES, DONE } mode;
> > 
> > What about users and configuration items? While they are 
> unlikely to 
> > have dependencies, what if we discover a global item/object 
> that does 
> > have dependencies? Would it be easy to add them later? If 
> not, please 
> > place some comments to indicate how we could add additional 
> global objects.
> >
> 
> 1. We don't backup any global objects except databases and 
> tablespaces.
> 2. Possible dependencies have been analysed during the design 
> phase and are taken into account in the design which is 
> approved and implemented here. Do you think that design is 
> not correct?
> 
> Rafal

Thread
bk commit into 6.0 tree (rafal:1.2614) WL#4211rsomla7 May
  • Re: bk commit into 6.0 tree (rafal:1.2614) WL#4211Jonas Oreland7 May
  • RE: bk commit into 6.0 tree (rafal:1.2614) WL#4211Chuck Bell13 May
    • Re: bk commit into 6.0 tree (rafal:1.2614) WL#4211Rafal Somla13 May
      • RE: bk commit into 6.0 tree (rafal:1.2614) WL#4211Chuck Bell14 May