Rafal,
I'm ok with your responses and changes. Patch approved.
Chuck
> -----Original Message-----
> From: Rafal Somla [mailto:rsomla@stripped]
> Sent: Tuesday, May 13, 2008 6:55 AM
> To: Chuck Bell
> Cc: commits@stripped
> Subject: Re: bk commit into 6.0 tree (rafal:1.2614) WL#4211
>
> Hi Chuck,
>
> Thanks for your review! See my replies below.
>
> Rafal
>
> Chuck Bell wrote:
> > Rafal,
> >
> > I have some questions and suggestions:
> >
> >> diff -Nrup a/mysql-test/t/backup_objects.test
> >> b/mysql-test/t/backup_objects.test
> >> --- a/mysql-test/t/backup_objects.test 2008-02-13
> >> 12:40:24 +01:00
> >> +++ b/mysql-test/t/backup_objects.test 2008-05-07
> >> 15:07:25 +02:00
> >> @@ -1,12 +1,36 @@
> >> #
> >> -# This test testing backup and restore for database objects.
> >> +# This test is testing backup and restore of database objects.
> >> #
> >> +# The following objects are tested:
> >> +# - tables
> >> +# - views
> >> +# - stored procedures and functions
> >> +# - triggers
> >> +# - events
> >> +#
> >> +# It is also tested that backup and restore works correctly
> >> when there are
> >> +# dependencies between objects such as:
> >> +# - one table depends on another table via foreign key
> constraint, #
> >> +- a view depends on a table or another view, # - stored
> routine or
> >> +event refers to another object.
> >>
> >> --source include/have_innodb.inc
> >> --source include/not_embedded.inc
> >>
> >> -connect (con1,localhost,root,,);
> >> -connection con1;
> >> +# Note: BACKUP crashes server when InnoDB engine is used and
> >> views are
> >> +# included in the image (see BUG#34758). Until this issue is
> >> resolved we
> >> +# can't use InnoDB engine here and thus define table
> >> dependencies using
> >> +# foreign key constraints.
> >> +#
> >> +# If tables use MyISAM storage engine the test works but
> all foreign
> >> +# key constraints are silently ignored. When BUG#34758 is
> >> fixed, the line
> >> +# below should be changed to make InnoDB the default storage
> >> engine so that
> >> +# handling of dependencies introduced by foreign keys can
> >> also be tested.
> >> +#
> >> +# Note that there is a separate test backup_fkey testing
> foreign key
> >> +# constraints.
> >> +
> >> +SET storage_engine=MyISAM;
> >
> > I understand the need to remove the InnoDB from the test, but I am
> > surprised at the radical changes to the test. I would
> request that the
> > test not be stripped so drastically. Case in point, I would
> ask that
> > the parts that are removed are put back in put disabled (like
> > commented out) and marked with the bug report. I think it is a big
> > risk to take functionality out of a test with expectations
> that it would be covered by another test or added later.
> > Better to leave it in and disable it as a constant reminder.
> >
> > If you insist the radical changes are necessary, perhaps it
> would help
> > to explain some more as to why it was changed. But let's not remove
> > anything from the test even if we have to disable it until
> a bug report is fixed.
> >
> > Now that I've thought more about it, my suggestion is as follows:
> >
> > * return the backup objects test to its original form
> > * comment out the InnoDB and change to MyISAM with notes to
> reenable
> > it when the bug is fixed (noted in the bug report perhaps?).
>
> This is already done - see above the long comment explaining
> why MyISAM is used now and that it should be replaced by
> InnoDB when bugs are fixed.
>
> > * create a new test for checking dependencies
> >
>
> It would be easier to create a separate test but I decided
> otherwise based on the following reasoning.
>
> The goal of backup_objects test is to check that our system
> correctly backups
> and restores all supported kinds of objects. Checking that
> dependencies between
> objects are correctly taken into account is logically part
> of this task. A
> separate test for testing dependencies would be in fact
> very similar to
> backup_objects test. Thus, instead of multiplying the
> number of tests it makes
> sense to add object dependency checking to backup_objects test.
>
> Doing this that way required changing the test but I disagree
> that changes are radical. It is still the same test which
> creates coulple of objects of different kinds, backups them
> and then restores. What is changed is the following:
>
> - Two databases 'db1' and 'db2' used instead of single
> 'bup_objects'. This is needed to create inter-database dependencies.
>
> - Functions and procedures are created first in case view or
> table definitions refer to them (currently only views).
>
> - View definitions are more complex, so that we have views
> which depend on other
> views.
>
> - Different data is loaded into tables to ensure that views
> are not empty.
>
> - The "Exercise the objects" section has been removed because
> first, it has no influence on BACKUP/RESTORE operation and
> second, it is hard to define objects with complex
> dependencies and usable at the same time.
>
> - To shorten the test, SHOW CREATE statements are executed
> only after restore.
> In case of wrong restore these will be compared to the ones
> stored in the result file.
>
> - Instead of selecting from I_S tables, sql statements such
> as SHOW EVENTS and SHOW PROCEDURE STATUS are used to see if
> events or procedures were restored.
>
> I don't think these changes are drastic - I claim that after
> the updates the test is a superset of the previous test.
> Perhaps your impression was caused by the fact that object
> definitions are moved around (stored routines are defined at
> the beginning) which looks in the patch like a lot of lines
> have been deleted. But note that equivalent lines are inserted later!
>
>
> >> --remove_file $MYSQLTEST_VARDIR/master-data/bup_objects.bak
> >
> > Remove file? I don't see this file in my repository. Is
> this needed or
> > is this a mistake?
> >
>
> This line comes from the original test. One line above there is
>
> --error 0,1
>
> which means that --remove_file is here just in case the
> bup_objects.bak is left over from some other test (safety).
>
> >> diff -Nrup a/sql/backup/backup_info.cc b/sql/backup/backup_info.cc
> >> --- a/sql/backup/backup_info.cc 2008-04-24 11:18:47 +02:00
> >> +++ b/sql/backup/backup_info.cc 2008-05-07 15:07:25 +02:00
> >> @@ -223,6 +223,68 @@ uchar* Backup_info::Ts_hash_node::get_ke
> >> return (uchar*)(n->name->ptr());
> >> }
> >>
> >> +/**
> >> + Represents a node in the dependency list.
> >> +
> >> + Such node can be an empty placeholder or store a pointer
> >> to a catalogue item
> >> + in @c obj member. Dependency list nodes are kept in a hash
> >> and thus
> >> + @c Dep_node contains all required infrastructure: the @c
> >> key member to store
> >> + a key string plus @c get_key() and @c free() functions
> >> used in @c HASH
> >> + operations.
> >> + */
> >> +struct Backup_info::Dep_node: public Sql_alloc {
> >> + Dep_node *next;
> >> + Dbobj *obj;
> >> + String key;
> >> +
> >> + Dep_node(const ::String &db_name, const ::String &name);
> >> + Dep_node(const Dep_node&);
> >> +
> >> + static uchar* get_key(const uchar *record, size_t
> >> *key_length, my_bool);
> >> + static void free(void *record);
> >> +};
> >> +
> >> +/**
> >> + Create an empty dependency list node for a given
> >> per-database object.
> >> +
> >> + The object is identified by its name and the name of the
> >> database to which
> >> + it belongs.
> >> + */
> >
> > Questions: What about global objects? Let's suppose we
> discover some
> > global or at least non-database level dependencies. Having
> a db_name
> > as part of the key is not going to work. Would you add a
> new subclass
> > to handle these or should this class extension be designed
> to rely on a non-semantic key?
> >
>
> As you know we don't backup global objects right now, with
> exception of table spaces and databases. The design described
> in WL contains a full list of objects that we handle and
> explains that for most of them dependencies are handled by
> restoring them in a correct order. E.g., first global objects
> are restored and only then per-database objects.
>
> Because of that, the task reduces to handling per-database
> object dependencies only. Here also objects are restored in
> correct order and the only non-trivial issue to resolve is
> handling of view-view dependencies.
>
> The order in which global object metadata is stored in the
> image, and thus the order in which these objects are
> restored, is determined by Backup_info::Global_iterator. This
> iterator first lists tablespaces and then databases, as
> specified in the worklog design.
>
> >> +inline
> >> +Backup_info::Dep_node::Dep_node(const ::String &db_name,
> >> const ::String &name)
> >> + :next(NULL), obj(NULL)
> >> +{
> >> + key.length(0);
> >> + key.append(db_name);
> >> + key.append(".");
> >> + key.append(name);
> >> +}
> >> +
> >> +inline
> >> +Backup_info::Dep_node::Dep_node(const Dep_node &n)
> :Sql_alloc(n),
> >> +next(n.next), obj(n.obj) {
> >> + key.copy(n.key);
> >> +}
> >> +
> >> +inline
> >> +void Backup_info::Dep_node::free(void *record) {
> >> + ((Dep_node*)record)->~Dep_node();
> >> +}
> >> +
> >> +inline
> >> +uchar*
> >> +Backup_info::Dep_node::get_key(const uchar *record,
> >> + size_t *key_length,
> >> + my_bool) // not_used
> >> __attribute__((unused)))
> >> +{
> >> + Dep_node *n= (Dep_node*)record;
> >> + if (key_length)
> >> + *key_length= n->key.length();
> >> + return (uchar*)n->key.ptr();
> >> +}
> >>
> >
> > What about case sensitivity issues? Will the code handle the case
> > where on Windows 'testdb1' and 'TESTDB1' are the same?
> Likewise, what
> > about Linux where they are different? This code needs to
> handle case sensitivity issues.
> >
>
> Good point, not easy to spot from non-Win perspective! I
> think this issue should be taken into account in the object
> services code. Recall that if, for example,
> we want to look at server's databases, then we first
> create an iterator with obs::get_databases(). Then we get
> obs::Obj instances from this iterator and can read database
> name using obs::Obj::get_name(). It is up to object services
> implementation to decide what exactly get_name() returns, but
> it should be a "canonical" name of the object by which it can
> be identified inside the server.
> On systems where several forms of the same name can exists,
> the implementation should pick a canonical form (for example
> all uppercase).
>
> Going back to the code, Dep_node structures are used in
> dep_hash to store entries for objects returned by object
> services API functions. When object obj is stored there,
> obj->get_db_name() and obj->get_name() are used to initialize
> the hash entry and thus form the key.
>
> Thus everything should be fine under assumption that if obj1
> and obj2 represent the same object then obj1->get_db_name()
> and obj1->get_name() are the same as
> obj2->get_db_name() and obj2->get_name(). I think it is a valid
> obj2->assumption to
> make. If this assumption is violated, as when
> obj1->get_name() returns "testdb"
> and obj2->get_name() returns "TESTDB", then it should be
> reported as a bug in si_objects, but I don't think it is the case.
>
> >>
> >> /**
> >> @@ -233,7 +295,9 @@ uchar* Backup_info::Ts_hash_node::get_ke
> >> @c find_backup_engine().
> >> */
> >> Backup_info::Backup_info(Backup_restore_ctx &ctx)
> >> - :m_ctx(ctx), m_state(Backup_info::ERROR), native_snapshots(8)
> >> + :m_ctx(ctx), m_state(Backup_info::ERROR), native_snapshots(8),
> >> + m_dep_list(NULL), m_dep_end(NULL),
> >> + m_sr_end(NULL), m_v_end(NULL), m_tr_end(NULL)
> >> {
> >> using namespace backup;
> >>
> >> @@ -243,6 +307,8 @@ Backup_info::Backup_info(Backup_restore_
> >>
> >> hash_init(&ts_hash, &::my_charset_bin, 16, 0, 0,
> >> Ts_hash_node::get_key, Ts_hash_node::free, MYF(0));
> >> + hash_init(&dep_hash, &::my_charset_bin, 16, 0, 0,
> >> + Dep_node::get_key, Dep_node::free, MYF(0));
> >>
> >> /*
> >> Create default and CS snapshot objects and add them to the
> >> snapshots list.
> >> @@ -263,7 +329,7 @@ Backup_info::Backup_info(Backup_restore_
> >> return;
> >>
> >> snapshots.push_back(snap);
> >> -
> >> +
> >> m_state= CREATED;
> >> }
> >>
> >> @@ -282,6 +348,7 @@ Backup_info::~Backup_info()
> >> delete snap;
> >>
> >> hash_free(&ts_hash);
> >> + hash_free(&dep_hash);
> >> }
> >>
> >> /**
> >> @@ -770,6 +837,112 @@ backup::Image_info::Table* Backup_info::
> >> }
> >>
> >> /**
> >> + For all views on which the given one depends directly or
> >> indirectly, add
> >> + placeholders to the dependency list ensuring correct order
> >> among them.
> >> +
> >> + The main view being processed in this function should be
> >> added to the
> >> + dependency list after all the placeholders created here.
> >> This way if one of
> >> + the views on which the given one depends is added to the
> >> list later, it will
> >> + be placed at the correct position indicated by the placeholder.
> >> +
> >> + @param[in] obj server object for the view to be processed
> >> +
> >> + A recursive algorithm is used to insert placeholders in
> >> correct order.
> >> + That is, for each base view @c bv of the given one, @c
> >> add_view_deps(bv)
> >> + is called to insert all dependencies of @c bv before @c bv
> >> itself is appended
> >> + to the list.
> >> +
> >> + Function @c get_dep_node() used to create a node to be
> >> inserted into the
> >> + dependency list detects if the node was already created
> >> earlier. This ensures
> >> + correct behaviour of the algorithm even if the same view
> >> is visited several
> >> + times during the depth-first walk of the dependency graph
> >> performed by the
> >> + recursive algorithm. If it is detected that a node for a
> >> given view already
> >> + exists then this view is not processed for the second time.
> >> +
> >> + This also ensures termination of the algorithm even
> when there are
> >> + circular dependencies. Suppose that view @c v has itself
> >> as an (indirect)
> >> + dependency. When processing @c v, a node will be created
> >> for it first
> >> + and then its dependencies will be processed. When
> >> add_view_deps() comes across
> >> + @c v for the second time it will see that a corresponding
> >> mode already
> >> + exists and thus will break the recursion.
> >> +
> >> + @return Non-zero value if an error happened.
> >> + */
> >
> > The above is very good documentation. I am impressed! :) Can we put
> > some of this in the worklog so that it gets into the docs
> internal or otherwise?
> >
>
> Thanks! :) I have added more details to de description of the
> algorithm in the worklog. Now it reads:
>
> Before inserting a view into the catalogue, slots will be
> created for all the views on which it depends using the
> following recursive procedure.
>
> add_view_deps(v)=
> for every base view v' of v do
> add_view_deps(v')
> append slot for v to the dependency list.
>
> This ensures a correct order between slots for all the views
> on which a given one depends.
>
> Actually the algorithm is a bit more complicated to ensure
> correct termination even when there are cyclic dependencies.
> To that end, set A of processed views is maintained. If a
> view is in this set then it is not processed again and
> recursion terminates:
>
> add_view_deps(v)=
> if A contains v then return else add v to A.
> for every base view v' of v do
> add_view_deps(v')
> append slot for v to the dependency list.
>
> Thus even if view v has itself as an (indirect) dependency
> the algorithm will terminate. When processing v for the first
> time it will be added to A and then, when add_view_deps()
> comes across v for the second time it will return immediately.
>
>
> >
> > ...
> >
> >> +/*
> >> + Find or create a dependency list node for an object with a
> >> given name.
> >> +
> >> + @param[in] db_name name of the database to which this
> >> object belongs
> >> + @param[in] name name of the object
> >> + @param[out] node pointer to the created or located node
> >> +
> >> + All nodes created using that function are keept inside @c
> >> dep_node HASH
> >
> > What function? What does 'that' refer to?
> > /s/keept/kept
> >
> >
>
> The function being documented here. I think I should use
> 'this' instead.
>
> > ...
> >
> >> + Set end to point at m_sr_end, m_et_end or m_v_end
> >> depending on the
> >
> > What does _et_ _sr_ and _v_ mean? Can you please spend a few
> > characters and use something more descriptive like m_view_end? It
> > would make the code easier to maintain and 'cost' nothing.
> >
>
> Ok, changed to m_srout_end, m_view_end and m_trigger_end.
>
> >> + type of the item.
> >> +
> >> + If the corresponding section is empty, the *end pointer
> >> is set up to point
> >> + at the node after which this section should start, or to
> >> NULL if section
> >> + should be at the beginning of the dependency list.
> >> +
> >> + After inserting the new node it becomes the last node in
> >> the section so
> >> + the pointer is updated to point at it.
> >> + */
> >> +
> >
> > ...
> >
> >> + enum { TABLESPACES, DATABASES, DONE } mode;
> >
> > What about users and configuration items? While they are
> unlikely to
> > have dependencies, what if we discover a global item/object
> that does
> > have dependencies? Would it be easy to add them later? If
> not, please
> > place some comments to indicate how we could add additional
> global objects.
> >
>
> 1. We don't backup any global objects except databases and
> tablespaces.
> 2. Possible dependencies have been analysed during the design
> phase and are taken into account in the design which is
> approved and implemented here. Do you think that design is
> not correct?
>
> Rafal