List:Commits« Previous MessageNext Message »
From:Ingo Strüwing Date:December 9 2009 11:56am
Subject:Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2903)
Bug#43596
View as plain text  
Hi Rafal,

STATUS: Approved pending changes

REQUIRED
11) Fix function comment (recurring)

REQUESTS:
1) Better comment (recurring)
2) Typos (recurring)
3) Add backup_cs_names_lctn0
4) Add SHOW VARIABLES LIKE 'lower_case_table_names'
5) Consider Mac
7a+7b) Copy files to var/run. Copy manually to std_data.
8) Use NAME_LEN
10) Please put in #ifndef DBUG_OFF.

SUGGESTIONS:
6) Upper case 'bdir'
9) Pass 'obj'

QUESTIONS out of curiosity, can be answered after push
Q) (recurring)

DETAILS:
Rafal Somla, 04.12.2009 16:14:
...

>  2903 Rafal Somla	2009-12-04
>       Bug #43596 Restore fails on case ins. server for databases who's names differ 
>                  in case only.
>       
>       On a case sensitive system such as linux, it is possible to create two 
>       objects whose names differ in letter case only. This is true for databases, 
>       tables, views and triggers. When such objects are backed up and then 
>       restored on a case insensitive system, they are treated as a single object 
>       which is incorrect and leads to many problems.
>       
>       This patch modifies RESTORE so that it aborts operation if two objects with 
>       equivalent names are found in the backup image. Each time a new object is 
>       added to the catalogue (in bcat_add_item() function) a check is made for 
>       possible duplicates. This check is implemented in Image_info::add_X() 
>       methods.
>       
>       For the duplicates check, a new obs::Obj::is_my_name() method is used. It 
>       checks if given name refers to the given object. The semantics of this 
>       method depends on the system on which it is run (case sensitive vs. case 
>       insensitive) and server options setting (such as --lower-case-table-names).
>       
>       To use si_objects for handling object name issues, the code had to be 
>       rearranged considerably. Basically, obs::Obj instance is created early in 
>       the restore process, in bcat_add_item() function, when image catalogue is 
>       read. Then this obs::Obj instance is used to refer to the object instead of 
>       object's name.


1) This description confuses me. I thought, an object is an instance of
a class. But you distinguish between object and instance. If an obs::Obj
instance refers to the object, what is the object then?

Or do you want to say: "Then a *reference* to this obs::Obj instance is
used to identify the instance instead of its name"?

>                      A pointer to the instance is stored in each catalogue entry 
>       upon that entry creation (before, pointer was set later, when the 
>       serialization string of the object was available).
>       
>       Since we want to create obs::Obj instances before serialization data is 
>       available, the obs::get_X() functions are changed so that they can create 
>       instances when only object name is known. Later, these instances will be 
>       updated with the information from serialization string using new 
>       obs::Obj::init_from_image() method. After that, object can be created with 
>       obs::Obj::create() method as before.


1) Again confused. Do I guess correctly that "object" in "object an be
created" means the database object this time, not the obs::Obj instance?

...
>      @ sql/backup/image_info.h
>         - Add find_ts() function for finding table spaces in the catalogue.
>         - Change signatures of add_X() methods so that now they accept obs::Obj 
>           parameter.
>         - Remove materialize() method which is no longer needed - now obs::Obj 
>           instances are required when entries are created in the catalogue.


Q) I always had the impression that materialize() is the reverse
operation to serialize(). And that they refer to database object's meta
data. So the result of serialize is a "string", which describes the meta
data of a database object. And materialize() would take that string to
make suc database object "material" in a database. But if materialize()
is no longer needed, it must have been something else. What is the
counterpart of serialize() then?

>         - Image_info::Obj() constructor accepts obs::Obj parameter, because each 
>           catalogue entry must point at obs::Obj instance.
>         - Add initialization method Image_info::Obj::store_info() which fills the 
>           basic st_bstream_info members. Object's name is read from obs::Obj 
>           instance (and thus is normalized).
>         - Make Image_info::Ts inherit from Db_ref, so that code for constructing
>           object name can be reused.
>         - Add Image_info::Db::find_table/object() mehtods.


2) Typo: mehtods -> methods

>         - Re-use Db_ref::describe() method for Ts and Db classes - avoids code 
>           duplication.
>      @ sql/backup/kernel.cc
>         - For each object added to the catalogue, first an obs::Obj instance must be
>           created using obs::get_X() functions. If this fails, error is reported.
> Once 
>           we have obs::Obj instance, it can be added to the catalogue with add_X() 
>           method. After that the catalogue assumes ownership of that object and will
> 
>           delete it.
>         - When reading metadata, obs::Obj instances stored in catalogue entries are 
>           intialized with this data and then objects are created.


2) Typo: intialized -> initialized

...
>      @ sql/si_objects.h
>         - Add Obj::is_my_name() method.
>         - Add Obj::init_from_image() method.
>         - Change signature of get_X() functions - now they dont take serialization
>           string as a parameter.


2) Typo: dont -> don't

> 
>     added:
>       mysql-test/suite/backup/r/backup_cs_names.result
>       mysql-test/suite/backup/r/backup_cs_names_lctn1.result
>       mysql-test/suite/backup/r/backup_cs_names_lctn2.result
>       mysql-test/suite/backup/std_data/cs_names_d.bkp
>       mysql-test/suite/backup/std_data/cs_names_r.bkp
>       mysql-test/suite/backup/std_data/cs_names_t.bkp
>       mysql-test/suite/backup/std_data/cs_names_v.bkp
>       mysql-test/suite/backup/t/backup_cs_names-master.opt
>       mysql-test/suite/backup/t/backup_cs_names.test
>       mysql-test/suite/backup/t/backup_cs_names_lctn1-master.opt
>       mysql-test/suite/backup/t/backup_cs_names_lctn1.test
>       mysql-test/suite/backup/t/backup_cs_names_lctn2-master.opt
>       mysql-test/suite/backup/t/backup_cs_names_lctn2.test


3) So we do not have backup_cs_names_lctn0? I understand that name
conflicts should not happen there. But it would be nice to see that the
duplicates check does not fire on lctn0 too. Just in case that some
later patch breaks things.

...
> === added file 'mysql-test/suite/backup/r/backup_cs_names.result'
> --- a/mysql-test/suite/backup/r/backup_cs_names.result	1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/backup/r/backup_cs_names.result	2009-12-04 15:14:43 +0000
> @@ -0,0 +1,57 @@
> +CALL mtr.add_suppression("was disabled during restore");
> +#


4) This runs with lctn0, right? Won't it be nice to SHOW VARIABLES LIKE
'lower_case_table_names'?

...
> +# Creating cs_names_d.bkp image.
> +CREATE DATABASE dba;
> +CREATE TABLE  dba.t1 (b int) COMMENT='in other database';
> +BACKUP DATABASE dbA,dba TO 'cs_names_d.bkp';


Q) So the possible problem is that t1 could be duplicated if both
databases are the same on the target system?

...
> === added file 'mysql-test/suite/backup/t/backup_cs_names-master.opt'
> --- a/mysql-test/suite/backup/t/backup_cs_names-master.opt	1970-01-01 00:00:00 +0000
> +++ b/mysql-test/suite/backup/t/backup_cs_names-master.opt	2009-12-04 15:14:43 +0000
> @@ -0,0 +1 @@
> +--lower-case-table-names=0


5) Are you aware that this is not safe? Maybe, you ask for not_windows
at test begin. But what about Mac? Is it able to run with lctn0? Won't
it silently  change to lctn2?

I tested for Linux that it changes an attempt to set lctn2 to lctn0.
Silently. Just a warning appears in error log. Aha! You detect this by
using have_case_insensitive_file_system. Good. So just the Mac case
might be open. Perhaps have_case_sensitive_file_system might help in
this case?

> 
> === added file 'mysql-test/suite/backup/t/backup_cs_names.test'
...
> +let $IMGDIR= suite/backup/std_data;
> +let $bdir=`select @@backupdir`;


6) Any reason, to have one in upper case and one in lower?

...

> +#
> +# Add objects with letter '��' which is equivalent to 'A' according to case 
> +# insensitive collations. This is to see that it would not cause any name 
> +# clashes.
> +#
> +
> +CREATE TABLE     `t��` (b int) COMMENT='upper case accented name';
> +CREATE VIEW      `v��` AS SELECT 3;
> +CREATE TRIGGER   `r��` AFTER DELETE ON `t��` FOR EACH ROW SET @baz=3;
> +CREATE EVENT     `e��` ON SCHEDULE EVERY 1 YEAR DO SET @baz=3;
> +CREATE PROCEDURE `p��` (b int) SET @baz=3;
> +CREATE FUNCTION  `f��` (b int) RETURNS int RETURN b;


(unnumbered, informational comment) Funny. In the email, I see two
inverted question marks. A hex dump shows eight bytes, which don't seem
to be a valid UTF-8 sequence. However, after merging the bundle I found
character 'A' with two dots above it. A hex dump shows a two byte sequence.

> +
> +--echo #
> +--echo # Backup database without potential name clashes.
> +--echo #
> +BACKUP DATABASE dbA TO 'dba.bkp';
> +
> +--echo 
> +--echo # Creating cs_names_d.bkp image.
> +--error 0,1
> +--remove_file $IMGDIR/cs_names_d.bkp


7a) This removes a versioned file. If one of the next three statements
fails, it won't be replaced by a new file. Committing in such situation
can be difficult. Is it intentional to block any commit before this test
passed? If you move remove_file down to right before copy_file, the
probability of a vanished versioned file will decrease.

> +CREATE DATABASE dba;
> +CREATE TABLE  dba.t1 (b int) COMMENT='in other database';
> +BACKUP DATABASE dbA,dba TO 'cs_names_d.bkp';
> +--copy_file $bdir/cs_names_d.bkp $IMGDIR/cs_names_d.bkp


7b) Copying these files automatically with every test run may be
comfortable, but I fear bzr will find these as modified files. The next
commit attempt after each test will present them for commit.

Also, due to the unpredictable order of execution for the test cases,
the other test cases, which use the images, may run before this one.
They may either not have the images, or use old images. Thus, the
comfort may not take effect in all circumstances.

The approach, I took in the cross platform tests, was to copy the images
to mysql-test/var/run. From there one can copy them manually, whenever a
new file format comes up. (This directory has the advantage that it is
part of the downloadable pushbuild archives and not cleared at begin of
every test, but this isn't required for these tests, I think.)

> +
> +--echo 
> +--echo # Creating cs_names_t.bkp image.
> +--error 0,1
> +--remove_file $IMGDIR/cs_names_t.bkp
> +USE dbA;
> +CREATE TABLE ta (c char) COMMENT='lower case name';
> +BACKUP DATABASE dbA TO 'cs_names_t.bkp';
> +--copy_file $bdir/cs_names_t.bkp $IMGDIR/cs_names_t.bkp
> +
> +--echo 
> +--echo # Creating cs_names_v.bkp image.
> +--error 0,1
> +--remove_file $IMGDIR/cs_names_v.bkp
> +# Restore original database without name clashes.
> +RESTORE FROM 'dba.bkp' OVERWRITE;
> +USE dbA;


Q) I guess, the RESTORE step is here to get rid of the 'ta' table? Do
you think it's more efficient, or more safe than to DROP 'ta'? Or is it
another test for successful RESTORE, which you repeat below?

...
> +# Restore original database without name clashes.
> +RESTORE FROM 'dba.bkp' OVERWRITE;
> +USE dbA;
> +CREATE TRIGGER ra BEFORE DELETE ON tA FOR EACH ROW SET @bar=2;
> +BACKUP DATABASE dbA TO 'cs_names_r.bkp';
> +--copy_file $bdir/cs_names_r.bkp $IMGDIR/cs_names_r.bkp
> +
> +--echo 
> +--echo # Checking stored procedures and events.
> +--enable_result_log
> +USE dbA;


Q) BACKUP does not change the current database, right? So in theory this
should not be required here?

...
> === modified file 'sql/backup/api_types.h'
...
> +  /// Definition of a name buffer for the database name.
> +  typedef char name_buf[FN_REFLEN];


8) FN_REFLEN seems to be overkill for a database name. Looking for
constants used elsewhere, I found NAME_LEN. BTW at many places
check_and_convert_db_name() is used. Do we/can we use this in backup
code/si_objects code too?

...
> === modified file 'sql/backup/backup_info.cc'
...
> @@ -1379,6 +1370,45 @@ Backup_info::add_db_object(Db &db, const
>    }
>  
>    /*
> +    If this is a trigger, get the table for the trigger. If we fail here,
> +    we abort operation without inserting the object into the catalog.
> +  */
> +  if (type == BSTREAM_IT_TRIGGER)
> +  {
> +    THD *thd= ::current_thd;
> +    obs::Obj *tbl_obj= obs::find_table_for_trigger(thd, &db.name(), 
> +                                                        obj->get_name());


9) Now that we can be sure to have an obj reference in every catalog
item, can't we pass the database's obj instead of its name? Won't that
make a pointer comparison instead of string compare, when searching?
OTOH that optimization might be out of scope for this bug fix.

...
> @@ -1401,21 +1431,20 @@ Backup_info::add_db_object(Db &db, const
...
> +  const char *err;
>    switch (type) {
...
> +  case BSTREAM_IT_DB:        err= "ER_BACKUP_CATALOG_ADD_DB";      break;
> +  case BSTREAM_IT_VIEW:      err= "ER_BACKUP_CATALOG_ADD_VIEW";    break;
> +  case BSTREAM_IT_SPROC:     err= "ER_BACKUP_CATALOG_ADD_SROUT_P"; break;
> +  case BSTREAM_IT_SFUNC:     err= "ER_BACKUP_CATALOG_ADD_SROUT_F"; break;
> +  case BSTREAM_IT_EVENT:     err= "ER_BACKUP_CATALOG_ADD_EVENT";   break;
> +  case BSTREAM_IT_TRIGGER:   err= "ER_BACKUP_CATALOG_ADD_TRIGGER"; break;
> +  case BSTREAM_IT_PRIVILEGE: err= "ER_BACKUP_CATALOG_ADD_PRIV";    break;
>    default: break;
>    }
...
> +  if (err)
> +    DBUG_EXECUTE_IF(err, res= get_dep_node_res::ERROR;);
> +


10) I request to include the whole section in #ifndef DBUG_OFF. No need
to have the variable, the switch and all the strings in a production kernel.

...
> === modified file 'sql/backup/image_info.cc'
...
> -///  Check if catalogue contains given database.
> +/**
> +  Find in the catalog a database with the given name.
> +
> +  @return Pointer to the catalog entry for the database if found, 
> +  NULL otherwise.
>  
> -bool Image_info::has_db(const String &db_name) const
> +  @todo More efficient implementation, e.g. using hash.
> +*/


11) Please add @param db_name to the function comment.

> +
> +Image_info::Db* Image_info::find_db(const String &db_name) const
>  {
>    for (uint n=0; n < m_dbs.count() ; ++n)
> -    if (m_dbs[n] && m_dbs[n]->name() == db_name)
> -      return TRUE;
> +    if (m_dbs[n] && m_dbs[n]->m_obj_ptr->is_my_name(&db_name))
> +      return m_dbs[n];
> +
> +  return NULL;
> +}
> +
> +
> +/**
> +  Find in the catalog a tablespace with the given name.
> +
> +  @return Pointer to the catalog entry for the tablespace if found, 
> +  NULL otherwise.
> +
> +  @todo More efficient implementation, e.g. using hash.
> +*/


11) Please add @param ts_name to the function comment.

>  
> -  return FALSE;
> +Image_info::Ts* Image_info::find_ts(const String &ts_name) const


...
> === modified file 'sql/backup/image_info.h'
...
> @@ -1191,29 +1179,47 @@ const char* Image_info::Db::describe(des
>    about the table or NULL if the table is not found.
>  */
>  inline
> -Image_info::Table* Image_info::Db::find_table(const String &table_name)
> +Image_info::Table* 
> +Image_info::Db::find_table(const String &name) const
>  {
>    for (Table *tbl=first_table; tbl ; tbl= tbl->next_table)
>    {


Q) Why don't we use an iterator here?

...
> === modified file 'sql/backup/kernel.cc'
...
> @@ -1493,7 +1492,6 @@ int Backup_restore_ctx::restore_triggers
>  
>        case BSTREAM_IT_TRIGGER:
>        {
> -        DBUG_ASSERT(obj->m_obj_ptr);
>          int ret= obj->m_obj_ptr->create(m_thd);


Q) Why can we rely on this now, while we could not before? After all
these references have been set immediately after add_*.

...
> @@ -2417,19 +2476,16 @@ int bcat_create_item(st_bstream_image_he
...
> +  DBUG_ASSERT(obj->m_obj_ptr);
>  
> +  const ::String *name= obj->m_obj_ptr->get_name(); 


Q) Isn't it inconsistent to add an assert here, while removing it above?

...
> @@ -2439,42 +2495,30 @@ int bcat_create_item(st_bstream_image_he
...
> +  const char *err= NULL;
>    switch (item->type) {
...
> +  case BSTREAM_IT_DB:      err= "ER_BACKUP_CANT_RESTORE_DB";      break;
> +  case BSTREAM_IT_TABLE:   err= "ER_BACKUP_CANT_RESTORE_TABLE";   break;
> +  case BSTREAM_IT_VIEW:    err= "ER_BACKUP_CANT_RESTORE_VIEW";    break;
> +  case BSTREAM_IT_SPROC:   err= "ER_BACKUP_CANT_RESTORE_SROUT_P"; break;
> +  case BSTREAM_IT_SFUNC:   err= "ER_BACKUP_CANT_RESTORE_SROUT_F"; break;
> +  case BSTREAM_IT_EVENT:   err= "ER_BACKUP_CANT_RESTORE_EVENT_2"; break;
> +  case BSTREAM_IT_TRIGGER: err= "ER_BACKUP_CANT_RESTORE_TRIGGER"; break;
> +  case BSTREAM_IT_PRIVILEGE:  err= "ER_BACKUP_CANT_RESTORE_PRIV"; break;
>    default: break;
>    }
...
> +  if (err)
> +    DBUG_EXECUTE_IF(err, ret= ERROR;);


10) Again, please put in #ifndef DBUG_OFF.

...
> @@ -2714,6 +2764,17 @@ const char* Table_ref::internal_name(cha
>  
>  
>  /**
> +  Produce human readable string identifying the database
> +  (e.g. for error reporting).
> +*/


11) Please add @param and @return info.

...
> === modified file 'sql/si_objects.cc'
...
> @@ -664,30 +662,22 @@ private:
...
> +/**
> +  Create instance of an object with given name.
>  
> -inline Abstract_obj::Abstract_obj(LEX_STRING id)
> +  @param[in]  id       Name of the object.
> +  @param[in]  lc_name  Should id be converted to lower case.
> +*/ 


11) Please add @return.

...

Regards
Ingo
-- 
Ingo Strüwing, Database Group
Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Geschäftsführer: Thomas Schröder,   Wolfgang Engels,   Wolf Frenkel
Vorsitzender des Aufsichtsrates: Martin Häring   HRB München 161028



Thread
bzr commit into mysql-6.0-backup branch (Rafal.Somla:2903) Bug#43596Rafal Somla4 Dec
  • Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2903)Bug#43596Charles Bell7 Dec
    • Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2903)Bug#43596Charles Bell8 Dec
    • Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2903)Bug#43596Ingo Strüwing9 Dec
    • Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2903)Bug#43596Rafal Somla10 Dec
  • Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2903)Bug#43596Ingo Strüwing9 Dec
    • Re: bzr commit into mysql-6.0-backup branch (Rafal.Somla:2903)Bug#43596Rafal Somla10 Dec