Hi Chuck,
thanks for the review. Please find comments inline.
One remark in advance. My personal preference is not to receive my full
giant commit email back, but just the parts that you refer to. I had to
spend a remarkable time in scrolling down to find comments from you. And
I'm not sure that I didn't miss one. If you would at least tag your name
before every comment to support searching...
Chuck Bell, 09.06.2008 21:52:
> Ingo,
>
> A couple of things about this patch:
>
> It doesn't apply correctly to the current branch (bzr) mysql-6.0-backup.
> There are collisions/rejections in be_snapshot.cc, handler.cc and
> debug_sync.cc.
Yes, I continued to work on my existing bk tree. I have meanwhile ported
it to a new bzr tree.
I understood my task so that I show, how the tests could be converted,
rather than doing the final patch. If you want me to do the final patch,
we will need to agree on the sync point and signal names.
>
> I get timeout errors on the backup_commit_blocker test on Windows. This may
> be due to (what I fear) misplaced/misnamed breakpoints in commit blocker. I
> have attached the log (if it helps). It never seems to get past the first
> part of the test.
Unfortunately the log doesn't tell where the COMMIT hangs. A debug trace
might give a better insight:
Please start the test with the --debug option.
When it hangs, please run:
egrep 'query:|debug_sync_exec:' mysql-test/var/log/master.trace
If it doesn't help either, please start the debugger on the hanging
server and find out where the COMMIT thread hangs.
>
> Question:
> Is there any way we can add doxygen comments so that we can generate a list
> of all breakpoints used in the code? This would be a real nice to have and a
> great timesaver -- not to mention good for documentation. ;)
I can try that, but it will take some time.
...
> Please specify WL#4259 in the patch comments. It would be easier to track it
> there were.
I would prefer a number that is related to backup testing. The
conversion is not part of the implementation of DEBUG_SYNC.
>
>>
>> Changed synchronization point calls.
>> Adapted backup test cases and re-enabled them.
>> Removed all unused BACKUP_BREAKPOINTs. None is left.
>> Removed sql/backup/debug.h as it is now obsolete.
>> Removed inclusion of debug.h from many files.
>> Cleaned mysql-test/t/disabled.def.
>>
>> Fixed some random failures of wait_condition.inc with
>> information_schema.processlist.
>
> I'd like to know how you did this. Was it just the new debug synch code or
> have you fixed it some other way. I am suspicious only because I spent a
> long time trying to fix it myself (and failed).
Please note that I wrote "some". Here is what I did in this respect:
http://lists.mysql.com/commits/46949
...
>> -con5: Getting lock on commit blocker.
>> -SELECT get_lock("commit_blocker_step_1", 0);
...
>> +con3: Activate synchronization points for COMMIT.
>> +SET DEBUG_SYNC= 'within_ha_commit_trans SIGNAL commit_read_locked
>> + WAIT_FOR commit_go_done';
>
> I prefered the original names of these breakpoints over these descriptive
> names. While I don't object to changing them, I think this one in particular
> isn't intuitive. Perhaps the best way to rename them is to give them the
> names related to the steps in the commit blocker which are (from
> backup_data.cc):
From my point of view, I disagree. As I mentioned in a former email to
Rafal, I do _not_ want to name the sync points after their role in a
certain test. I want to name them so that their *position in the code*
becomes clear. The role, that a DEBUG_SYNC point can have, is flexible
as you can set, which signal it sends and/or waits for. These signals
are set in the test case and should reflect the role of the sync points
*for this test*.
A sync point is a piece of code within other code:
code...
DEBUG_SYNC(thd, "sync_point_name");
code...
Imagine we would call the sync point that sits in the middle of
ha_commit_trans() "commit_blocker_step_1". If the stored procedures team
later wants to add a test that needs to block a COMMIT at that place to
see if the right row-level locks are taken, they would have to activate
the sync point "commit_blocker_step_1", which *I* don't find intuitive.
Their test doesn't have anything to do with the commit blocker. And
there may be more tests that require to synchronize threads at this place.
So would you suggest a workaround like this:
code...
DEBUG_SYNC(thd, "commit_blocker_step_1");
DEBUG_SYNC(thd, "check_row_locks_step_3");
DEBUG_SYNC(thd, "read_committed_step_2");
DEBUG_SYNC(thd, "global_read_lock_blocks_commit_step_1");
DEBUG_SYNC(thd, "lock_escalation_during_commit_step_3");
code...
Just because that many tests need to have a sync point at that place and
all have the same right to use intuitive names?
In my sync point philosophy, the sync point name reflects its position
in the code and all the different tests that use it use intuitive signal
names:
code...
DEBUG_SYNC(thd, "within_ha_commit_trans");
code...
Backup test:
SET DEBUG_SYNC= 'within_ha_commit_trans SIGNAL commit_blocker_step_1
WAIT_FOR commit_do_finish';
SP test:
SET DEBUG_SYNC= 'within_ha_commit_trans SIGNAL check_row_locks_step_3
WAIT_FOR commit_now';
Read committed test:
SET DEBUG_SYNC= 'within_ha_commit_trans SIGNAL read_committed_step_2
WAIT_FOR now_execute_commit';
... You get the idea. I must admit that I did not use signal names in
the backup tests that are like the former breakpoint names. This was an
attempt to name them after my understanding of what the test tries to
do. I'd happily change the signal names to something that is clearer to
you as you are more likely to maintain these tests.
I am also open for suggestion, how to rename the sync points so that
their position in the code becomes more clear. But I politely refuse to
rename the sync points after their function in a specific test. If you
still want to have that, please take over.
>
> ALGORITHM
> The algortihm is implemented using five steps.
>
> 1) Preventing new write locks on tables -- lock_global_read_lock()
...
> 5) unlock_global_read_lock()
If I would have seen this description of the five steps before, I
probably would have chosen the signal names as mentioned above. I wish
this explanation could become a comment in the test case.
...
>> -SELECT get_lock("backup_cs_reading", 100);
>> -get_lock("backup_cs_reading", 100)
>> -1
>> +#
>> +# Test 2: Check for consistent read after open and lock tables
>> +#
>> +con1: Activate sync points for the backup statement.
>> +SET DEBUG_SYNC= 'when_backup_cs_reading SIGNAL reading
>
> I like this naming scheme -- it matches the originals much better and is
> easy to understand. ;)
At least for you. ;)
Honestly, I couldn't think of a name that describes the position of the
sync point in the code better. :(
And after all it is pretty unlikely that other teams may have
synchronization needs at that place in code.
...
>> @@ -1501,6 +1525,8 @@ bool wait_if_global_read_lock(THD *thd,
>> (void) pthread_mutex_lock(&LOCK_global_read_lock);
>> if ((need_exit_cond= must_wait))
>> {
>> + const char *new_message= "Waiting for release of readlock";
>> +
>
> Can you explain why moving the comment to here was needed? I don't see the
> significance or why we need to do that. Or this just an addition to fix an
> omission?
...
>> + if (must_wait)
>> + {
>> + thd_proc_info(thd, new_message);
>> + DEBUG_SYNC(thd, "wait_if_global_read_lock");
>> + }
>> +#endif /* defined(ENABLED_DEBUG_SYNC) */
>> +
>> old_message=thd->enter_cond(&COND_global_read_lock,
>> &LOCK_global_read_lock,
>> - "Waiting for release of readlock");
>> + new_message);
I am not sure why you say "comment". I guess you mean the string literal
that is to be used as 'proc_info'?
I hope my quoting makes clear that I used 'new_message' at two places
and wanted to avoid using two equal string literals. At first because of
saved string space, at second because of possible inconsistencies when
one string literal is changed and the other one forgotten.
...
Regards
Ingo
--
Ingo Strüwing, Database Group
Sun Microsystems GmbH, Sonnenallee 1, D-85551 Kirchheim-Heimstetten
Geschäftsführer: Thomas Schröder, Wolfgang Engels, Dr. Roland Bömer
Vorsitzender des Aufsichtsrates: Martin Häring HRB München 161028