MySQL Lists are EOL. Please join:

List:Commits« Previous MessageNext Message »
From:Dmitry Lenev Date:December 2 2008 5:28pm
Subject:Re: bzr commit into mysql-6.0-runtime branch (magne.mahre:2752)
Bug#38661
View as plain text  
Hello Magne!

Here are my comments about your patch:

* Magne Mahre <Magne.Mahre@stripped> [08/11/25 12:58]:
> #At file:///home/mm136784/z/mysql-6.0-runtime/ based on
> revid:kostja@stripped
> 
>  2752 Magne Mahre	2008-11-24
>       Bug #38661 all threads hang in "opening tables" or 
>          "waiting for table" and cpu is at 100%
>       
>       A race between open_tables and a "flush table" operation
>       resulted in neither being able to complete.  open_tables
>       was not able to open the table and initiated a recover.
>       The conditions for completing the recovery were too
>       strict and couldn't be achieved while the flush was
>       running.  The solution was to loosen the requirement
>       that said that a share couldn't exist without a table,
>       since this is actually a valid condition in certain
>       cases. 

Ugh... I think that above description is a bit too vague.

Ideally ChangeSet comment should contain:

 - Description of the problem in the terms which can be understood by user
   (often bug title/description does not clearly explain/formulate this,
   and providing this information will really help our Doc/QA teams).
 - Explanation of what was the technical reason for this problem and how
   the bug fix resolves it.

So the above should look more like:

"
Bug #38661 'all threads hang in "opening tables" or "waiting for table"
            and cpu is at 100%'

Concurrent execution of FLUSH TABLES statement and at least two statements
using the same table might have led to live-lock which caused all three
connections to stall and hog 100% of CPU.

tdc_wait_for_old_versions() wrongly assumed that there cannot be a share
with an old version and no used TABLE instances and thus was failing to
perform wait in situation when such old share was cached in MDL subsystem
thanks to a still active metadata lock on the table. So it might have
happened that two or more connections simultaneously executing statements
which involve table being flushed managed to prevent each other from
waiting in this function by keeping shared metadata lock on the table 
constantly active (i.e. one of the statements managed to take/hold this
lock while other statements were calling tdc_wait_for_old_versions()).
Thus they were forcing each other to loop infinitely in open_tables() -
close_thread_tables_for_reopen() - tdc_wait_for_old_versions() cycle
causing CPU hogging.

This patch fixes this problem by removing this false assumption from
tdc_wait_for_old_versions().

Note that the problem is specific only for server versions >= 6.0.
"

Also in this particular case you have to explain in the comment why
this ChangeSet does not contain a test for this bug...

Is it really hard to repeat this problem in our current test suite?

> modified:
>   sql/sql_base.cc
> 
> === modified file 'sql/sql_base.cc'
> 
> === modified file 'sql/sql_base.cc'
> --- a/sql/sql_base.cc	2008-11-18 19:41:51 +0000
> +++ b/sql/sql_base.cc	2008-11-24 11:27:20 +0000
> @@ -7678,8 +7678,7 @@
>        mdl_get_tdc_key(lock_data, &key);
>        if ((share= (TABLE_SHARE*) hash_search(&table_def_cache, (uchar*)
> key.str,
>                                               key.length)) &&
> -          share->version != refresh_version &&
> -          !share->used_tables.is_empty())
> +          share->version != refresh_version)
>          break;
>      }
>      if (!lock_data)
> 

I think it is OK to push this patch after fixing or discussing with me
above issues.

And BTW thanks for looking into this problem!

-- 
Dmitry Lenev, Software Developer
MySQL AB, www.mysql.com

Are you MySQL certified?  http://www.mysql.com/certification
Thread
bzr commit into mysql-6.0-runtime branch (magne.mahre:2752) Bug#38661Magne Mahre24 Nov
  • Re: bzr commit into mysql-6.0-runtime branch (magne.mahre:2752)Bug#38661Dmitry Lenev2 Dec