>>>>> "Joe" == Joe Kislo <kislo@stripped> writes:
Joe> There definitely seems to be a race condition in MySQL relating to
Joe> dropping a table which you have a lock on, and somebody trying to acquire
Joe> said lock at the same time.
Joe> Over the past couple months our application has come to a grinding stop
Joe> ~6 times because of this problem. It appears that if the thread which
Joe> holds the write lock on the table [in this case it's got two tables
Joe> locked], drops one of the tables while another thread is trying to
Joe> acquire the lock [in this case two locks], the first thread will hang
Joe> "waiting on cond", and the second one will actually somehow have
Joe> acquired the lock. Everything at that point freezes, and additional
Joe> threads start queueing up waiting for the tables. Here's an excerpt of
Joe> the processlist when this happens:
Joe> | 4151 | TT_qa1 | localhost | TT_qa1 | Query | 49159 | Waiting for table | LOCK TABLES Event_CLASS WRITE, Event WRITE; |
Joe> | 4152 | TT_qa1 | localhost | TT_qa1 | Query | 49203 | Waiting on cond | DROP table Event; |
Joe> | 4153 | TT_qa1 | localhost | TT_qa1 | Query | 49203 |
Joe> I can't quite tell, but I think I may have cut off part of the line on
Joe> the last thread.. I made that cut+paste earlier today. Like I've said,
Joe> this happens -fairly- often to us, and it is completely debilitating.
Joe> I've tried to randomize the event processing so it doesn't happen
Joe> exactly at the same time... but it still nails us sometimes. I know,
Joe> you're going to say, don't drop tables you have locks on.. I reported a
Joe> bug about 6 months ago about the mysqlserver segfaulting when you
Joe> dropped a table you had a lock on... and that was somebody's response
Joe> instead of fixing it. Eventually after some people exclaimed that under
Joe> NO case should the server segfault; somebody posted a patch. I realize
Joe> this isn't as BAD as segfaulting the server; but it -really- should get
Joe> fixed in the server.
As soon as the above problem (segfault) came to my attention I provided
a fix for it; I think there may have been a small delay as I may have
been on a conference trip when you first reported it, but this is
the only reason it wasn't fixed at once. We really do try to fix
bugs as soon as they come to our attention!
I thought that I did fix the server :( After the fix, I couldn't get
any test program to produce anything wrong...
Joe> Here's a snapshot from the mysql log at the time:
Joe> 000301 21:30:08 4152 Query show tables like 'Event_CLASS'
Joe> 4152 Query show tables like 'Event'
Joe> 4152 Query LOCK TABLES Event_CLASS WRITE, Event
Joe> 4153 Query show tables like 'Event_CLASS'
Joe> 4152 Query SELECT ClassKey, cl_previousEventID
Joe> from Even
Joe> 4153 Query show tables like 'Event'
Joe> 4152 Query SELECT Event.eventAction,
Joe> d, Event.courseID, Event.sectionID, Event.eventCode,
Joe> Event.rawTimeUpdated, Event
Joe> .eventTime, Event.eventID, Event.roundID from Event where (
Joe> 964208.868' );
Joe> 4153 Query LOCK TABLES Event_CLASS WRITE, Event
Joe> 4152 Query DELETE from Event where eventID='35';
Joe> 4152 Query SELECT count(eventID) from Event;
Joe> 4152 Query DROP table Event;
Joe> As you can see, 4152 acquires the lock.. Does some stuff, then SOMEHOW
Joe> 4153 ALSO acquires the SAME lock... Then when 4152 tries to drop the
Joe> table... Poopie --> fan.
I can't see from the above that 4153 would get any LOCK on the table.
Any chance that you can try and test 3.23.12 with this when it comes
out (hopefully within 48 hours)? We did a small redesign of the DROP /
LOCK code in 3.23 and it should be much better than the old one.
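
For retesting, the sequence in the log above can be sketched as two client connections running the statements in the order they were logged. This is only a sketch reconstructed from the log excerpt: the "Session A" / "Session B" labels are illustrative, the tables are assumed to exist already, and the SELECT statements that are cut off in the log are left out rather than guessed.

```sql
-- Sketch only: assumes tables Event_CLASS and Event already exist.
-- "Session A" and "Session B" are labels for two separate client
-- connections (threads 4152 and 4153 in the log above).

-- Session A (4152): take both write locks.
LOCK TABLES Event_CLASS WRITE, Event WRITE;

-- Session B (4153): request the same locks while session A holds them.
-- This statement is expected to wait until session A releases its locks.
LOCK TABLES Event_CLASS WRITE, Event WRITE;

-- Session A (4152): keep working, then drop one of the locked tables
-- while session B is still waiting.
DELETE FROM Event WHERE eventID='35';
SELECT count(eventID) FROM Event;
DROP TABLE Event;

-- A third connection: check whether either thread is stuck.
SHOW PROCESSLIST;
```

If the problem is still present, the processlist should show the two threads in the states quoted earlier ("Waiting for table" and "Waiting on cond").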