List: Commits
From: Magnus Svensson
Date: October 16 2007 9:25am
Subject: Re: bk commit into 5.1 tree (gkodinov:1.2584) BUG#27099

On Tue, 2007-10-16 at 11:31 +0300, Georgi Kodinov wrote:
> Magnus,
> 
> On 16.10.2007, at 10:52, Magnus Svensson wrote:
> 
> > I don't really see how this patch is related to the problem at hand.
> > The original problem the bug reports is that you get an additional
> > warning in the result file.
> >
> > ***r/system_mysql_db.result	Sun Mar 11 22:51:31 2007
> > --- r/system_mysql_db.reject	Sun Mar 11 23:29:06 2007
> > ***************
> > *** 251,255 ****
> > --- 251,257 ----
> >     `server_id` int(11) DEFAULT NULL,
> >     `sql_text` mediumtext NOT NULL
> >   ) ENGINE=CSV DEFAULT CHARSET=utf8 COMMENT='Slow log'
> > + Warnings:
> > + Error	1194	Table 'slow_log' is marked as crashed and should be repaired
> >   show tables;
> >   Tables_in_test
> > -------------------------------------------------------
> >
> >
> > And I assume that is not fixed by the patch?
> >
> > But lately we have been seeing system_mysql_db fail with a timeout
> > in PushBuild, and I guess that is what the patch intends to fix? It's
> > strange, though, that we get these timeout failures when running with
> > a release-compiled mysqld, so no code from dbug.c should be involved.
> > We don't pass --debug to mysql-test-run.pl either, so why would we
> > need to extend the timeout?
> >
> > I think there must be some other problem.
> 
> This bug (the way I read it) actually consists of two problems :
>   1. the warning
>   2. The timeout
> 
> The first one seems to be no longer repeatable, as you have mentioned
> yourself (27 Jun):
> The reported problem hasn't occurred in a fresh tree for a long time.
> Only one failure with "timeout" reported - not same problem though.

Yes, I just did another check and it seems that "mysql-5.1-telco-6.1"
gets it constantly, but I think that tree is not up to date with the
mysqld source.

This is the search I used.
https://intranet.mysql.com/secure/pushbuild/xref.pl?startdate=&enddate=&dir=&plat=&testtype=&testname=system_mysql_db&testtext=&limit=50


> 
> 2007-06-21 22:14:35	mysql-5.1	4 (lthalmann@stripped)	vm-win2003-32-a	ps_row	system_mysql_db	[log]
> 
> system_mysql_db                [ fail ]  timeout"
> 
> 
> So I've concentrated on the second one.

Yes, and there is still only _one_ such error in PushBuild's xref.
https://intranet.mysql.com/secure/pushbuild/xref.pl?startdate=&enddate=&dir=&plat=&testtype=&testname=system_mysql_db&testtext=timeout&limit=50

As you say we should probably get rid of the bug.

> I've tried the original way to reproduce and haven't been able to
> recreate it.
> Then I found your comment (14 Mar):
> $ ./mysql-test-run.pl show_check skip_grants --check-testcases --debug
> 
> I've tried the above and discovered that the test was running very
> slowly because of the constant freopen() (I presume). So after a
> discussion I did two things (as mentioned in my commit comment):
> 1. increased the timeout of this particular debug run (as the normal
> one passed for me several times).

Why would we need to do this? We know it takes longer to run with
--debug turned on, but do we need to increase the timeout? If so, we
should apply the *10 on all platforms. Especially since you have now
made it faster on Windows... and can't repeat it. ;)

> 2. after a discussion with Reggie I've removed the freopen and  
> substituted it with a special fopen() flag for the debug log file  
> that ensures that fflush() flushes to disk and not to OS buffers.

Yes, that looks like a good idea. Could you measure the difference?
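
For reference, a minimal sketch of what "open once with a flush-to-disk
flag" could look like. The thread does not name the flag; the sketch
assumes it is the Microsoft CRT "c" (commit) mode character, which makes
fflush() write through to disk. The helper names open_debug_log() and
write_debug_line() are hypothetical and not taken from dbug.c, and the
fsync() call on non-Windows builds is only a rough stand-in so the
example behaves similarly elsewhere.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() and fsync() on POSIX */
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
/* Assumption: Microsoft CRT "c" mode sets the commit flag, so fflush()
   writes the buffer through to disk, not just to the OS cache. */
#define DEBUG_LOG_MODE "ac"
#else
#include <unistd.h>
#define DEBUG_LOG_MODE "a"
#endif

/* Hypothetical helper: open the trace file once and keep it open,
   instead of calling freopen() around every write. */
FILE *open_debug_log(const char *path)
{
    assert(path != NULL);
    return fopen(path, DEBUG_LOG_MODE);
}

/* Hypothetical helper: append one line and push it out.
   Returns 0 on success, -1 on failure. */
int write_debug_line(FILE *log, const char *line)
{
    assert(log != NULL && line != NULL);
    if (fputs(line, log) == EOF || fputc('\n', log) == EOF)
        return -1;
    if (fflush(log) != 0)
        return -1;
#ifndef _WIN32
    /* No commit flag here; fsync() is the closest equivalent. */
    if (fsync(fileno(log)) != 0)
        return -1;
#endif
    return 0;
}
```

Timing a --debug run of the same test before and after such a change
would give the difference asked about above.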

> 
> I really think we should not aggregate more than 1 problem in 1 bug  
> report (makes for confusion about the context among others). So as  
> long as we agree that the first problem is gone we should close this  
> bug. My patch is indeed more or less optional, but IMHO it's a step  
> in the right direction and this is the reason I've added it to this bug.

OK, let's hope the problem is gone. I had it in my bug list for a long
time and haven't managed to pinpoint the problem any more exactly than
my comment of 14 Mar says:
"Seems like this occurs when mysqld decides to write a query to the slow
log. It will open it for writing and set the "crashed" flag in the
metafile. The crashed flag will not be reset, either because log is
never closed or because use_count is not incremented when ha_tina is
used from log handler."

Although I'm quite sure it still _can_ happen. But as you say, until we
know exactly how it happens, we should close the bug.

> I would also really like to see good steps to reproduce for this  
> bug : just quoting xref is not a good enough way to reproduce a bug.  
> It would be great (for similar PB failure bugs) if the bug passes  
> either through the bug verification team or enough evidence is  
> collected (and preserved when the problem occurs).

Absolutely. It was repeatable for quite some time but not anymore.

Best regards
Magnus

-- 
Magnus Svensson
Senior Software Engineer
msvensson@stripped
+46709164491
