List: General Discussion
From: Gerald Clark  Date: March 23 2001 2:16pm
Subject: Re: Connection related SIG 11 crash in 3.23
Voytek Lapinski wrote:
> 
> Ok... I've posted briefly regarding this, but here's a complete bug report.
> Yeah, I know it's kinda long, but I thought it better to include more rather
> than less.
> 
> We are seeing a problem which results in mysqld dying with a SIGSEGV. The
> problem has been confirmed on both of these machines:
> 
> Linux lindt 2.2.18 #2 SMP Mon Feb 26 15:23:32 EST 2001 i686 unknown
> Dell PowerEdge 4300 - Dual Pentium 3-500 - 1GB RAM, 1.6GB swap
> 
> Linux lisa 2.4.1 #7 SMP Fri Feb 16 15:21:45 EST 2001 i686 unknown
> IBM Rackmount Server 325 - Dual Pentium Pro-200 - 512MB RAM, 1GB swap
> 
> The problem occurs with mysqld versions 3.23.32, 3.23.33 and 3.23.35. To
> confirm that it isn't related to libraries, compilers or configuration
> specific to our build environment, I have also tested the standard linux-i386
> binary distro of 3.23.35, and the problem exists there too.
> 
> The error message takes the following form:
> 
> mysqld got signal 11;
> The manual section 'Debugging a MySQL server' tells you how to use a
> stack trace and/or the core file to produce a readable backtrace that may
> help in finding out why mysqld died.
> Attempting backtrace. You can use the following information to find out
> where mysqld died.  If you see no messages after this, something went
> terribly wrong...
> Cannot determine thread, ebp=0xbffff108, backtrace may not be correct.
> Stack range sanity check OK, backtrace follows:
> 0x813fefa
> 0x813f6dd
> 0x8081690
> 0x8081b14
> 0x808127e
> 0x8145e4b
> 0x8048111
> New value of ebp failed sanity check, terminating backtrace!
> 
> Naturally, the addresses in the backtrace differ depending on version
> (they look similarly spaced though, so it's probably the same thing).
> 
> The problem appears to be related to connections which occur at roughly the
> same time. I have written a script which reliably causes it to occur and have
> included it below. It operates by forking off lots of processes, all of which
> repeatedly connect to and disconnect from the database until it goes down. In
> most cases the database will crash within seconds, although occasionally it
> takes significantly longer (several minutes). The problem I am referring to
> is associated with local UNIX domain socket connections. When I briefly
> tested connecting over TCP/IP, it (or something similar) still occurred:
> the server eventually stopped accepting connections on that port, although
> the database was still running and generated no errors. This may be something
> else entirely though, and I have not investigated it at all.
> 
> While this case seems somewhat artificial, we were seeing the same crashes
> under normal operation of our application. They seemed to be associated with
> the activation of a process that frequently forks off 6 new processes, all
> of which attempt to connect to the database immediately after starting.
> 
> The test script follows:
> 
> --START SCRIPT--
> 
> #!/usr/bin/perl -w
> 
> use DBI;
> 
> # Fork 100 children; each one connects and disconnects in a tight loop.
> for(my $i=0;$i<100;$i++) {
>
>     my $fork_ret=fork;
>
>     if (not defined($fork_ret)) {
>         die("$! fork failed\n");
>     } elsif ($fork_ret == 0) {
>         # Child: loop until a connect attempt fails.
>         while(1) {
>             my $dbh=DBI->connect('DBI:mysql:database=test','test','test',
>                 {PrintError=>1,RaiseError=>0});
>             exit unless (defined($dbh));
>             $dbh->disconnect();
>         }
>     }
> }
> 
> # Parent: wait for every child to exit, then report the elapsed time.
> my $start=time();
> my $wait_ret;
> do {
>     $wait_ret=wait;
> } until $wait_ret == -1;
> 
> print ((time()-$start)."s\n");
> 
> --END SCRIPT--
> 
> The [mysqld] section of our my.cnf is:
> 
> --START CONFIG--
> 
> [mysqld]
> 
> port        = 3333
> socket      = /tmp/mysql.sock
> skip-locking
> set-variable    = key_buffer=32M
> set-variable    = max_allowed_packet=1M
> set-variable    = thread_stack=128K
> set-variable    = tmp_table_size=8M
> set-variable    = max_connections=200
> 
> --END CONFIG--
> 
> But I have tried more conservative values for key_buffer (down to 4M),
> thread_stack (64K) and tmp_table_size (2M). The problem also seems to occur
> with values of max_connections anywhere from 20 to 10000 (yeah, I know 10000
> is silly with LinuxThreads, but I thought I'd do it to see what happens).
> 
> When invoking mysqld, many options have been used, but the most conservative
> settings we tried were:
> 
> /usr/local/mysql-3.23.35-bindist/bin/mysqld --datadir=/data --user=root
> 
> A few things make the problem take longer to appear, although of course this
> is rather subjective: it is harder to trigger when using the 3.23.35 binary
> distro, when running on the faster machine, and when using larger values for
> max_connections. We have been unable to make it go away entirely though.
> Eventually, it will always go down.
> 
> <whew>
> 
> That's it... I'm happy to provide any more info, as we would really love to
> see this fixed.
> 
> Oh yeah, another weird thing that we noticed while testing this that may or
> may not be related is that once in a while when running the test script, a
> user would appear to be logged in as their password instead of their login
> name. No investigation was done on this issue.
> 
> --
> Voytek Lapinski voytekl@stripped
> Software Engineer     Hotkey Internet
> ph:  03 9923 3656   mob: 0427 469 891
> fax: 03 9923 3388   www.hotkey.net.au
> 
I have been following the SIG 11 threads that have appeared lately.
It seems to me all reporters have been running SMP kernels.

What happens if you bring the machine up with a single CPU?
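
One way to try this without rebuilding the kernel is a boot parameter. The
following is only a sketch, assuming the machines boot with LILO and that the
image label is "linux" (both are assumptions; neither is stated in the
report). nosmp and maxcpus= are standard Linux kernel boot options.

```
# At the LILO boot prompt (the image label "linux" is an assumption):
boot: linux nosmp

# Alternative that limits an SMP kernel to one processor:
boot: linux maxcpus=1

# After booting, check how many processors the kernel sees:
grep -c '^processor' /proc/cpuinfo
```

If the test script no longer brings mysqld down with only one CPU active, that
would point toward an SMP-related race rather than the connection handling
alone.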