List: Cluster
From: Jim Hoadley
Date: October 22 2004 7:18am
Subject: Re: node instability
Tomas --

> that leak was fixed (and it was huge), and it was in the mysqld, not 
> in the ndbd nodes.

And, by way of full disclosure, I have a MySQL API running on both of 
these nodes (COOLER, TORMAN).

> What is the size of the process on COOLER, which has been up for 13 
> days?   Has it grown?

Size of ndbd on COOLER is

USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
root      4044  0.0  0.0  6332    4 ?        S    Oct08   0:00 ndbd
root      4045  0.0  2.9 420328 30700 ?      R    Oct08   5:41 ndbd

Size of mysqld on COOLER is

USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
mysql     4107  0.0  1.6 125728 17328 ?      S    Oct08  12:02
/usr/local/mysql/libexec/mysqld --basedir=/usr/local/mysql
--datadir=/usr/local/mysql/var --user=mysql
--pid-file=/usr/local/mysql/var/cooler.pid --skip-locking --ndbcluster
--default-storage-engine=ndbcluster

Size of mysqld on TORMAN is

USER       PID %CPU %MEM   VSZ  RSS TTY      STAT START   TIME COMMAND
mysql     2472  0.2  1.9 125828 19724 ?      S    Oct08  43:50
/usr/local/mysql/libexec/mysqld --basedir=/usr/local/mysql
--datadir=/usr/local/mysql/var --user=mysql
--pid-file=/usr/local/mysql/var/torman.pid --skip-locking --ndbcluster
--default-storage-engine=ndbcluster
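
A single ps snapshot cannot show whether a process has grown, so here is a minimal sketch of how the sizes could be sampled over time. It assumes a Linux /proc filesystem whose /proc/<pid>/status file reports VmSize and VmRSS in kB; the helper names parse_vm_kb and sample are hypothetical and not part of MySQL Cluster.

```python
import re

def parse_vm_kb(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        m = re.match(r"(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields

def sample(pid):
    """Read the current virtual and resident sizes of one process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_kb(f.read())

# Example with values matching the ndbd line above (VSZ 420328, RSS 30700).
example = "Name:\tndbd\nVmSize:\t  420328 kB\nVmRSS:\t   30700 kB\n"
print(parse_vm_kb(example))
```

Calling sample() for the ndbd and mysqld PIDs from cron and appending the result to a file with a timestamp would give a growth curve to compare against the time of the next shutdown.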

> Please provide the cluster log, error log and tracefiles.

I will send these 4 files in a separate email since they are too large for 
this list.

torman_ndb_1_error.log - error log of node that was shut down
torman_ndb_1_out.log - stdout of node that was shut down
cooler_ndb_3_cluster.log - ndb_mgmd cluster log
cooler_ndb_1_trace.log.3 - trace file

Also, here's 'cat /proc/meminfo' on COOLER just before node was shut 
down:

Thu Oct 21 08:02:15 PDT 2004
        total:    used:    free:  shared: buffers:  cached:
Mem:  1055277056 472301568 582975488        0 94691328 139239424
Swap: 1069277184        0 1069277184
MemTotal:      1030544 kB
MemFree:        569312 kB
MemShared:           0 kB
Buffers:         92472 kB
Cached:         135976 kB
SwapCached:          0 kB
Active:         245968 kB
ActiveAnon:     101312 kB
ActiveCache:    144656 kB
Inact_dirty:     87472 kB
Inact_laundry:    4244 kB
Inact_clean:       248 kB
Inact_target:    67584 kB
HighTotal:      130992 kB
HighFree:         2036 kB
LowTotal:       899552 kB
LowFree:        567276 kB
SwapTotal:     1044216 kB
SwapFree:      1044216 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     4096 kB

And just after node was shut down:

Thu Oct 21 08:03:15 PDT 2004
        total:    used:    free:  shared: buffers:  cached:
Mem:  1055277056 399458304 655818752        0 94695424 140222464
Swap: 1069277184        0 1069277184
MemTotal:      1030544 kB
MemFree:        640448 kB
MemShared:           0 kB
Buffers:         92476 kB
Cached:         136936 kB
SwapCached:          0 kB
Active:         172852 kB
ActiveAnon:      28188 kB
ActiveCache:    144664 kB
Inact_dirty:     88428 kB
Inact_laundry:    4244 kB
Inact_clean:       248 kB
Inact_target:    53152 kB
HighTotal:      130992 kB
HighFree:        38776 kB
LowTotal:       899552 kB
LowFree:        601672 kB
SwapTotal:     1044216 kB
SwapFree:      1044216 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     4096 kB
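
To make the two snapshots easier to compare, here is a minimal sketch that diffs the "Name: value kB" lines of two meminfo dumps. The function names parse_meminfo and diff_meminfo are hypothetical, and only a few of the fields above are repeated in the example.

```python
import re

def parse_meminfo(text):
    """Map field name -> kB for lines of the form 'Name:   12345 kB'."""
    values = {}
    for line in text.splitlines():
        m = re.match(r"(\w+):\s+(\d+) kB$", line.strip())
        if m:
            values[m.group(1)] = int(m.group(2))
    return values

def diff_meminfo(before, after):
    """Return after - before (in kB) for every field whose value changed."""
    return {k: after[k] - before[k]
            for k in before if k in after and after[k] != before[k]}

before = parse_meminfo("""MemFree:        569312 kB
ActiveAnon:     101312 kB
HighFree:         2036 kB
LowFree:        567276 kB
SwapFree:      1044216 kB""")
after = parse_meminfo("""MemFree:        640448 kB
ActiveAnon:      28188 kB
HighFree:        38776 kB
LowFree:        601672 kB
SwapFree:      1044216 kB""")

for name, delta in diff_meminfo(before, after).items():
    print(f"{name}: {delta:+d} kB")
# MemFree: +71136 kB
# ActiveAnon: -73124 kB
# HighFree: +36740 kB
# LowFree: +34396 kB
```

For these two snapshots, MemFree rises by 71136 kB while ActiveAnon falls by 73124 kB, and swap is unchanged.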

Thanks again!

-- Jim Hoadley
   Sr Software Eng
   Dealer Fusion, Inc






--- Tomas Ulin <tomas@stripped> wrote:

> Jim,
> 
> that leak was fixed (and it was huge), and it was in the mysqld, not in
> the ndbd nodes.
> 
> What is the size of the process on COOLER, which has been up for 13 
> days?   Has it grown?
> 
> Please provide the cluster log, error log and tracefiles.
> 
> T
> 
> Jim Hoadley wrote:
> 
> >Hello all --
> >
> >History: Some nodes in my 4-computer and 2-computer cluster would
> >only stay up for a day or two. I posted this problem on this list
> >as thread "nightly crashing" and received helpful suggestions.
> >
> >Ultimately it was thought there was a memory leak. Mikael and Tomas
> >said this was fixed in mysql-4.1.6-gamma-nightly-20041001.tar.gz.
> >I installed this new version, increased memory so both hosts have 1GB
> >of RAM. Now here's my feedback.
> >
> >One host (COOLER) has been up for 13 days, since 3:30pm, Friday, 10/08/2004.
> >
> >The other (TORMAN) died after 1 1/2 days, was restarted, and died again
> >after 11 days. Both times it was shut down by the arbitrator (running on
> >COOLER) because of missed heartbeats.
> >
> >I would be happy to post the trace file and all log messages if it would
> >be helpful, but my general questions are:
> >
> >Shouldn't I expect my nodes to run for more than a few days before crashing?
> >
> >Does anyone else have this problem?
> >
> >Was this "memory leak" actually fixed in mysql-4.1.6-gamma-nightly-20041001?
> >
> >Thanks in advance for any help you can provide.
> >
> >-- Jim Hoadley
> >   Sr Software Eng
> >   Dealer Fusion, Inc
> >


		