From: Jim Hoadley
Date: October 22 2004 7:18am
Subject: Re: node instability
List-Archive: http://lists.mysql.com/cluster/945
Message-Id: <20041022071818.6787.qmail@web41907.mail.yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

Tomas --

> that leak was fixed (and it was huge), and it was in the mysqld, not
> in the ndbd nodes.

And, by way of full disclosure, I have a MySQL API running on both of
these nodes (COOLER, TORMAN).

> What is the size of the process on COOLER, which has been up for 13
> days? Has it grown?

Size of ndbd on COOLER is

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      4044  0.0  0.0   6332     4 ?        S    Oct08   0:00 ndbd
root      4045  0.0  2.9 420328 30700 ?        R    Oct08   5:41 ndbd

Size of mysqld on COOLER is

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
mysql     4107  0.0  1.6 125728 17328 ?        S    Oct08  12:02 /usr/local/mysql/libexec/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysql/var --user=mysql --pid-file=/usr/local/mysql/var/cooler.pid --skip-locking --ndbcluster --default-storage-engine=ndbcluster

Size of mysqld on TORMAN is

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
mysql     2472  0.2  1.9 125828 19724 ?        S    Oct08  43:50 /usr/local/mysql/libexec/mysqld --basedir=/usr/local/mysql --datadir=/usr/local/mysql/var --user=mysql --pid-file=/usr/local/mysql/var/torman.pid --skip-locking --ndbcluster --default-storage-engine=ndbcluster

> Please provide the cluster log, error log and tracefiles.

I will send these 4 files in a separate email since they are too
large for this list.

torman_ndb_1_error.log   - error log of node that was shut down
torman_ndb_1_out.log     - stdout of node that was shut down
cooler_ndb_3_cluster.log - ndb_mgmd cluster log
cooler_ndb_1_trace.log.3 - trace file

Also, here's 'cat /proc/meminfo' on COOLER just before node was shut
down:

Thu Oct 21 08:02:15 PDT 2004
        total:     used:     free:   shared:  buffers:   cached:
Mem:  1055277056 472301568 582975488        0 94691328 139239424
Swap: 1069277184         0 1069277184
MemTotal:       1030544 kB
MemFree:         569312 kB
MemShared:            0 kB
Buffers:          92472 kB
Cached:          135976 kB
SwapCached:           0 kB
Active:          245968 kB
ActiveAnon:      101312 kB
ActiveCache:     144656 kB
Inact_dirty:      87472 kB
Inact_laundry:     4244 kB
Inact_clean:        248 kB
Inact_target:     67584 kB
HighTotal:       130992 kB
HighFree:          2036 kB
LowTotal:        899552 kB
LowFree:         567276 kB
SwapTotal:      1044216 kB
SwapFree:       1044216 kB
HugePages_Total:      0
HugePages_Free:       0
Hugepagesize:      4096 kB

And just after node was shut down:

Thu Oct 21 08:03:15 PDT 2004
        total:     used:     free:   shared:  buffers:   cached:
Mem:  1055277056 399458304 655818752        0 94695424 140222464
Swap: 1069277184         0 1069277184
MemTotal:       1030544 kB
MemFree:         640448 kB
MemShared:            0 kB
Buffers:          92476 kB
Cached:          136936 kB
SwapCached:           0 kB
Active:          172852 kB
ActiveAnon:       28188 kB
ActiveCache:     144664 kB
Inact_dirty:      88428 kB
Inact_laundry:     4244 kB
Inact_clean:        248 kB
Inact_target:     53152 kB
HighTotal:       130992 kB
HighFree:         38776 kB
LowTotal:        899552 kB
LowFree:         601672 kB
SwapTotal:      1044216 kB
SwapFree:       1044216 kB
HugePages_Total:      0
HugePages_Free:       0
Hugepagesize:      4096 kB

Thanks again!

-- Jim Hoadley
   Sr Software Eng
   Dealer Fusion, Inc

--- Tomas Ulin wrote:

> Jim,
>
> that leak was fixed (and it was huge), and it was in the mysqld, not in
> the ndbd nodes.
>
> What is the size of the process on COOLER, which has been up for 13
> days? Has it grown?
>
> Please provide the cluster log, error log and tracefiles.
>
> T
>
> Jim Hoadley wrote:
>
> >Hello all --
> >
> >History: Some nodes in my 4-computer and 2-computer cluster would
> >only stay up for a day or two. I posted this problem on this list
> >as thread "nightly crashing" and received helpful suggestions.
> >
> >Ultimately it was thought there was a memory leak. Mikael and Tomas
> >said this was fixed in mysql-4.1.6-gamma-nightly-20041001.tar.gz.
> >I installed this new version, increased memory so both hosts have 1GB
> >of RAM. Now here's my feedback.
> >
> >One host (COOLER) has been up for 13 days, since 3:30pm, Friday, 10/08/2004.
> >
> >The other (TORMAN) died after 1 1/2 days, was restarted, and died again
> >after 11 days. Both times it was shut down by the arbitrator (running on
> >COOLER) because of missed heartbeats.
> >
> >I would be happy to post the trace file and all log messages if it would
> >be helpful, but my general questions are:
> >
> >Shouldn't I expect my nodes to run for more than a few days before crashing?
> >
> >Does anyone else have this problem?
> >
> >Was this "memory leak" actually fixed in mysql-4.1.6-gamma-nightly-20041001?
> >
> >Thanks in advance for any help you can provide.
> >
> >-- Jim Hoadley
> >   Sr Software Eng
> >   Dealer Fusion, Inc
> >
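
For anyone comparing the two /proc/meminfo snapshots in this message, the differences can be worked out with a short script. The following is only a rough sketch: the helper names `parse_meminfo` and `meminfo_delta` are made up for illustration, and only the fields that changed noticeably between 08:02:15 and 08:03:15 are copied in. It reports the change per field and does not attribute it to any particular process.

```python
import re

# Values copied from the two snapshots in this message (kB).
BEFORE = """\
MemFree: 569312 kB
Active: 245968 kB
ActiveAnon: 101312 kB
HighFree: 2036 kB
LowFree: 567276 kB
"""

AFTER = """\
MemFree: 640448 kB
Active: 172852 kB
ActiveAnon: 28188 kB
HighFree: 38776 kB
LowFree: 601672 kB
"""


def parse_meminfo(text):
    """Return {field: kB} for every 'Name: <n> kB' line (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\w+):\s+(\d+)\s+kB\s*$", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def meminfo_delta(before, after):
    """Return after-minus-before in kB for fields present in both snapshots."""
    b, a = parse_meminfo(before), parse_meminfo(after)
    return {name: a[name] - b[name] for name in b if name in a}


deltas = meminfo_delta(BEFORE, AFTER)
for name, change in deltas.items():
    print(f"{name:12s} {change:+d} kB")
```

On these numbers MemFree rises by 71136 kB while ActiveAnon falls by 73124 kB over the one-minute interval.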