List: Cluster
From: Johan Andersson
Date: October 1 2010 6:37am
Subject: Re: MySQL Cluster -- Table Full
GCP stops can happen because of:
1) swapping
2) OS backups / copying of large files
3) slow disks / too few disks and too much I/O (write cache enabled?)
4) bugs
4) bugs

What was the I/O load (from "iostat -mx 1") when the crash happened?
What does free -m show?
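To capture those at the moment of a crash, something like this could run on each data node while reproducing the load (a sketch assuming a Linux host; iostat comes from the sysstat package):

```shell
# Snapshot swap usage (cause 1 above) from /proc/meminfo.
swap_total=$(awk '/^SwapTotal:/ {print $2}' /proc/meminfo)
swap_free=$(awk '/^SwapFree:/ {print $2}' /proc/meminfo)
echo "swap used (kB): $((swap_total - swap_free))"

# In parallel, while the load runs:
#   iostat -mx 1   # watch %util and await on the data-node disks
#   free -m        # free/cached memory and swap
```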

BR
johan andersson

Aaron Weller // Crucial Paradigm wrote:
> Hey Johan and Jonas,
>
> Thanks for your help so far!  I've been working with Karl on these 
> issues.
>
> We ended up taking your suggestions and did the following:
>
> * Ran NDB initial on both nodes
> * Re-created tablespace using a single 500GB file
> * Re-created undo log using a single 50GB file
> * Set max rows to 650 million
>
> When we started restoring databases with anything larger than a few
> thousand records, or when we used mysqlslap to hit the database with a
> decent number of queries, after a few seconds it would crash and we
> would get an error like this in the logs:
>
> ---------------------------------
> 2010-09-30 11:39:22 [ndbd] INFO     -- Node 4 killed this node because GCP stop was detected
> 2010-09-30 11:39:22 [ndbd] INFO     -- NDBCNTR (Line: 274) 0x00000006
> 2010-09-30 11:39:22 [ndbd] INFO     -- Error handler shutting down system
> 2010-09-30 11:39:22 [ndbd] INFO     -- Error handler shutdown completed - exiting
> Time: Thursday 30 September 2010 - 11:39:22
> Status: Temporary error, restart node
> Message: System error, node killed during node restart by other node (Internal error, programming error or missing error message, please report a bug)
> Error: 2303
> Error data: Node 4 killed this node because GCP stop was detected
> Error object: NDBCNTR (Line: 274) 0x00000006
> Program: /usr/bin/ndbd
> Pid: 31222
> Version: mysql-5.1.47 ndb-7.1.5
> Trace: /var/lib/mysql-cluster/ndb_4_trace.log.7
> ***EOM***
>
> 2010-09-30 11:39:27 [ndbd] ALERT    -- Node 4: Forced node shutdown completed. Caused by error 2303: 'System error, node killed during node restart by other node(Internal error, programming error or missing error message, please report a bug). Temporary error, restart node'.
> ---------------------------------
>
> Doing some research I found this regarding the GCP stop errors:
>
> http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-ndbd-definition.html#mysql-cluster-ndbd-definition-gcp-stop-errors
>
> And I also found this:
>
> http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-ndbd-definition.html#ndbparam-ndbd-diskiothreadpool
>
> Unfortunately, due to an extremely tight deadline and having to put the
> server into production 6 hours ago, we had to revert to the more stable
> config of not using a tablespace file. This, however, means we are
> limited in the amount of data we can put in our tables.
>
> I have a few questions, if anyone can help:
>
> * Is there any command other than "all report mem" for checking the
> actual amount of RAM being used by a table? We have noticed that we
> receive "table full" when we have plenty of RAM left but no tablespace;
> however, we don't know how to check when we are reaching that point.
>
> * Any ideas on how to fix the issue with NDB crashing when we add a
> larger tablespace file, without causing any downtime while doing it?
> The docs mention DiskIOThreadPool, but I'm not sure what our ideal
> values would be for our dataset.
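For the first question, one option might be to periodically scrape the output of ndb_mgm ("all report memoryusage") and alert before the limits are hit. A sketch; the sample lines and the 80% threshold here are illustrative, not values from this thread:

```shell
# Sketch: flag memory-usage lines at or above a chosen threshold.
# Real input would come from: ndb_mgm -e 'all report memoryusage'
sample='Node 3: Data usage is 80%(3355443 32K pages of total 4194304)
Node 3: Index usage is 40%(524288 8K pages of total 1310720)'

echo "$sample" | awk '/usage is/ {
    split($0, a, "is ");   # a[2] = "80%(...)"
    split(a[2], b, "%");   # b[1] = "80"
    if (b[1] + 0 >= 80) print "WARNING:", $0
}'
```

Note this only covers DataMemory and IndexMemory. For the disk tablespace itself (the "Out extents" error below), the FREE_EXTENTS and TOTAL_EXTENTS columns of INFORMATION_SCHEMA.FILES should show how close each datafile is to full.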
>
> Thanks!
> Aaron
>
> Karl Kloppenborg wrote:
>> Hi guys,
>>
>> In light of the continual problems we face, Aaron will be taking over 
>> this so I can continue testing...
>>
>> Thanks for helping guys!
>>
>> Cheers,
>> Karl Kloppenborg.
>>
>>
>> On 30/09/2010, at 17:31, Johan Andersson wrote:
>>
>>  
>>> Hi,
>>>
>>> You are running out of extents --> add a data file.
>>> Then, if you are going to load in a lot of records (>100M), you need
>>> to set max_rows too.
>>> Under the hood, max_rows makes the data nodes create more fragments.
>>> With many rows, more fragments than the default are needed to store
>>> the primary key hash index.
>>>
>>> BR
>>> johan
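Johan's two steps (add a datafile, then set max_rows) might look roughly like this in SQL; the tablespace name, file name, and sizes below are placeholders, not values from this thread:

```sql
-- Hypothetical names and sizes; adjust for your cluster.
ALTER TABLESPACE ts_1
    ADD DATAFILE 'data_2.dat'
    INITIAL_SIZE = 64G
    ENGINE = NDBCLUSTER;

-- Raise the fragment count for a very large table.
ALTER TABLE my_ndb_awesome_large_table MAX_ROWS = 650000000;
```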
>>>
>>>
>>> Karl Kloppenborg wrote:
>>>    
>>>> Hi Johan,
>>>>
>>>> Can you please explain that?
>>>>
>>>> I am a bit confused....
>>>>
>>>> Cheers,
>>>> Karl.
>>>> On 30/09/2010, at 16:51, Johan Andersson wrote:
>>>>
>>>>  
>>>>      
>>>>> Karl,
>>>>> Just to summarize:
>>>>> - You must do ALTER TS ADD DATAFILE _and_ set max_rows.
>>>>>
>>>>> BR
>>>>> johan
>>>>>
>>>>> Jonas Oreland wrote:
>>>>>           
>>>>>> On 09/30/10 07:27, Karl Kloppenborg wrote:
>>>>>>                
>>>>>>> Hey Jonas,
>>>>>>>
>>>>>>> Thanks for the reply, we will try to implement max_rows after
>>>>>>> the rebuild takes place. However, I have a few questions:
>>>>>>>
>>>>>>> 1) After doing some writes and getting the table full error, I
>>>>>>> executed SHOW WARNINGS:
>>>>>>>
>>>>>>>
> +-------+------+--------------------------------------------------------+ 
>>>>>>>
>>>>>>> | Level | Code | 
>>>>>>> Message                                                |
>>>>>>>
> +-------+------+--------------------------------------------------------+ 
>>>>>>>
>>>>>>> | Error | 1296 | Got error 1601 'Out extents, tablespace
> full' 
>>>>>>> from NDB |
>>>>>>> | Error | 1114 | The table 'my_ndb_awesome_large_table' is 
>>>>>>> full                   |
>>>>>>>
> +-------+------+--------------------------------------------------------+ 
>>>>>>>
>>>>>>> 2 rows in set (0.00 sec)
>>>>>>>                       
>>>>>> 1601 means data-on-disk, if I'm not mistaken.
>>>>>>
>>>>>> ndbd's tablespaces don't auto-grow.
>>>>>> "alter tablespace X add datafile Y initial_size=10G" or something
>>>>>> like that should do it.
>>>>>>
>>>>>> /Jonas
>>>>>>
>>>>>>                
>>>>>>> Could you explain this?
>>>>>>>
>>>>>>> 2) I will try implementing max_rows..
>>>>>>>
>>>>>>> 3) We are not using ndbmtd, but each server has 16 cores and
>>>>>>> 144GB of RAM. Should we?
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Karl Kloppenborg.
>>>>>>>
>>>>>>> On 30/09/2010, at 15:19, Jonas Oreland wrote:
>>>>>>>
>>>>>>>                      
>>>>>>>> On 09/30/10 07:13, Karl Kloppenborg wrote:
>>>>>>>>                            
>>>>>>>>> Hi guys,
>>>>>>>>>
>>>>>>>>> We initially didn't, but we hit this problem at 92 million rows.
>>>>>>>>>
>>>>>>>>> After a lot of research we found a post stating that increasing
>>>>>>>>> max_rows might help; however, after reading more about max_rows
>>>>>>>>> with NDB, we found claims that it is not used by the ndbcluster
>>>>>>>>> engine and is ignored. Is this true?
>>>>>>>>>                                   
>>>>>>>> 1) After you get "table full", issue "show warnings"; this will
>>>>>>>> show you the exact error code.
>>>>>>>>
>>>>>>>> 2) maxrows *should* help
>>>>>>>>
>>>>>>>> 3) are you using ndbmtd ?
>>>>>>>>
>>>>>>>> /Jonas
>>>>>>>>
>>>>>>>>                            
>>>>>>>>> However, take note that we require it to hold 600 million
>>>>>>>>> rows... a challenge.
>>>>>>>>>
>>>>>>>>> I will also add my CREATE TABLE syntax to show you what
>>>>>>>>> we're doing.
>>>>>>>>>
>>>>>>>>> CREATE TABLE `my_ndb_awesome_large_table` (
>>>>>>>>>   `user_id` int(4) NOT NULL,
>>>>>>>>>   `description` varchar(50) NOT NULL,
>>>>>>>>>   `type` varchar(64) NOT NULL,
>>>>>>>>>   `count` int(4) NOT NULL,
>>>>>>>>>   `after` int(3) NOT NULL,
>>>>>>>>>   `active` int(1) NOT NULL,
>>>>>>>>>   `lastactivity` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
>>>>>>>>>     ON UPDATE CURRENT_TIMESTAMP
>>>>>>>>> ) ENGINE=ndbcluster DEFAULT CHARSET=utf8
>>>>>>>>>
>>>>>>>>> Any thoughts on what this virtual "level" might be? Because, as
>>>>>>>>> you can see in my last email, we have not run out of index or
>>>>>>>>> data space.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 30/09/2010, at 14:59, Jonas Oreland wrote:
>>>>>>>>>
>>>>>>>>>                                  
>>>>>>>>>> Are you using "maxrows" in your table definition?
>>>>>>>>>>
>>>>>>>>>> /Jonas
>>>>>>>>>>
>>>>>>>>>> On 09/30/10 06:15, Karl Kloppenborg wrote:
>>>>>>>>>>                                        
>>>>>>>>>>> Hi,
>>>>>>>>>>> We have set up a MySQL Cluster (pretty standard: two NDB nodes
>>>>>>>>>>> + two management servers).
>>>>>>>>>>> 2x cluster processing nodes (running the MySQL NDB daemon +
>>>>>>>>>>> MySQL server):
>>>>>>>>>>>   - 144GB RAM
>>>>>>>>>>>   - 8x 300GB SAS, RAID 10
>>>>>>>>>>>   - Data-Storage = 135GB RAM
>>>>>>>>>>>   - Index-Storage = 5GB RAM
>>>>>>>>>>> However, at 92 million rows in a table, it is returning the
>>>>>>>>>>> "table full" error.
>>>>>>>>>>>
>>>>>>>>>>> My config is as follows:
>>>>>>>>>>> [NDBD DEFAULT]
>>>>>>>>>>> NoOfReplicas=2
>>>>>>>>>>> LockPagesInMainMemory=1
>>>>>>>>>>>
>>>>>>>>>>> DataMemory=131G
>>>>>>>>>>> IndexMemory=10G
>>>>>>>>>>>
>>>>>>>>>>> TimeBetweenLocalCheckpoints=6
>>>>>>>>>>> NoOfFragmentLogFiles=500
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [MYSQLD DEFAULT]
>>>>>>>>>>>
>>>>>>>>>>> [NDB_MGMD DEFAULT]
>>>>>>>>>>>
>>>>>>>>>>> [TCP DEFAULT]
>>>>>>>>>>> SendBufferMemory=8M
>>>>>>>>>>> ReceiveBufferMemory=8M
>>>>>>>>>>>
>>>>>>>>>>> # Section for the cluster management node
>>>>>>>>>>> [NDB_MGMD]
>>>>>>>>>>> ID=1 #LB1 ID is 1
>>>>>>>>>>> Datadir=/var/lib/mysql-cluster
>>>>>>>>>>> HostName=#.#.#.# #PRIVATE IP OF LB1
>>>>>>>>>>>
>>>>>>>>>>> [NDB_MGMD]
>>>>>>>>>>> HostName=#.#.#.# #PRIVATE IP OF LB2
>>>>>>>>>>> ID=2 #ID of LB2 is 2
>>>>>>>>>>> Datadir=/var/lib/mysql-cluster
>>>>>>>>>>>
>>>>>>>>>>> # Section for the storage nodes
>>>>>>>>>>> [NDBD]
>>>>>>>>>>> # IP address of the first storage node
>>>>>>>>>>> HostName=#.#.#.# # PRIVATE IP OF DB1
>>>>>>>>>>> DataDir=/var/lib/mysql-cluster
>>>>>>>>>>>
>>>>>>>>>>> [NDBD]
>>>>>>>>>>> # IP address of the second storage node
>>>>>>>>>>> HostName=#.#.#.# #PRIVATE IP OF DB2
>>>>>>>>>>> DataDir=/var/lib/mysql-cluster
>>>>>>>>>>>
>>>>>>>>>>> # one [MYSQLD] per storage node
>>>>>>>>>>> [MYSQLD]
>>>>>>>>>>> [MYSQLD]
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Can anyone please shed some light on this matter?
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Karl Kloppenborg
>>>>>>>>>>> --
>>>>>>>>>>> MySQL Cluster Mailing List
>>>>>>>>>>> For list archives: http://lists.mysql.com/cluster
>>>>>>>>>>> To unsubscribe: http://lists.mysql.com/cluster?unsub=1
> 
>>
>>
>>   
>
>

Thread
MySQL Cluster -- Table Full (Karl Kloppenborg, 30 Sep)
  • Re: MySQL Cluster -- Table Full (Adam Dixon, 30 Sep)
    • Re: MySQL Cluster -- Table Full (Karl Kloppenborg, 30 Sep)
  • Re: MySQL Cluster -- Table Full (Jonas Oreland, 30 Sep)
    • Re: MySQL Cluster -- Table Full (Karl Kloppenborg, 30 Sep)
      • Re: MySQL Cluster -- Table Full (Jonas Oreland, 30 Sep)
        • Re: MySQL Cluster -- Table Full (Karl Kloppenborg, 30 Sep)
          • Re: MySQL Cluster -- Table Full (Jonas Oreland, 30 Sep)
            • Re: MySQL Cluster -- Table Full (Johan Andersson, 30 Sep)
              • Re: MySQL Cluster -- Table Full (Karl Kloppenborg, 30 Sep)
                • Re: MySQL Cluster -- Table Full (Johan Andersson, 30 Sep)
                  • Re: MySQL Cluster -- Table Full (Karl Kloppenborg, 1 Oct)
                    • Re: MySQL Cluster -- Table Full (Aaron Weller // Crucial Paradigm, 1 Oct)
                      • Re: MySQL Cluster -- Table Full (Johan Andersson, 1 Oct)
                        • Re: MySQL Cluster -- Table Full (Aaron Weller // Crucial Paradigm, 7 Oct)
                      • Re: MySQL Cluster -- Table Full (Johan Andersson, 1 Oct)
                      • Re: MySQL Cluster -- Table Full (Jon Stephens, 1 Oct)
  • Re: MySQL Cluster -- Table Full (Moses, 28 Apr)
    • Re: MySQL Cluster -- Table Full (Johan Andersson, 28 Apr)
  • Re: MySQL Cluster -- Table Full (Moses, 6 May)
Re: MySQL Cluster -- Table Full (Aaron Weller // Crucial Paradigm, 1 Oct)
  • Re: MySQL Cluster -- Table Full (Johan Andersson, 1 Oct)