Hi,
and also check
http://jcole.us/blog/archives/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/
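
On Aaron's first question below (how to tell when the tablespace is
nearly full), something like the following might help. It is only a
sketch, not verified against this cluster: it assumes the disk data
files are listed in INFORMATION_SCHEMA.FILES on your version.

```sql
-- Free vs. total extents for each NDB disk-data file.
-- Expect one row per data file per data node; the node id is in EXTRA.
SELECT TABLESPACE_NAME,
       FILE_NAME,
       EXTENT_SIZE,
       TOTAL_EXTENTS,
       FREE_EXTENTS,
       ROUND(100 * FREE_EXTENTS / TOTAL_EXTENTS, 1) AS pct_free,
       EXTRA
FROM INFORMATION_SCHEMA.FILES
WHERE ENGINE = 'ndbcluster'
  AND FILE_TYPE = 'DATAFILE';
```

FREE_EXTENTS getting close to 0 is the point where you would expect
error 1601 ("Out extents, tablespace full").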
BR
johan
Aaron Weller // Crucial Paradigm wrote:
> Hey Johan and Jonas,
>
> Thanks for your help so far! I've been working with Karl on these
> issues.
>
> We ended up taking your suggestions and did the following:
>
> * Ran NDB initial on both nodes
> * Re-created tablespace using a single 500GB file
> * Re-created undo log using a single 50GB file
> * Set max rows to 650 million
>
> When we started restoring databases of anything larger than a few
> thousand records, or when we used mysqlslap to hit the database with a
> decent number of queries, it would crash after a few seconds and we
> would get an error like this in the logs:
>
> ---------------------------------
> 2010-09-30 11:39:22 [ndbd] INFO -- Node 4 killed this node because
> GCP stop was detected
> 2010-09-30 11:39:22 [ndbd] INFO -- NDBCNTR (Line: 274) 0x00000006
> 2010-09-30 11:39:22 [ndbd] INFO -- Error handler shutting down system
> 2010-09-30 11:39:22 [ndbd] INFO -- Error handler shutdown
> completed - exiting
> Time: Thursday 30 September 2010 - 11:39:22
> Status: Temporary error, restart node
> Message: System error, node killed during node restart by other node
> (Internal error, programming error or missing error message, please
> report a bug)
> Error: 2303
> Error data: Node 4 killed this node because GCP stop was detected
> Error object: NDBCNTR (Line: 274) 0x00000006
> Program: /usr/bin/ndbd
> Pid: 31222
> Version: mysql-5.1.47 ndb-7.1.5
> Trace: /var/lib/mysql-cluster/ndb_4_trace.log.7
> ***EOM***
>
> 2010-09-30 11:39:27 [ndbd] ALERT -- Node 4: Forced node shutdown
> completed. Caused by error 2303: 'System error, node killed during
> node restart by other node(Internal error, programming error or
> missing error message, please report a bug). Temporary error, restart
> node'.
> ---------------------------------
>
> Doing some research I found this regarding the GCP stop errors:
>
> http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-ndbd-definition.html#mysql-cluster-ndbd-definition-gcp-stop-errors
>
>
>
> And I also found this:
>
> http://dev.mysql.com/doc/refman/5.1/en/mysql-cluster-ndbd-definition.html#ndbparam-ndbd-diskiothreadpool
>
>
>
>
> Unfortunately, due to an extremely tight deadline and having to put
> the server in production 6 hours ago, we had to revert to the more
> stable config of not using a tablespace file. This, however, means we
> are limited in the amount of data we can put in our tables.
> I have a few questions, if anyone can help?
>
> * Is there any command other than "all report mem" for checking
> the actual amount of RAM being used by a table? We have noticed that
> we get "table full" when we have plenty of RAM left but no
> tablespace. However, we don't know how to check when we are
> reaching that point.
>
> * Any ideas on how to fix the issue with NDB crashing when we add a
> larger tablespace file? And without causing any downtime while doing
> it? There are some things about DiskIOThreadPool, however I'm not sure
> what our ideal values would be for our dataset, etc.
>
> Thanks!
> Aaron
>
> Karl Kloppenborg wrote:
>> Hi guys,
>>
>> In light of the continual problems we face, Aaron will be taking over
>> this so I can continue testing...
>>
>> Thanks for helping guys!
>>
>> Cheers,
>> Karl Kloppenborg.
>>
>>
>> On 30/09/2010, at 17:31, Johan Andersson wrote:
>>
>>
>>> Hi,
>>>
>>> You are running out of extents --> add a data file.
>>> Then, if you are going to load a lot of records (>100M), you need
>>> to set max_rows too.
>>> Under the hood, max_rows makes the data nodes create more fragments.
>>> With many rows, more fragments than the default are needed to store
>>> the primary key hash index.
>>>
>>> BR
>>> johan
>>>
>>>
>>> Karl Kloppenborg wrote:
>>>
>>>> Hi Johan,
>>>>
>>>> Can you please explain that?
>>>>
>>>> I am a bit confused....
>>>>
>>>> Cheers,
>>>> Karl.
>>>> On 30/09/2010, at 16:51, Johan Andersson wrote:
>>>>
>>>>
>>>>
>>>>> Karl,
>>>>> Just to summarize:
>>>>> - You must do ALTER TS ADD DATAFILE _and_ set max_rows.
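>>>>>
>>>>> A rough sketch of those two statements, for illustration only. The
>>>>> tablespace name ts_1, the file name data_2.dat and the 10G size are
>>>>> placeholders, not values from this cluster; 600000000 is the row
>>>>> target mentioned earlier in the thread.
>>>>>
>>>>> ```sql
>>>>> -- Add a second data file to an existing NDB tablespace
>>>>> -- (ts_1 and data_2.dat are hypothetical names).
>>>>> ALTER TABLESPACE ts_1
>>>>>     ADD DATAFILE 'data_2.dat'
>>>>>     INITIAL_SIZE 10G
>>>>>     ENGINE NDBCLUSTER;
>>>>>
>>>>> -- Hint the expected row count so more fragments are created.
>>>>> ALTER TABLE my_ndb_awesome_large_table MAX_ROWS = 600000000;
>>>>> ```
>>>>>
>>>>> Changing MAX_ROWS on an existing table will most likely rebuild
>>>>> (copy) the table, so plan for that on a large table.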
>>>>>
>>>>> BR
>>>>> johan
>>>>>
>>>>> Jonas Oreland wrote:
>>>>>
>>>>>> On 09/30/10 07:27, Karl Kloppenborg wrote:
>>>>>>
>>>>>>> Hey Jonas,
>>>>>>>
>>>>>>> Thanks for the reply, we will try to implement max_rows after
>>>>>>> the rebuild takes place. However, I have a few questions:
>>>>>>>
>>>>>>> 1) After doing some writes and getting the table full, I executed
>>>>>>> SHOW WARNINGS:
>>>>>>>
>>>>>>> +-------+------+--------------------------------------------------------+
>>>>>>> | Level | Code | Message                                                |
>>>>>>> +-------+------+--------------------------------------------------------+
>>>>>>> | Error | 1296 | Got error 1601 'Out extents, tablespace full' from NDB |
>>>>>>> | Error | 1114 | The table 'my_ndb_awesome_large_table' is full         |
>>>>>>> +-------+------+--------------------------------------------------------+
>>>>>>> 2 rows in set (0.00 sec)
>>>>>>>
>>>>>> 1601 means data-on-disk, if I'm not mistaken.
>>>>>>
>>>>>> ndbd's tablespaces don't auto-grow.
>>>>>> "alter tablespace X add datafile Y initial_size=10G" or something
>>>>>> like that should do it.
>>>>>>
>>>>>> /Jonas
>>>>>>
>>>>>>
>>>>>>> Could you explain this?
>>>>>>>
>>>>>>> 2) I will try implementing max_rows..
>>>>>>>
>>>>>>> 3) We are not using ndbmtd, but each server has 16 cores / 144GB.
>>>>>>> Should we?
>>>>>>>
>>>>>>> Thanks!
>>>>>>> Karl Kloppenborg.
>>>>>>>
>>>>>>> On 30/09/2010, at 15:19, Jonas Oreland wrote:
>>>>>>>
>>>>>>>
>>>>>>>> On 09/30/10 07:13, Karl Kloppenborg wrote:
>>>>>>>>
>>>>>>>>> Hi guys,
>>>>>>>>>
>>>>>>>>> We initially didn't, but we got this problem at 92 million rows.
>>>>>>>>>
>>>>>>>>> After a lot of research we found a post stating that increasing
>>>>>>>>> max rows might help. However, after reading more on max_rows
>>>>>>>>> with NDB, it seems that it is not used by the NDBcluster engine
>>>>>>>>> and is ignored. Is this true?
>>>>>>>>>
>>>>>>>> 1) after you get "table full", issue "show warnings"; this will
>>>>>>>> show you the exact error code
>>>>>>>>
>>>>>>>> 2) maxrows *should* help
>>>>>>>>
>>>>>>>> 3) are you using ndbmtd ?
>>>>>>>>
>>>>>>>> /Jonas
>>>>>>>>
>>>>>>>>
>>>>>>>>> However, take note that we require it to hold 600 million
>>>>>>>>> rows... challenge..
>>>>>>>>>
>>>>>>>>> I will also add my create table syntax to show you what we're
>>>>>>>>> doing.
>>>>>>>>>
>>>>>>>>> CREATE TABLE `my_ndb_awesome_large_table` (
>>>>>>>>> `user_id` int(4) NOT NULL,
>>>>>>>>> `description` varchar(50) NOT NULL,
>>>>>>>>> `type` varchar(64) NOT NULL,
>>>>>>>>> `count` int(4) NOT NULL,
>>>>>>>>> `after` int(3) NOT NULL,
>>>>>>>>> `active` int(1) NOT NULL,
>>>>>>>>> `lastactivity` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP
>>>>>>>>> ON UPDATE CURRENT_TIMESTAMP
>>>>>>>>> ) ENGINE=ndbcluster DEFAULT CHARSET=utf8
>>>>>>>>>
>>>>>>>>> Any thoughts on what this virtual "level" might be? Because, as
>>>>>>>>> you can see in my last email, we have not run out of index /
>>>>>>>>> data space.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 30/09/2010, at 14:59, Jonas Oreland wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Are you using "maxrows" in your table definition?
>>>>>>>>>>
>>>>>>>>>> /Jonas
>>>>>>>>>>
>>>>>>>>>> On 09/30/10 06:15, Karl Kloppenborg wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>> We have set up a MySQL cluster (pretty standard: two NDB nodes
>>>>>>>>>>> + two management servers)
>>>>>>>>>>> 2x cluster processing nodes (running the MySQL NDB daemon +
>>>>>>>>>>> MySQL server)
>>>>>>>>>>> - 144GB RAM
>>>>>>>>>>> - 8x 300GB SAS - RAID 10
>>>>>>>>>>> - Data-Storage = 135GB RAM
>>>>>>>>>>> - Index-Storage = 5GB RAM
>>>>>>>>>>> However, at 92 million rows in a table, it is returning the
>>>>>>>>>>> TableFull error?
>>>>>>>>>>>
>>>>>>>>>>> My config is as follows:
>>>>>>>>>>> [NDBD DEFAULT]
>>>>>>>>>>> NoOfReplicas=2
>>>>>>>>>>> LockPagesInMainMemory=1
>>>>>>>>>>>
>>>>>>>>>>> DataMemory=131G
>>>>>>>>>>> IndexMemory=10G
>>>>>>>>>>>
>>>>>>>>>>> TimeBetweenLocalCheckpoints=6
>>>>>>>>>>> NoOfFragmentLogFiles=500
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> [MYSQLD DEFAULT]
>>>>>>>>>>>
>>>>>>>>>>> [NDB_MGMD DEFAULT]
>>>>>>>>>>>
>>>>>>>>>>> [TCP DEFAULT]
>>>>>>>>>>> SendBufferMemory=8M
>>>>>>>>>>> ReceiveBufferMemory=8M
>>>>>>>>>>>
>>>>>>>>>>> # Section for the cluster management node
>>>>>>>>>>> [NDB_MGMD]
>>>>>>>>>>> ID=1 #LB1 ID is 1
>>>>>>>>>>> Datadir=/var/lib/mysql-cluster
>>>>>>>>>>> HostName=#.#.#.# #PRIVATE IP OF LB1
>>>>>>>>>>>
>>>>>>>>>>> [NDB_MGMD]
>>>>>>>>>>> HostName=#.#.#.# #PRIVATE IP OF LB2
>>>>>>>>>>> ID=2 #ID of LB2 is 2
>>>>>>>>>>> Datadir=/var/lib/mysql-cluster
>>>>>>>>>>>
>>>>>>>>>>> # Section for the storage nodes
>>>>>>>>>>> [NDBD]
>>>>>>>>>>> # IP address of the first storage node
>>>>>>>>>>> HostName=#.#.#.# # PRIVATE IP OF DB1
>>>>>>>>>>> DataDir=/var/lib/mysql-cluster
>>>>>>>>>>>
>>>>>>>>>>> [NDBD]
>>>>>>>>>>> # IP address of the second storage node
>>>>>>>>>>> HostName=#.#.#.# #PRIVATE IP OF DB2
>>>>>>>>>>> DataDir=/var/lib/mysql-cluster
>>>>>>>>>>>
>>>>>>>>>>> # one [MYSQLD] per storage node
>>>>>>>>>>> [MYSQLD]
>>>>>>>>>>> [MYSQLD]
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Can anyone please shed some light on this matter?
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>> Karl Kloppenborg
>>>>>>>>>>> --
>>>>>>>>>>> MySQL Cluster Mailing List
>>>>>>>>>>> For list archives: http://lists.mysql.com/cluster
>>>>>>>>>>> To unsubscribe: http://lists.mysql.com/cluster?unsub=1