List: Cluster
From: Johan Andersson
Date: May 10 2012 6:14am
Subject: Re: NDBCluster Load Data Infile extremely slow
Hi Scott,

Linear - that is great to hear! The whole point of Cluster is to
parallelize work into jobs that are as short and simple as possible.

Best regards
Johan

On Wed, May 9, 2012 at 11:24 PM, Scott Sandler <ssandler@stripped> wrote:

> Hi Johan,
>
> Thanks for your reply. The recommendation of loading into innodb and then
> altering the storage engine to ndbcluster sounded promising, but I tested
> it (with the recommended ndb_batch_size=8*1024*1024) and it took about the
> same amount of time as just loading directly into ndb. The table does have
> two text columns but I can’t get rid of those. It doesn’t have any auto
> increment columns or indexes.
>
> Splitting the file with the GNU split utility and then kicking off a bunch
> of load data commands in parallel was definitely a lot faster. In fact, I
> found the load speed basically scaled linearly with the number of files,
> even when I went beyond the number of CPU cores available on the VMs. I
> only went up to 7 different files but I’ll try even more as well. I guess
> this is the way to go if there are no other options that can help speed
> this up.
>
> Scott
>
> *From:* Johan Andersson [mailto:johan@stripped]
> *Sent:* Wednesday, May 09, 2012 4:01 PM
> *To:* Scott Sandler
> *Cc:* cluster@stripped
> *Subject:* Re: NDBCluster Load Data Infile extremely slow
>
> Hi,
>
> * Split the load data into several files and load in parallel.
> * Avoid blob/text if you can.
> * Set ndb_batch_size.
> * If you have auto_increments, ndb_autoincrement_prefetch_sz makes a big
>   difference.
>
> See more here about the above:
>
> http://johanandersson.blogspot.se/2012/04/mysql-cluster-how-to-load-it-with-data.html
>
> Best regards,
> Johan
>
> Severalnines AB
>
> On Wed, May 9, 2012 at 9:47 PM, Scott Sandler <ssandler@stripped> wrote:
>
> I've found that the performance of LOAD DATA INFILE for an NDBCluster is
> over 130x slower than innodb. I expect it to be slower since it has to
> insert into multiple nodes and push data over the network, but the
> performance difference I'm seeing is 20,000 rows per second for innodb vs.
> 150 rows per second into ndbcluster (with the same schema/hardware/etc.).
> This slowness is quite a big roadblock to actually being able to migrate
> to MySQL Cluster.
>
> Here's a pastebin of my config.ini<http://pastebin.com/JhrjdXKH>, and
> another of my my.cnf<http://pastebin.com/9y1mY7zm>. Are there any
> parameters I can change or anything I can try to speed up the ndb data
> loading?
>
> Thanks,
>
> Scott
>
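
For readers who want to try the split-and-load-in-parallel approach discussed
in this thread, here is a minimal sketch in Python. It is not the script Scott
or Johan used. The function names (split_file, load_command, load_parallel)
and the chunk_NNN.txt file names are hypothetical. It assumes one row per line
(so text columns must not contain raw newlines), the default LOAD DATA field
and line terminators, a mysql client on the PATH with credentials configured
elsewhere, and a server that permits LOCAL INFILE.

```python
import subprocess
from pathlib import Path


def split_file(src, parts, out_dir):
    """Distribute the lines of src round-robin into `parts` chunk files.

    Assumes each row occupies exactly one line of the input file.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = [out_dir / f"chunk_{i:03d}.txt" for i in range(parts)]
    handles = [p.open("wb") for p in paths]
    try:
        # Binary mode keeps the original bytes and line endings untouched.
        with open(src, "rb") as f:
            for n, line in enumerate(f):
                handles[n % parts].write(line)
    finally:
        for h in handles:
            h.close()
    return paths


def load_command(chunk, database, table):
    """Build the mysql client invocation that loads one chunk.

    The path and table name are interpolated naively; this is only
    suitable for trusted, simple names.
    """
    path = Path(chunk).resolve().as_posix()
    sql = f"LOAD DATA LOCAL INFILE '{path}' INTO TABLE {table}"
    return ["mysql", "--local-infile=1", f"--database={database}",
            f"--execute={sql}"]


def load_parallel(chunks, database, table):
    """Start one mysql client per chunk, then wait for all of them.

    Returns the list of exit codes, in chunk order.
    """
    procs = [subprocess.Popen(load_command(c, database, table))
             for c in chunks]
    return [p.wait() for p in procs]
```

A call such as load_parallel(split_file("data.txt", 7, "chunks"), "mydb",
"mytable") would start seven concurrent loads, matching the seven files Scott
tried; the file, database, and table names here are placeholders. The
round-robin split keeps the chunks the same size to within one row, so no
single loader is left running long after the others finish.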
