From: Johan Andersson
Date: May 10 2012 6:14am
Subject: Re: NDBCluster Load Data Infile extremely slow
List-Archive: http://lists.mysql.com/cluster/8323

Hi Scott,

Linear - that is great to hear!
The whole thing about cluster is to parallelize things in as short and
simple jobs as possible.

Best regards
Johan

On Wed, May 9, 2012 at 11:24 PM, Scott Sandler wrote:

> Hi Johan,
>
> Thanks for your reply. The recommendation of loading into innodb and then
> altering the storage engine to ndbcluster sounded promising, but I tested
> it (with the recommended ndb_batch_size=8*1024*1024) and it took about the
> same amount of time as just loading directly into ndb. The table does have
> two text columns but I can't get rid of those. It doesn't have any auto
> increment columns or indexes.
>
> Splitting using the GNU split utility and then kicking off a bunch of load
> data commands in parallel was definitely a lot faster, in fact I found the
> runtime basically scaled linearly with the number of files even when I went
> beyond the number of CPU cores available on the VMs. I only went up to 7
> different files but I'll try even more as well. I guess this is the way to
> go if there are no other options that can help speed this up.
>
> Scott
>
> *From:* Johan Andersson [mailto:johan@stripped]
> *Sent:* Wednesday, May 09, 2012 4:01 PM
> *To:* Scott Sandler
> *Cc:* cluster@stripped
> *Subject:* Re: NDBCluster Load Data Infile extremely slow
>
> Hi,
>
> * Split the load data into several files and load in parallel.
> * avoid blob/text if you can
> * set ndb_batch_size
> * if you have auto_increments, ndb_autoincrement_prefetch_sz makes a big
>   difference
>
> See more here about the above:
>
> http://johanandersson.blogspot.se/2012/04/mysql-cluster-how-to-load-it-with-data.html
>
> Best regards,
> Johan
> Severalnines AB
>
> On Wed, May 9, 2012 at 9:47 PM, Scott Sandler wrote:
>
> I've found that the performance of LOAD DATA INFILE for an NDBCluster is
> over 130x slower than innodb. I expect it to be slower since it has to
> insert into multiple nodes and push data over the network, but the
> performance difference I'm seeing is 20,000 rows per second for innodb vs.
> 150 rows per second into ndbcluster (with the same schema/hardware/etc.).
> This slowness is quite a big road block in actually being able to migrate
> to MySQL cluster.
>
> Here's a pastebin of my config.ini, and another of my my.cnf. Are there any
> parameters I can change or anything I can try to speed up the ndb data
> loading?
>
> Thanks,
>
> Scott
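
For reference, the split-and-load-in-parallel approach discussed in this thread could be sketched roughly as follows. This is only an illustration, not what Scott actually ran (he used the GNU split utility). The helper names (split_by_lines, load_statement, load_in_parallel) and the run_sql callable are hypothetical; run_sql stands in for whatever client you use and is assumed to open its own connection per call, since statements sharing one session would not run concurrently.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, List


def split_by_lines(path: str, parts: int, out_dir: str) -> List[str]:
    """Split a text file into at most `parts` chunk files on line boundaries."""
    # First pass: count lines so the chunks come out roughly equal in size.
    with open(path, "rb") as src:
        total = sum(1 for _ in src)
    per_chunk = max(1, -(-total // parts))  # ceiling division

    chunk_paths: List[str] = []
    # Second pass: stream the lines out, per_chunk lines per file.
    with open(path, "rb") as src:
        while True:
            block = list(islice(src, per_chunk))
            if not block:
                break
            chunk_path = os.path.join(out_dir, f"chunk_{len(chunk_paths):03d}.txt")
            with open(chunk_path, "wb") as dst:
                dst.writelines(block)
            chunk_paths.append(chunk_path)
    return chunk_paths


def load_statement(chunk_path: str, table: str) -> str:
    """Build one LOAD DATA INFILE statement (no escaping; assumes simple paths)."""
    return f"LOAD DATA INFILE '{chunk_path}' INTO TABLE {table}"


def load_in_parallel(chunk_paths: List[str], table: str,
                     run_sql: Callable[[str], None], workers: int) -> List[str]:
    """Run one load statement per chunk, up to `workers` at a time."""
    statements = [load_statement(p, table) for p in chunk_paths]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any exception from run_sql.
        list(pool.map(run_sql, statements))
    return statements
```

The number of chunks and workers is something to tune by experiment; the thread only reports results for up to 7 files.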