Yes, the rows are in primary key order. However, each row contains
specific integer primary keys; I'm not inserting nulls into a table
where the primary key is auto increment, so I don't see why concurrent
inserts would fight for similar spots (although I'm admittedly not a
MySQL hotshot, so my assumption rests on a *hunch* only).
I'm not sure (yet) if a single-threaded operation would run into an
I/O bottleneck. I haven't run mysqlimport with --use-threads=1 yet
(will do if I have the time), but when I ran it with
--use-threads=4 the import (of a ~500 MB dump) took more time than
running four different processes (I split my tab delimited dumps
with split into four even pieces and imported those in four different
processes).
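For reference, the "four different processes" approach can be sketched
roughly like this. It is only a sketch: import_parallel and the
MYSQLIMPORT override variable are names I made up for illustration,
and it assumes each directory already holds one piece of the dump,
named after the target table.

```shell
#!/bin/sh
# import_parallel DB DIR...
# Starts one mysqlimport process per directory and waits for all of them.
# MYSQLIMPORT is a hypothetical override (not a mysqlimport feature) that
# lets you swap in another command, e.g. echo for a dry run.
import_parallel() {
    db=$1
    shift
    status=0
    pids=
    for dir in "$@"; do
        # Each background job is a separate client, so each one shows up
        # as its own connection in the processlist.
        "${MYSQLIMPORT:-mysqlimport}" --local "$db" "$dir"/*.txt &
        pids="$pids $!"
    done
    # Collect the exit status of every job; fail if any import failed.
    for pid in $pids; do
        wait "$pid" || status=1
    done
    return "$status"
}
```

Called as import_parallel db piece0 piece1 piece2 piece3, this runs
four imports side by side; setting MYSQLIMPORT=echo first only prints
the commands that would be run.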
Anyway, it seems that doing a simple import (from a dump which isn't
tab delimited, but contains complete or extended inserts) takes the
same amount of time as doing a mysqlimport with --use-threads=4, and
as it turns out, splitting my tab delimited dump is too complex to
handle gracefully because my data contains newline characters all
over the place, so I've dropped the idea of this whole mysqlimport
thing for now. (I'll try the method of migrating an InnoDB database to
an NDBCluster described here instead.)
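For anyone who wants to try the split anyway, here is a rough,
untested-against-real-data sketch. It assumes the default
mysqldump --tab escaping, where a newline inside a field is written as
a backslash followed by a newline and a literal backslash is doubled,
so a physical line ends a record only if it ends in an even number of
backslashes. split_records and its output layout are my own invention.

```shell
#!/bin/sh
# split_records FILE N OUTDIR
# Distributes the records of a tab delimited dump round-robin into
# OUTDIR/0/<name> ... OUTDIR/(N-1)/<name>, never cutting a record in two.
# Assumes default escaping: backslash-newline inside data, "\\" for a
# literal backslash.
split_records() {
    file=$1
    n=$2
    outdir=$3
    name=$(basename "$file")
    i=0
    while [ "$i" -lt "$n" ]; do
        mkdir -p "$outdir/$i"
        i=$((i + 1))
    done
    awk -v n="$n" -v outdir="$outdir" -v name="$name" '
    {
        # Continuation lines go to the same piece as the line before,
        # because rec only advances when a record is complete.
        out = outdir "/" (rec % n) "/" name
        print > out
        # Count the trailing backslashes on this physical line.
        k = 0
        len = length($0)
        while (k < len && substr($0, len - k, 1) == "\\") k++
        # An even count means the newline is a real record terminator.
        if (k % 2 == 0) rec++
    }' "$file"
}
```

Each piece keeps the original file name in its own directory, since
mysqlimport derives the table name from the file name. Rows stay in
their original relative order within each piece.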
If I have the time, I'll write up a bug report or a documentation
enhancement request for this.
Thanks for the input!
On Wed, Jul 25, 2012 at 6:49 PM, Rick James <rjames@stripped> wrote:
> I'm skeptical that use-threads can ever be very effective.
> What order are the rows in? They are probably in PRIMARY KEY order, which means that
> the INSERTing threads will be fighting over similar spots in the table.
> Is it I/O bound when it is single-threaded? If so, then there can't be any
> improvement with use-threads.
> Suggest you file a bug with bugs.mysql.com. If nothing else, the documentation
> should say more than it does.
>> -----Original Message-----
>> From: Róbert Kohányi [mailto:kohanyi.robert@stripped]
>> Sent: Tuesday, July 24, 2012 10:52 AM
>> To: mysql@stripped
>> Subject: mysqlimport --use-threads / mysqladmin processlist
>> I'm in the middle of migrating an InnoDB database to an NDBCluster. I
>> use mysqldump to first create two dumps, the first one contains only
>> the database schema, the second one contains only tab delimited data
>> (via mysqldump --tab). I edit my InnoDB schema here and there
>> (ENGINE=InnoDB to ENGINE=NDB, etc.) import it and after this I import
>> the InnoDB data *as is* using mysqlimport.
>> I use it like this:
>> mysqlimport --local --use-threads=4 db dir/*.txt
>> (dir of course contains the tab delimited data I dumped before.)
>> The import starts, and I check its progress via mysqladmin, like this:
>> mysqladmin --sleep=1 processlist
>> this is what I see: http://pastebin.com/raw.php?i=M23fWVjc
>> Only a single process seems to be loading my data. Is this what I
>> *should* see, or, in my case using 4 threads, should I see four
>> processes? I'm not asking which one will be faster, I'm just simply
>> confused because I don't know what to expect. If I start four different
>> mysqlimport processes, each one importing different files, then I can
>> see four different processes in the mysql processlist.
>> If it matters, here is my server version (I use the official
>> Server version: 5.5.25a-ndb-7.2.7-gpl MySQL Cluster Community Server
>> Kohányi Róbert