Hi!
>>>>> "Sasha" == Sasha Pachev <sasha@stripped> writes:
Sasha> "Yury V. Bukhman" wrote:
>>
>> Hi all!
>>
>> I am relatively new to MySQL: have been using it for a few months. The
>> application I am developing must scan databases of flat text files
>> (about 10,000 - 100,000 entries in separate files or a single large
>> file) into a relational database. I am writing perl programs that parse
>> the text files and use DBI to talk to MySQL and insert data into a set
>> of database tables. For a large set of texts the process can take many
>> hours. What are the ways to speed it up? I lock my tables. I do not
>> want to use multiple inserts, because a failure to insert one entry
>> should produce an error, but it should not prevent insertion of the
>> nearby entries. Would it help to drop all indexes before populating the
>> database and rebuild them afterwards? Would INSERT DELAYED help? Are
>> there any other tricks? Is there a way to test the speed of INSERT
>> statements quickly? I try to read manuals and mailing list archives,
>> but often my general ignorance gets in the way of fully understanding
>> them. I would appreciate any help.
>>
>> Yury
Sasha> - make sure that your data is clean before you do an insert -- that way
Sasha> you will never get an error trying to insert unless there is something
Sasha> wrong with your tables , connections, mysqld, etc ..
Sasha> - droping indeces will help quite a bit
The manual describes how you can drop indexes and create them very
fast later with isamchk..
Sasha> - not sure if INSERT DELAYED will be very much help if you are already
Sasha> locking the tables, and doing multiple inserts -- any thoughts, Monty?
If you are locking the table, neither INSERT DELAYED or multiple row
inserts will help that much.
Sasha> - run top as you are importing the data, and see how CPU time is
Sasha> distributed -- if your Perl script is using a significant portion of the
Sasha> CPU, re-write it in C or C++.
Sasha> - benchmarking is quite easy -- just time several thousand inserts and
Sasha> compute how many of them you do per second in your program.
To get more speed:
- If you have many indexes, increase the key buffer.
- If possible, use LOAD DATA INFILE.
- Create indexes at a later stage (see above)
Regards,
Monty