List:General Discussion« Previous MessageNext Message »
From:Michael Widenius Date:September 16 1999 8:10pm
Subject:Need Procedure Suggestions
View as plain text  
Hi!

>>>>> "Bob" == Bob Bowker <bowker@stripped> writes:

Bob> Hi --
Bob> We have a client on a dedicated PIII/450/256 RAM, php3, mysql ... the data 
Bob> we use is issued as 2 zipped text files every 7 days - about 250,000 
Bob> fixed-length records.  The problem I'm facing is how to get the data into 
Bob> MySQL tables with any kind of efficiency ... and automatically, as 
Bob> untouched by human hands as possible ...

Bob> One major fact-of-life: the Apache setup on the server is fixed - even 
Bob> though it's a dedicated box, the ISP has standardized the setup, and 
Bob> doesn't accommodate individual peculiarities.  As I said, for now, a 
Bob> fact-of-life ...

Bob> We created a php3 script which unzipped the 2 files, then read them in one 
Bob> line at a time, breaking out each record into the fields we need, and doing 
Bob> an INSERT into the MySQL tables.  The system handled about 8000 lines, then 
Bob> timed out.  We put an 8000 line limit on things, and could probably chain 
Bob> several executions of the script, but that seems kludgy, and certainly 
Bob> leaves the door open for a lot of things to go wrong.

Did you try to use multi-line inserts;  This is MUCH faster than
doing a lot of single line inserts.

Bob> To get around the time-out problem, we're considering a series of php 
Bob> scripts each of which hands a single instruction to the MySQL engine then 
Bob> terminates, with cron jobs doing checks every x minutes until the previous 
Bob> step is done and then issuing their single instruction, etc.  Again, it 
Bob> seems like a series of problems waiting to happen ...

Bob> Another suggested scenario is to read all the records into a temp table 
Bob> using READ DATA INFILE, then executing a single complex MySQL command which 
Bob> parses each fixed-length record using SUBSTRING() before INSERTing into the 
Bob> main table.  Since it's now 2 MySQL commands we could issue using cron 
Bob> jobs, we have bypassed the time-out issue completely.

Why do you have to do substrings when inserting this in the main
table?  Isn't it easier to have the temporary table of the same format 
as the main table ?

Bob> Since the site will be active through all this, we plan to do all this with 
Bob> temporary tables, and when it's done, rename the new temp tables to replace 
Bob> the production tables.  However, we need to make sure that these weekly 
Bob> updates (even though we can cron them at 2am Sundays) don't slow the site 
Bob> itself to 8086 speeds ...

Bob> One other issue is the index entries on the main table, temp or production 
Bob> ... INSERTs take forever if the table is indexed before being populated, 
Bob> but adding 1 index to a table of 250,000 records takes an hour.  Which 
Bob> devil do we dance with?

The MySQL manual includes some tricks you can use to create the indexed
afterwards with isamchk

Another option is to increase your key_buffer a LOT.

A third option (in MySQL 3.23) is to create your table without
compressed key indexes;  CREATE TABLE ... pack_keys=0.  This makes
index manipulation much faster if you have CHAR keys (at the expensive 
of disk).

In MySQL 3.23.3 you can also use the new FAST delay_key_write=1 table option!
This will only flush key data to disk when the file is closed!
(This means of course that if MySQL crashes, you MUST run myisamchk -r
on this table!)

Bob> I'm sure we're not the only/first ones to face these issues, and I'm hoping 
Bob> someone else can share the solutions they have implemented - or ideas, or 
Bob> suggestions ...

Regards,
Monty
Thread
Need Procedure SuggestionsBob Bowker13 Sep
  • Re: Need Procedure SuggestionsJames Manning13 Sep
  • Need Procedure SuggestionsMichael Widenius17 Sep
Re: Need Procedure SuggestionsJames Manning13 Sep