>>>>> "Bob" == Bob Bowker <bowker@stripped> writes:
Bob> Hi --
Bob> We have a client on a dedicated PIII/450/256 RAM, php3, mysql ... the data
Bob> we use is issued as 2 zipped text files every 7 days - about 250,000
Bob> fixed-length records. The problem I'm facing is how to get the data into
Bob> MySQL tables with any kind of efficiency ... and automatically, as
Bob> untouched by human hands as possible ...
Bob> One major fact-of-life: the Apache setup on the server is fixed - even
Bob> though it's a dedicated box, the ISP has standardized the setup, and
Bob> doesn't accommodate individual peculiarities. As I said, for now, a
Bob> fact-of-life ...
Bob> We created a php3 script which unzipped the 2 files, then read them in one
Bob> line at a time, breaking out each record into the fields we need, and doing
Bob> an INSERT into the MySQL tables. The system handled about 8000 lines, then
Bob> timed out. We put an 8000 line limit on things, and could probably chain
Bob> several executions of the script, but that seems kludgy, and certainly
Bob> leaves the door open for a lot of things to go wrong.
Did you try using multi-row inserts (one INSERT statement with many VALUES
lists)? This is MUCH faster than doing a lot of single-row inserts.
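For example (table and column names invented for illustration):

```sql
-- One statement carrying many rows: one parse and one round trip
-- instead of thousands of individual INSERTs.
INSERT INTO records (acct, name, amount) VALUES
  ('0000000001', 'aaa', 10),
  ('0000000002', 'bbb', 20),
  ('0000000003', 'ccc', 30);
```

Keep each statement under max_allowed_packet; a few hundred rows per
statement is usually a safe batch size.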
Bob> To get around the time-out problem, we're considering a series of php
Bob> scripts each of which hands a single instruction to the MySQL engine then
Bob> terminates, with cron jobs doing checks every x minutes until the previous
Bob> step is done and then issuing their single instruction, etc. Again, it
Bob> seems like a series of problems waiting to happen ...
Bob> Another suggested scenario is to read all the records into a temp table
Bob> using LOAD DATA INFILE, then executing a single complex MySQL command which
Bob> parses each fixed-length record using SUBSTRING() before INSERTing into the
Bob> main table. Since it's now 2 MySQL commands we could issue using cron
Bob> jobs, we have bypassed the time-out issue completely.
Why do you have to do SUBSTRING() calls when inserting this into the main
table? Isn't it easier to make the temporary table the same format as the
main table?
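A sketch of that approach (file path, table, and column names invented; this
relies on the documented LOAD DATA INFILE behavior that when FIELDS
TERMINATED BY and ENCLOSED BY are both the empty string, a fixed-row format
is used and columns are split on their display widths):

```sql
-- Staging table with the same layout as the production table; the CHAR
-- widths must match the fixed-length fields in the text file exactly.
CREATE TABLE records_new (
  acct   CHAR(10) NOT NULL,
  name   CHAR(30) NOT NULL,
  amount CHAR(12) NOT NULL
);

-- Empty TERMINATED BY / ENCLOSED BY => fixed-row format, so the server
-- splits each line on the column widths itself; no SUBSTRING() pass needed.
LOAD DATA INFILE '/tmp/weekly.txt' INTO TABLE records_new
  FIELDS TERMINATED BY '' ENCLOSED BY '';
```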
Bob> Since the site will be active through all this, we plan to do all this with
Bob> temporary tables, and when it's done, rename the new temp tables to replace
Bob> the production tables. However, we need to make sure that these weekly
Bob> updates (even though we can cron them at 2am Sundays) don't slow the site
Bob> itself to 8086 speeds ...
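The swap itself can be done in one atomic statement with RENAME TABLE
(available in MySQL 3.23; table names are illustrative), so readers always
see either the old table or the new one, never a missing one:

```sql
RENAME TABLE records TO records_old, records_new TO records;
DROP TABLE records_old;
```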
Bob> One other issue is the index entries on the main table, temp or production
Bob> ... INSERTs take forever if the table is indexed before being populated,
Bob> but adding 1 index to a table of 250,000 records takes an hour. Which
Bob> devil do we dance with?
The MySQL manual includes some tricks you can use to create the indexes
afterwards with isamchk.
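The procedure from the manual looks roughly like this (paths and table name
invented; for MyISAM tables in 3.23, substitute myisamchk):

```shell
# Make mysqld close the table files first.
mysqladmin flush-tables

# Quick rebuild with index updating turned off.
isamchk --keys-used=0 -rq /var/lib/mysql/db/records

# ... bulk-load the 250,000 rows with LOAD DATA INFILE here ...

# Re-create all indexes in one sorted pass, then flush again.
isamchk -rq /var/lib/mysql/db/records
mysqladmin flush-tables
```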
Another option is to increase your key_buffer a LOT.
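For example, in my.cnf (the 64M figure is just an illustration; tune it to
your 256M of RAM):

```
[mysqld]
set-variable = key_buffer=64M
```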
A third option (in MySQL 3.23) is to create your table without
compressed key indexes: CREATE TABLE ... pack_keys=0. This makes
index manipulation much faster if you have CHAR keys (at the expense
of somewhat more disk space).
In MySQL 3.23.3 you can also use the new delay_key_write=1 table option!
This will only flush key data to disk when the file is closed!
(This means of course that if MySQL crashes, you MUST run myisamchk -r
on this table!)
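Both table options go at the end of the CREATE TABLE (table and columns
invented for illustration):

```sql
CREATE TABLE records (
  acct CHAR(10) NOT NULL,
  name CHAR(30) NOT NULL,
  PRIMARY KEY (acct)
) PACK_KEYS=0 DELAY_KEY_WRITE=1;
```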
Bob> I'm sure we're not the only/first ones to face these issues, and I'm hoping
Bob> someone else can share the solutions they have implemented - or ideas, or
Bob> suggestions ...