At 01:20 PM 4/14/2010, Carsten Pedersen wrote:
>Been there, done that. It's a maintenance nightmare.
Why is it a maintenance nightmare? I've been using this technique for a
couple of years to store large amounts of data, and it has been working just
fine. Each table holds one year of data, and I can go back and access 30
years' worth of data through either the individual tables or a merge table.
Loading data is also faster than inserting tens of millions of rows into a
single table, where the index can become unbalanced.
Your method is preferred if there is no way to separate the data into
distinct tables. But from the example provided, it appeared (at least to
me) that he was saving just one day's worth of data. If that's the case, my
method takes only milliseconds to get rid of the old data. Your
suggestion of flagging the rows as deleted and then removing them later
means more work for the database, and it needs a lull in activity to do the
actual delete. He will also have to optimize the table to reclaim the space
from the deleted rows, and that requires the table to be locked.
Like I said, both methods will work. Which one is best depends on how the
data is organized and how much downtime he can dedicate to the process.
>Another idea: Have a separate "deleted" table with the IDs of the rows
>that you consider deleted. Re-write your queries to do a
>left-join-not-in-the-other-table agains the "delete" table. Then, either
>wait for a maintenance window to delete the rows both the original table
>and the "delete" table. Or remove just a few rows at a time.
>The "deleted" table can be created with just a read lock on the original
>table, and since it's going to be fairly small, the impact of stuffing it
>with data is not going to be great.
>It's a bit of a hassle to set up, but once done you don't have to worry
>about creating and deleting tables every day.
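A sketch of the rewrite being proposed here, with illustrative table and
column names (the real schema isn't shown in the thread):

```sql
-- Record the IDs considered deleted; this table stays small:
INSERT INTO deleted_rows (id)
  SELECT id FROM log WHERE created < '2010-04-13 00:00:00';

-- Rewrite reads to skip those rows without touching the big table:
SELECT l.*
FROM log AS l
LEFT JOIN deleted_rows AS d ON d.id = l.id
WHERE d.id IS NULL;

-- At the maintenance window, remove the rows from both tables
-- with a multi-table DELETE:
DELETE l, d FROM log AS l JOIN deleted_rows AS d ON d.id = l.id;
```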
>>It looks like you only want to keep the current data, perhaps the current
>>day's worth, and delete the old data.
>>I would store the data in separate MyISAM tables, each table would
>>represent a date, like D20100413 and D20100414. Your program will decide
>>which table to insert the data into by creating a current date variable
>>and now all data gets inserted to the table named by that variable. When
>>the older data is no longer needed, just drop the table. If you want to
>>keep the last 7 days of data, create a merge table of the last 7 tables.
>>When you drop the oldest table, redefine the merge table. You can
>>accomplish all this in just milliseconds.
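Under the assumptions above (one MyISAM table per day, a 7-day MERGE table
over them), the daily rotation might be sketched like this; the column
definitions are placeholders, since the real schema isn't given:

```sql
-- Each day's data goes into its own MyISAM table, named by date:
CREATE TABLE D20100414 (
  id INT NOT NULL,
  created DATETIME NOT NULL,
  KEY (created)
) ENGINE=MyISAM;

-- A MERGE table presenting the last 7 daily tables as one:
CREATE TABLE last7days (
  id INT NOT NULL,
  created DATETIME NOT NULL,
  KEY (created)
) ENGINE=MERGE UNION=(D20100408, D20100409, D20100410, D20100411,
                      D20100412, D20100413, D20100414) INSERT_METHOD=LAST;

-- Daily rotation: drop the oldest table and redefine the window.
-- Both statements only touch metadata, hence the "milliseconds" claim:
DROP TABLE D20100408;
ALTER TABLE last7days UNION=(D20100409, D20100410, D20100411,
                             D20100412, D20100413, D20100414, D20100415);
```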
>>At 08:08 AM 4/14/2010, you wrote:
>>>I am using MySQL version 4.1.12-log. All the databases on it are using
>>>MyISAM database engine.
>>>Every day, I delete almost 90,000 rows from a table of 3,153,916 rows.
>>>To delete the rows, I use a request like this: "DELETE QUICK FROM [table]
>>>WHERE [column] < '2010-04-13 00:00:00' LIMIT 7500". I execute this request
>>>until all the rows are deleted.
>>>This works, but while the DELETE runs I can't access the table (to make
>>>INSERT and SELECT requests).
>>>How can I do a "DELETE" without impacting INSERT and SELECT requests made
>>>at the same time?