From: Rick James Date: October 25 2012 8:54pm Subject: RE: Help with purging old logs for each customer ID List-Archive: http://lists.mysql.com/mysql/228491 Message-Id: <2E7DD7ADE53B044C8C8BCD9C5829E1EB1490E79E52@SP2-EX07VS01.ds.corp.yahoo.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable If the 90 days is back from MAX(created_on) for a given customer... INDEX(customer_id, created_on) will probably be needed. And that should replace KEY `customers_id` (`cust= omer_id`). Maybe... DELETE FROM customers_log AS a WHERE a.customer_log_id >=3D @left_off AND a.customer_log_id < @z AND a.created_on <=20 ( SELECT MAX(created_on) FROM customers_log WHERE customer_id =3D a.customer_id ) - INTERVAL 90 DAY ); (Since this has the subquery, I would do only 100 at a time, not 1000) Or... CREATE TEMPORARY TABLE tmp SELECT customer_id, MAX(created_on) - INTERVAL 90 DAY AS cutoff FROM customers_log GROUP BY customer_id; DELETE FROM customers_log AS a JOIN tmp ON a.customer_id =3D tmp.customer_id WHERE a.customer_log_id >=3D @left_off AND a.customer_log_id < @z AND a.created_on < tmp.cutoff; If you have millions of rows, a delete without some kind of loop is asking = for trouble. Or... Turning things around to base it on customers... Loop through customer_ids (yeah, you did not want to do this) SELECT @id :=3D 0; StartLoop: SELECT @id :=3D customer_id WHERE customer_id > @id ORDER BY customer_i= d LIMIT 1; if @id is NULL, exit DELETE FROM customers_log AS a WHERE a.customer_id =3D @id AND a.created_on <=20 ( SELECT MAX(created_on) FROM customers_log WHERE customer_id =3D @id ) - INTERVAL 90 DAY );=20 EndLoop. Since there is no rush for the purging, there is little need to optimize it= other than to keep it from interfering with other queries. To that end, t= he compound index I propose is important. > -----Original Message----- > From: Daevid Vincent [mailto:daevid@stripped] > Sent: Thursday, October 25, 2012 1:33 PM > To: Rick James; mysql@stripped > Subject: RE: Help with purging old logs for each customer ID >=20 > Well, the customer_id is relevant in that I want the last 90 days > relative to each customer. >=20 > customer_id =3D 123 might have logs from jan - mar customer_id =3D 444 > might have logs from feb - may >=20 > So it's a rolling log of THEIR last 90 days from their last log (most > recent) back 90 days from there. >=20 > Does that make more sense? >=20 > I guess I was trying to avoid looping over every customer ID and > computing if I could help it. I thought by using a GROUP BY or > something it could group all the logs for a given customer and then > trim them that way. >=20 > But maybe brute force is the way to go? >=20 > > -----Original Message----- > > From: Rick James [mailto:rjames@stripped] > > Sent: Thursday, October 25, 2012 1:09 PM > > To: Daevid Vincent; mysql@stripped > > Subject: RE: Help with purging old logs for each customer ID > > > > Off hand, I would iterate over the PRIMARY KEY, looking at a thousand > > rows at a time, DELETEing any that need to be purged. I would use a > > Perl or > PHP > > loop, or write a stored procedure. More discussion of "huge deletes" > (which > > this _could_ be): > > http://mysql.rjweb.org/doc.php/deletebig > > (PARTITIONing does not apply in your case, as I understand it.) > > > > I like the loop: > > > > SELECT @z :=3D customer_log_id FROM customers_log LIMIT 1000, 1; > > DELETE FROM customers_log > > WHERE customer_log_id >=3D @left_off AND customer_log_id < @z > > AND created_on < NOW() - INTERVAL 90 DAY; > > sleep a few seconds (to be a nice guy) > > > > Plus code to take care of iterating and terminating. > > > > That loop could be done continually. > > > > It seems that customer_id is irrelevant?? > > > > > -----Original Message----- > > > From: Daevid Vincent [mailto:daevid@stripped] > > > Sent: Thursday, October 25, 2012 11:46 AM > > > To: mysql@stripped > > > Subject: Help with purging old logs for each customer ID > > > > > > I have a customer log table that is starting to rapidly fill up (we > > > have hundreds of thousands of users, but many are transient, and > use > > > the service for a few months, or use the free trial and quit, etc.) > > > > > > CREATE TABLE `customers_log` ( > > > `customer_log_id` bigint(20) unsigned NOT NULL auto_increment, > > > `customer_id` int(10) unsigned default '0', > > > `created_on` timestamp NOT NULL default CURRENT_TIMESTAMP on > > > update CURRENT_TIMESTAMP, > > > `type` > > > > enum('View','Action','Admin','Search','Login','Logout','Access','Gen > > > era > > > l','A > > > PI'), > > > `source` enum('web','mobile','system'), > > > `body` text, > > > PRIMARY KEY (`customer_log_id`), > > > KEY `created_on` (`created_on`), > > > KEY `customers_id` (`customer_id`) > > > ) ENGINE=3DInnoDB > > > > > > What I'd like to do now is make a 'rolling log' in that I want to > > > DELETE any entries older than 90 days for EACH `customer_id`. > > > > > > I'm not sure how to do that in a query? I'd rather not iterate over > > > each customer_id if I can help it. > > > > > > > > > > > > > > > -- > > > MySQL General Mailing List > > > For list archives: http://lists.mysql.com/mysql > > > To unsubscribe: http://lists.mysql.com/mysql > > > > > > -- > > MySQL General Mailing List > > For list archives: http://lists.mysql.com/mysql > > To unsubscribe: http://lists.mysql.com/mysql