List:General Discussion« Previous MessageNext Message »
From:Marcus Bointon Date:July 1 2009 5:57pm
Subject:Best approach for DB-based event logging?
View as plain text  
I need to log fairly large numbers of historical events relating to  
mailing list activity on a per-recipient basis, a kind of audit trail  
if you like. So for a given user, I might log the fact that they  
subscribed to a list, that they were uploaded by someone, that they  
were sent a message, that they unsubscribed etc. For the most part  
this is write-only and is only ever read very rarely, but when I do,  
it will be to retrieve the details of a single user, and all I need is  
the whole history, not individual events. At present I'm logging by  
appending to a text blob field for each recipient. This works ok, but  
the append gets progressively slower over time as the append speed is  
dependent on the length of the existing field, which I think makes for  
an O(n^2) complexity overall, which is obviously not good. A typical  
choke point is after a list upload when I need to batch-update records  
for everyone on the list, which may be half a million or more records.

I'm not bothered about current live data as that's not such a problem  
- this is just about the historical data.

I'm looking for a good alternative to this. So far I've looked at  
archive tables for logging on a per-event basis (can't do per-user as  
it doesn't do updates), however, with a large number of users and  
events, this would rapidly grow by several million records per week,  
and I suspect searches would become unusably slow (no index in that  
storage engine). I thought of using memcachedb instead, however,  
unless I threw large numbers of servers at it, I would simply be  
moving my bottleneck to whatever back-end memcachedb uses (sqlite?  
bdb?) which is probably less efficient than MySQL anyway. Partitioning  
isn't a great solution as that's mainly for improving select  
performance. Keeping a file on disk for each user might work, but I  
suspect that would become impractical as I have millions of users, and  
files on disk is just a kind of inefficient home-brew database.

So, any other ideas?

Marcus Bointon
Synchromedia Limited: Creators of
UK resellers of info@hand CRM solutions
marcus@stripped |

Best approach for DB-based event logging?Marcus Bointon1 Jul
  • Re: Best approach for DB-based event logging?Johan De Meersman3 Jul
    • Re: Best approach for DB-based event logging?Marcus Bointon3 Jul
      • Re: Best approach for DB-based event logging?Johan De Meersman3 Jul
  • Re: Best approach for DB-based event logging?nigel wood3 Jul
    • Re: Best approach for DB-based event logging?nigel wood3 Jul
      • RE: Best approach for DB-based event logging?Martin Gainty3 Jul