List:General Discussion« Previous MessageNext Message »
From:Peter Brawley Date:January 13 2006 6:16pm
Subject:Re: Huge number of tables with InnoDB
View as plain text  
Alec writes

/>...Generally, Log A + log B is bound to be larger than log (A*B)... /

Errm, log A + log B exactly = log(A*B) :-) .

PB

-----

Alec.Cawley@stripped wrote:

>To reply to this, I think we have to understand why you have chosen to 
>split the tables at all. It seems to me that this, by introducing a 
>two-level lookup, is certain to be slower than any possible single table 
>lookup. Generally, Log A + log B is bound to be larger than log (A*B). 
>
>It appears that you are querying prediminantly by time. In this case, your 
>index *must* start with the timestamp, not the monitor ID. I would suggest 
>that you need an index on timestamp, and possible one on monitor ID - but 
>not if, as you say, you never query by monitor ID at all. Do you need a 
>PRIMARY KEY at all? In what way will your system break if there happen to 
>be two entries with the same timestamp and monitor ID? Presumably this 
>will reflect two events very close together: Wouldn't you rather store 
>that fact rather than lose it? 
>
>Generally, I would query your decision to have multiple tables by date to 
>whatever. In my experience, whenever I have introduced such concepts into 
>my early designs, they have disappeareed later into a better design. It 
>looks to mee as if you are using a tool optimied to do fast searches on 
>large databases, then crippling its ability to optimise.
>
>I would expect the use of thousands of tables effectively to disable 
>MySQL's caching capability, which is one of the biggest performance 
>boosters.
>
>        Alec
>
>
>
>
>
>
>"John McCaskey" <johnm@stripped> 
>13/01/2006 17:20
>
>To
>"MySQL" <mysql@stripped>
>cc
>
>Subject
>Huge number of tables with InnoDB
>
>
>
>
>
>
>Hi everyone,
>
> 
>
>I'm running MySQL 4.0.18 on Debian with a 2.6 linux kernel using ext3 as
>the underlying filesystem for the database storage.
>
> 
>
>I currently have some InnoDB tables with the following structure:
>
> 
>
>Log_20060101 {
>
>            Monitor_id medium int,
>
>            Timestamp timestamp,
>
>            Avg float,
>
>PRIMARY KEY Monitor_id, Timestamp
>
>}
>
> 
>
>We partition these tables by date as you can see as they grow very large
>and they get to be slow to insert and query to over time.  We have the
>idea to change the partitioning so the tables are as follows:
>
> 
>
>Log_[monitor_id] {
>
>            Timestamp,
>
>            Avg float,
>
>PRIMARY KEY Timestamp
>
>}
>
> 
>
>This seems to have several key advantages:
>
> 
>
>1)       Reduced disk space usage 
>
>2)       Easier querying of data across time (but not across individual
>id's, it turns out that doesn't ever really happen in our data usage
>anyway though)
>
>3)       Smaller tables, resulting in faster reads/writes, also smaller
>data volume hopefully also resulting in faster reads/writes due to less
>disk io neccesary
>
> 
>
>However, in our actual testing the 'faster writes' expectation is
>getting shot down.  With 20,000 unique monitor id's and 8928 unique
>timestamps inserting to the old set of tables (20051201-20051231 in this
>case) is taking me about 4 hours 20 minutes.  Inserting to the 20,000
>new tables (Log_0-Log_20000) is taking about 10 hours.  I expected this
>to be much faster as I hoped finding the right table for an insert would
>be a Hash type lookup taking linear time, while inserting into the large
>tree structure in the old tables which have a much higher volume of rows
>would be logarithmic time.  Where did I go wrong?  My only real thought
>so far is the disk subsystem of the OS being slow with large numbers of
>files, but I thought it wouldn't matter for InnoDB as the data storage
>is all one file.  Does anyone know what would cause the inserts to be so
>much slower? 
>
> 
>
>John A. McCaskey
>
>Software Development Engineer
>
>Klir Technologies, Inc.
>
>johnm@stripped
>
>206.902.2027
>
> 
>
>
>
>
>  
>

Attachment: [text/html]
No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.1.371 / Virus Database: 267.14.17/228 - Release Date: 1/12/2006
Thread
Huge number of tables with InnoDBJohn McCaskey13 Jan
  • Re: Huge number of tables with InnoDBAlec.Cawley13 Jan
    • Re: Huge number of tables with InnoDBPeter Brawley13 Jan
RE: Huge number of tables with InnoDBJohn McCaskey13 Jan