Nathan Harmston schrieb:
> Hi everyone,
> I am currently working on an application where I have a very large
> table called intervals ( atm its 80 000 000 records and growing ), and
> a smaller table ( token ) which join with it.
> interval is just an id, start, end, word
> token is id, interval_id, type, processed_by
> There is a many to one......ie one interval can have many tokens. The
> idea being that I as used different tools I generate different tokens
> but the underlying intervals tend to be the same. When I add a new
> token I first need to search the intervals table to see if one exists
> in there. Of course theres an index on intervals to deal with this.
> But as I add more and more new fields I notice a massive slow down in
> processing. I think this due to the increase in new records being
> added and indexed. The problem is I can't turn indexes off as I have
> 80 million records.
> Does anyone have any suggestions for optimising this design? Or where
> to start from? One option and at the moment the only option I have is
> to denormalise my schema but this will complicate stuff at the
> application level considerably.
your Problem sound like a time series. The problem i am dealing with.
IMHO There is no real solution, we have splitted the TS and use several tables.
(1 table per month here but it depends on your exact problem). That moves the problems
"what table" to the application. Since MySql 5.1 there are partitions, that moves
the problem back to the DB.
The "solution" depends on your usage pattern. Clearly there is nothing like a
"clever" select statement.
For you token-Problem you can use immodb for foreign keys, if insert fail simply check
if you need to add an other token in the token table.
NTL you need to decide how long you will store, lets assume you have 1E6 Data / day and
you want to store for 40 Year then you get 40*360*1E6 Data do you want to handle that ?