"Martijn Tonies" <m.tonies@stripped> wrote on 14/03/2006 11:32:10:
> Hello Paul,
> I suggest you reply to the mailing list :-) ...
> > The developer insists that for scalability issues, this was the
> > answer. It is likely, for example in my deployment, that these tables
> > would see upwards of 10 million records or more.
> Well, if there are problems with scalability, I guess you could
> split it up in a few (not 1600) tables and have them available
> on different physical hard drives...
In my opinion, splitting things into merge tables has a *strong*
anti-scalability component. Searching a single table with indexes is O(log
n), whereas searching MERGE tables is O(n). Therefore, by splitting your
table into very many pieces, you sharply reduce your scalability in time
while increasing it in space.
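The asymptotic point can be illustrated with a toy sketch in Python. The sizes (1,600,000 rows, 1600 pieces) and the pessimistic assumption that a lookup must probe every underlying piece are illustrative only, not a model of MySQL's actual MERGE algorithm:

```python
def probes(index, key):
    """Count bisection steps to locate key in one sorted index."""
    lo, hi, steps = 0, len(index), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if index[mid] < key:
            lo = mid + 1
        else:
            hi = mid
    return steps

n, k = 1_600_000, 1600                        # hypothetical row and table counts
single = list(range(n))                       # one big indexed table
fragments = [single[i::k] for i in range(k)]  # 1600 sorted pieces of ~1000 rows

one_table = probes(single, n - 1)
# Worst case for the split layout: probe every piece's index.
merged = sum(probes(f, n - 1) for f in fragments)
print(one_table, merged)
```

One binary search over the big table takes a couple of dozen probes; probing all 1600 pieces costs hundreds of times more, even though each individual piece is tiny.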
Presumably, you want to scatter your table across several drives, so that
you will not have problems when you fill one drive. But you are never
likely to have 1600 drives, so 1600 is a ridiculously large number of
tables to split it into. You should probably split it into no more than
two or three times the largest number of disks you ever expect to have.
And even so, I would rather combine disks in RAID arrays than use
separate tables. This can give you RAID protection as well as more disk
space.
Which do you expect to run out of first, space or time? You seem to have
some heavyweight, i.e. time-intensive, queries, which suggests that you will
run out of time first. If that is so, the requirement for scalability says
that you should combine, not split, tables.
To quote Donald Knuth (derived from Hoare) "Premature optimisation is the
root of all evil". You should be sure that you are optimising in the right
place before you dive in: your problem suggests that you are trying to fix
that which is not broken, and breaking other things in the process.