nigel wood <nwood@stripped> wrote on 14/03/2006 13:09:08:
> Alec.Cawley@stripped wrote:
>
> >"Paul Halliday" <paul.halliday@stripped> wrote on 14/03/2006 12:09:10:
> >
> >
> >
> >
> >>As an example:
> >>
> >>There was a table called event.
> >>
> >>This table is now broken up like this:
> >>
> >>event _<sensor>_<date>.
> >>
> >>So for every sensor, and every day, there is now a new table. So if I
> >>have 20 sensors, every day I will have 20 new tables.
> >>
> >>With this in mind, does this design make sense?
> >>
> >>how will this scale?
> >>
> >>Is there anything I can do through configuration (I doubt the
> >>developer will change the design) to speed things up? or a workaround
> >>that I could do on my end to compensate?
> >>
> >>
> >
> >Could you explain how this is meant to improve scalability? Because to
my
> >mind it is probably the best way I can imagine to make the system
> >unscaleable. To me, this design very much does *not* make sense.
> >
> >You have bought, in MySQL, a highly tuned specialist engine for
seqrching
> >and sorting stuff in the most efficent manner. And then you have said
that
> >you will disable all its optimisation and force it into a linear
search.
> >
> >
> >
>
> I can think of a reason for doing this but not to extent described. Is
> your developer trying to create a situation where it's easy to archive
> of results earlier than a given day? So you store say 1000 days of data
> and can quickly archive the oldest day at midnight each day.
>
> Assuming this is the case: There's no point splitting further than by
> day so tables per day/sensor don't make any sense unless your worried
> about sub second locking (i.e. doing it wrong). You should make the
> unmerged tables as large as possible without the time to delete having
> an impact on your application. Having an impact depends on your
> applications tolerence to locking and the amount of data your adding and
> removing, you'll need to find it by testing. The table type you use will
> have a big impact on concurrent access locks. MyiSAM and Innodb are the
> two main candidates MyISAM is quick but is doesn't allow concurrent
> access to the table. Innodb will allow concurrent access but still locks
> rows and can lock the 'head point' during certain inserts.
>
> The fact your storing sensor data worries me. How tolerent of
> lag/locking on insert or retreval is your application? If it's sensitive
> to more than a seconds lag you need a careful review of your design. If
> it's hard real-time sack the developer then review the design.
I take your point to a certain extent. Of course, in the end it comes down
to the searches being used. I would make it a rule of thumb that any
search which requires more than a 10 tables is a Bad Thing. So if the very
large majority of searches are for 1-4 sensors over 1-4 days, this
architecture might make sense. But if searches are over >10 sensors or >10
days, this architecture will b become astoundingly inefficient.
Generally, I would expect MERGE tables to be used on much larger lumps of
time. If you have tables per month, any random period of a month can be
checked very efficiently by merging two tables - the start month and the
end month. This is the sort of thing that people tend to want to do. The
OP of course knows his application, but I think it unusual for people to
slice queries that small.
And if the queries are of the form "upon which days/sensors did event X
happen", then splitting the table up is a one way path to doom.
Alec