I have an application and database doing this right now. We run both
ways, a single db per customer and a single customer per db. Smaller
customers are in a shared enviroment and larger customers have their
own db, sometimes they even have their own server. We use a company_id
field in each table to identify the data.
On 3/17/09, Arthur Fuller <fuller.artful@stripped> wrote:
> Are these databases identical or merely similar? If they are structurally
> identical, I'd go for one database per customer. Then you have isolation,
> easy structure updates and above all, consistent front-end code, in
> whatever language that occurs. Just obtain the customer ID and then use the
> appropriate database. Everything else can remain the same.
> The only fly in the ointment concerns whether you'd ever have the customer
> need to cross databases. I would imagine that sort of thing is for internal
> use, not the customers. In that case, the performance hit if any won't
> impact upon the customer, just you.
> On Tue, Mar 17, 2009 at 8:21 PM, Daevid Vincent <daevid@stripped> wrote:
>> I'm writing a report tool wherein we have many customers who subscribe
>> to this SaaS. There are millions of rows of data per customer. All
>> customers are islands from each other (of course).
>> Are there any major issues or benefits between storing each customer in
>> their own database (with their own tables), or all lumped into a single
>> At first thought, it seems that by separating them, queries should be
>> faster no (as there is less data to sift though per customer)? It of
>> course makes upgrading table schema a wee bit more cumbersome, but a
>> simple loop and script can handle that easily enough. And since you can
>> query across databases, we can still make internal aggregate reports for
>> our own usage.
>> For example: SELECT * FROM customerA.foo.bar JOIN customerB.foo.bar; or
>> we can use UNIONS etc. too.
>> Consolidating them into one would seem to bloat the tables and slow
>> things down (or is the fact that mySQL uses B-Trees invalidate that
>> theory)? It also makes us have to have a customer_id entry in every
>> table basically (or some FK to distinguish who's data is who's). It also
>> feels like it could leak data if a malformed query were to get through,
>> although I'm not terribly worried about this as we do some heavy UAT
>> before pushing from DEV to TEST to PROD.
>> Performance is a major factor concern here given our huge data sets
>> involved. Does joining across databases impose any speed/performance
>> hits vs. just joining across tables within a single database?