MySQL Cluster is not supposed to be a universal solution, and for good reason: universal solutions tend not to perform very well.
"If each application handles this sort of thing differently, then when I run all these applications on my server (and I do - we host about 175 web sites altogether) I have to configure each application separately, and I have to instruct all my users (many of them inexperienced grad students) to remember that "writes go here, reads go there" when they write their own PHP code."
Do you want geographic redundancy or do you want to scale reads? In this case you're talking about scaling reads for a bunch of apps all running together. If you want performance in that case, then first you'd want to isolate the apps from each other.
"And, of course, handling this sort of thing at the application level means that some applications will never support it, and therefore never be able to be geographically redundant."
Geographical redundancy is different: use a DNS record with a zero TTL, together with a master->slave replication setup. Point the record at the master and, if it fails, change the DNS entry to point to the slave. Your applications never need to know about replication.
And that's if you don't want to go with the more complex Linux HA or hardware-based IP takeover solutions. There are many ways you could add redundancy without modifying the apps.
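As a rough illustration of the DNS approach above, here is a minimal Python sketch of the decision a failover script might make. The host names and the `choose_db_target` helper are hypothetical, and the step that actually rewrites the DNS record is left out because it depends entirely on the DNS server or provider in use. It also assumes that a successful TCP connection to port 3306 is a good enough health signal, which may not hold in every setup.

```python
import socket


def is_reachable(host, port=3306, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def choose_db_target(master, slave, check=is_reachable):
    """Return the host the DNS record should point at, or None if both are down.

    The master is preferred; the slave is used only when the master fails
    the health check. `check` is injectable so the logic can be tested
    without a real database server.
    """
    if check(master):
        return master
    if check(slave):
        return slave
    return None


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    target = choose_db_target("db-master.example.com", "db-slave.example.com")
    print("DNS record should point at:", target)
```

A script like this would typically run from cron or a monitoring system. Promoting the slave to accept writes and resynchronizing the old master afterwards are separate steps that this sketch does not cover.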
That's the great thing about open source software and techniques. They're like building blocks, and you can put them together however you want. I much prefer this to the all-in-one black-box solution.
From: Tim Gustafson [mailto:email@example.com]
Sent: Friday, September 25, 2009 4:18 PM
To: Gavin Towey
Subject: Re: Master/Slave Replication Question
> Moreover, it works today as opposed to waiting until the end
> of time for the database developers to add features like that
> (which mysql cluster is already a distributed database, and
> the devs have said they're not interested in trying to turn
> the regular mysql into a distributed product, instead they
> want to focus on what it does best)
With all due respect to the mySQL cluster people, setting up a mySQL cluster just isn't in the cards for lots of organizations. It's just too much. There's a huge implementation gap between a single mySQL server and a mySQL Cluster. I've also heard from people who have tried to implement mySQL clustering that wide-area cluster replication is hard or impossible (I can't remember which), so the ability to provide geographic redundancy (one of my requirements here) isn't workable.
I think saying that I'd have to wait until the end of time is a bit harsh. Sure, it's not going to happen tomorrow, but I wasn't expecting that anyhow.
I'm not sure if you've looked at the database integration for things like Drupal, but there will probably never be a way for Drupal to use an "updates go to this server, reads go to this server" configuration, as there are thousands of Drupal modules and almost all of them use the database directly, and each would have to be re-coded to work with the read/write split configuration.
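To make the "updates go to this server, reads go to this server" idea concrete, here is a minimal Python sketch of the kind of routing every application (or every module) would have to do for itself. The `pick_server` helper and the host names are hypothetical, and the keyword check is deliberately naive: it ignores transactions, replication lag, and stored procedures, all of which a real implementation would need to consider.

```python
# Statements that only read data and can, in principle, go to a slave.
READ_KEYWORDS = {"SELECT", "SHOW", "DESCRIBE", "EXPLAIN"}


def pick_server(sql,
                master="db-master.example.com",
                slave="db-slave.example.com"):
    """Return the host a statement should be sent to.

    Reads go to the slave; everything else (including anything
    unrecognized or empty) goes to the master, which is the safe default.
    """
    stripped = sql.lstrip()
    first_word = stripped.split(None, 1)[0].upper() if stripped else ""
    # SELECT ... FOR UPDATE takes locks, so it is treated as a write.
    if first_word in READ_KEYWORDS and "FOR UPDATE" not in stripped.upper():
        return slave
    return master


if __name__ == "__main__":
    for statement in ("SELECT * FROM node",
                      "UPDATE node SET title = 'x' WHERE nid = 1"):
        print(statement, "->", pick_server(statement))
```

Every code path that touches the database directly would have to call something like this, which is where the re-coding effort comes from.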
And anyhow, I think that suggestion is missing the point:
If each application handles this sort of thing differently, then when I run all these applications on my server (and I do - we host about 175 web sites altogether) I have to configure each application separately, and I have to instruct all my users (many of them inexperienced grad students) to remember that "writes go here, reads go there" when they write their own PHP code.
And, of course, handling this sort of thing at the application level means that some applications will never support it, and therefore never be able to be geographically redundant.
So yeah, maybe lots of custom-written software handles the read/write split configuration well, but there's lots more that doesn't. I don't know of a single open source application that does.
So again, I go back to my original statement: replication is a database server problem, not an application problem. :)
Baskin School of Engineering
UC Santa Cruz