List:General Discussion« Previous MessageNext Message »
From:Rick James Date:April 2 2012 9:14pm
Subject:Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion
Wanted
View as plain text  
DRBD, SAN, etc.  Sure, they are highly redundant.  Sure they are 
reliable.  But they do not handle the building being in a 
flood/earthquake/tornado/etc.  If you want HA, you have to start with 
having two (or more) copies of all the data sitting in geographically 
distinct flood plains, etc.  If you have 3 copies of the data, why 
bother with redundant anything in each of the 3 sites?  HA within a 
device costs a lot of extra money.

You cannot get more than a few 9s, regardless of how you architect it.

Let's look at the design from a different point of view...
* How much data?
* How much traffic?
* It sounds like much of the data is images or videos; correct?  One 
could argue that those do not need to be in a database, other than a URL 
to them.  Then you can scatter copies around the globe, and use Akamai 
(or similar load balancing service) to not only find a viable copy, but 
find the nearest copy, thereby improving the "user experience"!
* You must put effort into splitting reads from writes.  Otherwise you 
are asking for extra vulnerability from hackers.  Yes, hackers will be a 
big issue, and you have not mentioned it yet.

You may need to scale via Sharding.  (Without some of the questions 
above answered, I cannot predict.)  If you do need sharding, you must 
incorporate it into the software design _early_.



On 4/2/12 1:05 PM, Wes Modes wrote:
> Thanks Shawn and Karen, for the suggestions, even given my vague
> requirements.
>
> To clarify some of my requirements.
>
> *Application:  *We are using an open-source application called Omeka,
> which is a "free, flexible, and open source web-publishing platform for
> the display of library, museum, archives, and scholarly collections and
> exhibitions."  Without getting into how free (or scalable) free software
> really is, we can view it as one aspect we cannot change, having been
> written into the grant requirements we received for the project.
> Experienced Omeka developers and our own developer have suggested
> that/it is not feasible to separate database writes from reads in the
> application/ (given time and resources).
>
> *SAN: *The SAN is a Dell EqualLogic 6100 which has redundant everything,
> including multiple NICs, controllers, and power.  So we are less
> concerned about the SAN being a SPoF.  On the other hand, if we have a
> single big MySQL server that fails, we could bring up another copy of it
> via VMWare, but until the server came up, the application would be dead
> in the water.  If the database is corrupted, service will be interrupted
> for a considerable time.
>
> *High Availability:*  It sounds like there is some debate over how to
> provide HA best, but do people really disagree on the desired results?
> Without getting into the many meanings of this buzz word, here's what we
> mean: /We desire to maintain high availability of service, allowing a
> small fraction of users to experience outage for only seconds at a
> time.  We desire to provide this through horizontal scaling, redundancy,
> failover planning, and external monitoring.  /
>
> *Scalability:  *Again, seems like there are lots of applications and
> implementation, but people agree on the general concept.  Here's what we
> mean for this project:  /We desire to  scale our services so that a
> usage surge does not cause unavailability of the services for some
> users.  We prefer to horizontally increase scalability using
> load-balancing strategies to treat clusters of servers as single logical
> units./
>
> The application may have not been designed with great scalability in
> mind, but if multiple application instances are accessing multiple
> database servers treated as one logical unit, that may not be too
> relevant.
>
> I am responsible for creating an architecture upon which this project
> will run.  I am not responsible for redesigning the application.   So
> far, no one has suggested anything that approached meeting our
> requirements, even our vague ones.  Perhaps I am asking the wrong list?
>
> Does anyone have any experience with MySQL Multi-Master Replication?
> Perhaps that should be a separate post.
>
> Wes
>
> On 3/30/2012 3:56 PM, shawn green wrote:
>> Hello Wes,
>>
>> On 3/29/2012 9:23 PM, Wes Modes wrote:
>>> First, thank you in advance for good solid suggestions you can offer. I
>>> suppose someone has already asked this, but perhaps you will view it as
>>> a fun challenge to meet my many criteria with your suggested MySQL
>>> architecture.
>>>
>>> I am working at a University on a high-profile database driven project
>>> that we expect to be slammed within the first few months. Since this is
>>> a new project and one that we expect to be popular, we don't know what
>>> kind of usage to expect, but we want to be prepared. Therefore, we are
>>> building in extra capacity.
>>>
>>> Our top goals are scalability and high availability, provided we hope
>>> through multiple MySQL nodes and VMWare functionality. I've been
>>> surprised that there are not more MySQL architects trying to meet these
>>> high-level goals using virtualization and shared storage (or at least
>>> they do not seem to be writing about it).
>>>
>>> I've looked at replication, multi-mastering, DRBD, clustering,
>>> partitioning, and sharding.
>>>
>>> Here's what we got, and some of our constraints:
>>>
>>> * We are concerned that One Big Database instance won't be enough to
>>> handle all of the queries, plus it is a single point of failure.
>>> Therefore, multiple nodes are desirable.
>>>
>>> * With the primary application that will be using the database, writes
>>> and reads cannot be split off from each other. This limitation alone,
>>> rules out replication, MMM, and a few other solutions.
>>>
>>> * We do not expect to be especially write-heavy.
>>>
>>> * We have shared storage in the form of an iSCSI SAN. We'd like to
>>> leverage the shared storage, if possible.
>>>
>>> * We have VMWare HA which already monitors hosts and brings them up
>>> within minutes elsewhere if we lose a host. So some of the suggested HA
>>> solutions are redundant.
>>>
>>> * We expect to have another instance of our system running in the Amazon
>>> cloud for the first few months while the traffic is high, so we may take
>>> advantage of RDS, though an exact duplicate of our local system will
>>> save us development work.
>>>
>>> Thanks for any advice you can give.
>>>
>>> Wes Modes
>>>
>> As the others have already pointed out, your specifications are rather
>> vague and appear to be only hardware-focused. I can state with some
>> certainty, through my six years of direct MySQL support experience,
>> that the majority of the problems related to performance are NOT
>> hardware related.  The majority of the problems I need to resolve are
>> related to bad queries, bad table design, and bad application design.
>>
>> For example, you state:
>>> * With the primary application that will be using the database, writes
>>> and reads cannot be split off from each other.
>> This is already a red flag that your application is not designed for
>> scale.
>>
>> Also, you say:
>>> * We are concerned that One Big Database instance won't be enough to
>>> handle all of the queries, plus it is a single point of failure.
>> then you say:
>>> * We have shared storage in the form of an iSCSI SAN. We'd like to
>>> leverage the shared storage, if possible.
>> This is another red flag in that your 'shared storage' once again
>> becomes a single point of failure.  Also, MySQL instances do not share
>> files. Period. Each MySQL instance needs its own, completely
>> independent, set of files and folders. Please take the hint that
>> Replication should be in your very near future.
>>
>> Here are some solid suggestions:
>>
>> a) modularize your code so that WRITES can go to a MASTER server and
>> READS can be distributed over a large number of SLAVE servers.
>>
>> b) modularize your data so that you can easily subdivide subsets of it
>> to different master-slave trees. For example, students whose last name
>> starts with A-D are assigned to machines in group1, E-K in group2, ...
>> etc.
>>
>> c) Optimize your queries to avoid subqueries and to take full
>> advantage of indexes.
>>
>> d) Build indexes wisely -  A well-crafted multi-column index can take
>> up less space and will be more functional that multiple single-column
>> indexes.
>>
>> e) Test, test, and retest - Until you can push your system to the
>> choking point in the lab, you will have no idea of how much traffic it
>> will be able to handle in the field.
>>
>> Good luck!
>

-- 
Rick James - MySQL Geek

Thread
HA & Scalability w MySQL + SAN + VMWare: Architecture SuggestionWantedWes Modes30 Mar
  • Re: HA & Scalability w MySQL + SAN + VMWare: Architecture SuggestionWantedRick James30 Mar
  • Re: HA & Scalability w MySQL + SAN + VMWare: Architecture SuggestionWantedKaren Abgarian30 Mar
  • Re: HA & Scalability w MySQL + SAN + VMWare: Architecture SuggestionWantedshawn green30 Mar
    • Re: HA & Scalability w MySQL + SAN + VMWare: Architecture SuggestionWantedWes Modes2 Apr
      • Re: HA & Scalability w MySQL + SAN + VMWare: Architecture SuggestionWantedRick James2 Apr
      • Re: HA & Scalability w MySQL + SAN + VMWare: Architecture SuggestionWantedshawn green2 Apr
        • Re: HA & Scalability w MySQL + SAN + VMWare: Architecture SuggestionWantedWes Modes2 Apr
          • Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion WantedTim Dunphy3 Apr
          • Re: HA & Scalability w MySQL + SAN + VMWare: Architecture SuggestionWantedIan3 Apr
            • Re: HA & Scalability w MySQL + SAN + VMWare: Architecture Suggestion WantedJohan De Meersman3 Apr
            • Re: HA & Scalability w MySQL + SAN + VMWare: Architecture SuggestionWantedWes Modes3 Apr
            • Re: HA & Scalability w MySQL + SAN + VMWare: Architecture SuggestionWantedWes Modes4 Apr
        • Re: Replication rings/maatkit (was: HA & Scalability w MySQL + SAN+ VMWare)Wes Modes3 Apr
          • Re: Replication rings/maatkit (was: HA & Scalability w MySQL + SAN + VMWare)Eric Bergen3 Apr
Re: HA & Scalability w MySQL + SAN + VMWare: Architecture SuggestionWantedWes Modes4 Apr