List:General Discussion« Previous MessageNext Message »
From:Jed Reynolds Date:November 20 2008 8:28am
Subject:Re: Using Replication in mySQL version 5
View as plain text  
Tompkins Neil wrote:
> Hi
> We are looking to upgrade our version of mySQL to the latest version of
> mySQL 5.  One of the main features we are going to think about using is
> replication for our website data.  Basically we have 2 websites located in
> the UK and US which share similar information, and we are going to be using
> replication as a way of keeping the data up to date and in sync.
>
> Based on your experiences, is there anything we should be aware of before
> investigating this route and putting it into practice ?
>   

I've had to take servers out for bad raid-controllers, bad ram, bad 
mobos. Disks have been the least of my problems. So make sure your 
architecture tolerates the ability to take members of your pool out 
without load-spiking the remaining members. And if you're doing 
filesystem snapshots from a master to a replicant, you will have to 
either have policy or extra servers available to maintain your uptime 
when you interrupt the master to flush all the tables, sync the 
filesystem and do an LVM snapshot. Innodb would require a shutdown. 
Don't forget that LVM snapshots are copy-on-write, so when that master 
comes back up and starts processing modifying tables, you'll get amazing 
system load on a busy system as your file system starts madly copying 
extents into the snapshot volume.

Define a procedure for junior staff how to properly down and up a pool 
member. Like, if you get a disk-full on one member, and it borks 
replication, what's the step-by-step for a) determining if replication 
can re-establish after you do a FLUSH LOGS, b) under what conditions do 
you have to re-copy all data from one master to another because your 
replication window has expired and your logs have gotten flushed. Your 
replication binlogs get really big if you're pushing large materialized 
views regularly via replication, or your servers have fast disks, not 
enough size to handle a more than a weekend or whole day (for example) 
of neglect.

Define a procedure for checking your my.cnf files for correct 
auto-increment-* settings and server-id settings. Junior staff, and even 
senior staff rarely add more members to the pool, so these settings are 
often mistaken during a midnight maintenance hour. Procedure for adding 
members and changing master replication settings is very important. 
Often your DBA is not racking and changing the equipment.

Make sure that you have a good understanding of what kind of capacity 
you're growing at. I started a project with two four-core boxes with 
plenty of 15krpm disk and when they got into production, they regularly 
spiked to load 20 and 30. Not pretty. Not only had my old architecture 
refused traffic to lighten the load, my new architecture didn't. My data 
set was growing so fast my sort-buffer settings for the old servers were 
too small for new servers. I ended up with four DL380s with 8 cores per 
box. I really had to scramble to get more servers in there. The addition 
of two more read-only members really helped, and backups handled by 
replication to an off-site replicant.

Another load capacity warning: if your traffic is very spiky, and you 
get high-load conditions, I've seen reset/dropped connections and also 
plain old connection timeouts. So if you have RAM for 1024 connections, 
you prolly can't service 1024 connections when you've got table 
contention and connections from your web-nodes just start failing. If 
they fail for too long, then you have to do some FLUSH HOSTS to reset 
connection attempt counters.

I don't know what your application does, but I certainly monitor 
replication lag. Load spikes can certainly increase lag. I've had to 
move from single instances of mysql to mysqld_multi and separate 
databases by replication rate. Your monitoring should also track sql 
threads. You might need to define procedure on how to deal with 
pooling-out members that fall too far behind in replication.

I've written an iptables script to block webnode connections but allow 
sql pool member connections. I use this to take a member out to run 
table repairs or to lighten the load while it does replication catch-up.

WAN connectivity for replication is interesting! I did site-to-site 
transfer using stunnel. I had to negotiate weird Cisco 5502 VPN 
behavior. Copying gigs of myisam files between sites would knock over my 
vpn so I had to rate-limit using rsync --bwlimit. Bursting bandwidth 
charges were still brutal, though. Later, we ended up configuring CBQ 
(search freshmeat.net for cbq-init) on my backup replicant to limit 
bandwidth so it wouldn't provoke bursting charges.

Jed

Thread
Using Replication in mySQL version 5Tompkins Neil17 Nov
  • RE: Using Replication in mySQL version 5Mary Bahrami19 Nov
  • Re: Using Replication in mySQL version 5Jim Lyons19 Nov
    • Virtualizing MySQLShain Miley19 Nov
      • Re: Virtualizing MySQLSimon J Mudd20 Nov
        • Re: Virtualizing MySQLShain Miley20 Nov
          • Re: Virtualizing MySQLKeith Murphy20 Nov
          • Re: Virtualizing MySQLJoerg Bruehe20 Nov
            • RE: Virtualizing MySQLJay Blanchard20 Nov
              • (Q) FullText (UTF8)Timothy Little20 Nov
                • Re: (Q) FullText (UTF8)Santino20 Nov
              • Re: Virtualizing MySQLJoerg Bruehe21 Nov
          • Re: Virtualizing MySQLClaudio Nanni20 Nov
            • Re: Virtualizing MySQLMoon's Father21 Nov
  • Re: Using Replication in mySQL version 5Jed Reynolds20 Nov
Re: Using Replication in mySQL version 5Roman Eberle19 Nov