List:Internals« Previous MessageNext Message »
From:Weldon Whipple Date:September 14 2010 2:31pm
Subject:Re: Per-db binlogging
View as plain text  
Mark,

Thank you very much for your quick response! (See my comments below.)

On Mon, Sep 13, 2010 at 6:02 PM, MARK CALLAGHAN <mdcallag@stripped> wrote:
> +maria-discuss as they are working in replication too
> +robert.hodges as tungsten replicator has features that could be used here
>
> I am very interested in a subset of this. I want one SQL thread per
> database on a slave (stream per database on the slave).

Our immediate needs don't require special slave enhancements, but are
limited to on-the-fly creation of BINLOGs on a "master" (sort of)
server--the "master" being a server that a customer is migrating FROM.
We need BINLOGs only long enough to ensure that an exact copy of the
customer's databases have been successfully copied to the new physical
server.

(After our immediate needs have been satisfied, I can imagine a future
enhancement to slave functionality that would let the "destination"
server replicate--on the fly--a specified [per-database] BINLOG on
some other server. That isn't--yet, at least--a requirement for me to
satisfy.)

> However I
> don't need to split the binlog on the master to get that and writing a
> binlog file per database on the master might complicate the
> sync_binlog option that keeps the binlog and InnoDB in sync during
> crash recovery. How will this work in that case?

I need to review the "sync_binlog option that keeps the binlog and
InnoDB in sync during crash recovery." If I remember correctly, we use
the configuration option that stores each InnoDB database in separate
file(s). I don't know if that makes any difference. Unlike MyISAM
files--which can be scp'ed from one box to another (if done
carefully), we would need to use dump commands/utilities for the
InnoDB databases that we migrate from one server to another.

> Note that a stream (SQL slave thread) per database on the slave will
> break rpl_transaction_enabled. Some of us rely on that to keep slaves
> consistent during crash recovery. I assume you don't use that feature.

We have used standard master-slave replication from time to time in
the past. This particular assignment isn't a "standard" use of
replication (although we ARE--by definition--replicating one
customer's databases on another server during the process of
migration.

Our current plan is to use something like the mysqlbinlog tool on the
destination server to make sure that the source and destination are in
sync before "flipping the switch."

(I admit that migrating a very active customer is always a challenge.
Perhaps if a second phase allowed a slave at the destination server to
begin reading a specified binlog file at a particular offset on a
particular remote server, that would definitely be Nirvana--especially
if it worked reliably and was simple to start/stop ...)

> MySQL has worklogs to use db tables in place of relay-log.info and
> when that code is production ready, then this is not a problem.

I would like to learn more about worklogs. Do you know of any
documents I could read?

Thanks again for your response!

Weldon Whipple
weldon@stripped

(P.S. A very small number of customers have a very small number of
databases that reference each other. We are intentionally overlooking
those cases in our first phase. It might be nice to be able to deal
robustly with those at some time in the future.)

>
> On Mon, Sep 13, 2010 at 1:52 PM, Weldon Whipple <weldon@stripped> wrote:
>> Greetings!
>>
>> My employer has asked me to implement per-DB binlogging for
>> MySQL. I've been working on prototypes for 3-4 weeks now, and decided
>> it's time to post to this list to ask for comments, suggestions,
>> condolences, etc. (I've been a "lurker" for quite awhile. I hope this
>> post isn't too far off the mark) The assignment is *only* for master-side
>> binlogging. (It doesn't require a slave/replication-side implementation.)
>>
>>
>> Why I Want to Do It.
>> -------------------
>>
>> I work for an ISP that hosts (last I heard--I HOPE I'm not lying) over
>> 2 million domains. Customers are allowed to have (almost) unlimited
>> MySQL databases. Migrating an account from one physical server to
>> another is a daunting task. (A typical server has easily 2000+
>> [sometimes far more] databases.) Global mysqld binlogging isn't
>> feasible when we want to migrate (for example) a single account to
>> another server.
>>
>> Our proposed scenario goes something like this:
>>
>> 1. START USER BINLOG FOR DATABASE <db>
>> 2. FLUSH USER TABLES WITH READ LOCK FOR DATABASE <db>
>> 3. SHOW USER BINLOG STATUS [WHERE DATABASE LIKE <pattern>]
>>
>> SHOW output might look something like:
>> =================================
>> mysql> show user binlog status;
>>
> +----------------------+----------+---------------------------------+------------+
>> | Database             | User     |
> Binlog_file                     |
>> Binlog_pos |
>>
> +----------------------+----------+---------------------------------+------------+
>> | booklibs             | wwhipple |
> booklibs-bin.000003             |
>>     248 |
>> | charmins_gravestones | charmins | charmins_gravestones-bin.000001 |
>>       4 |
>>
> +----------------------+----------+---------------------------------+------------+
>> 2 rows in set (0.01 sec)
>>
>> The "where" clause also works:
>>
>> mysql> show user binlog status where user like 'charmins';
>>
> +----------------------+----------+---------------------------------+------------+
>> | Database             | User     |
> Binlog_file                     |
>> Binlog_pos |
>>
> +----------------------+----------+---------------------------------+------------+
>> | charmins_gravestones | charmins | charmins_gravestones-bin.000001 |
>>       4 |
>>
> +----------------------+----------+---------------------------------+------------+
>> 1 row in set (0.01 sec)
>> =================================
>>
>> At this point the migration tools note the binlog coordinates and
>> start copying the databases to the new server. (For myisam tables, the
>> tools writers want to just copy [scp?] the files. For innodb, they
>> will use something like mysqldump.)
>>
>> (Note: We might want to "lose" the "User" column in the above table]
>> in our recent designs.)
>>
>> Then the sequence of MySQL commands resumes:
>>
>> 4. UNLOCK USER BINLOG FOR DATABASE <db>
>>
>> Finally, when migration is complete:
>>
>> 5. STOP USER BINLOG FOR DATABASE <db>
>>
>> A high percentage of users (infrequent bloggers, etc.) go several days
>> (sometimes even weeks or months) without updating their databases. For
>> them--after migrating the files to the new server, a final check of
>> the binlog coordinates on the old box can verify that nothing has
>> changed.
>>
>> If it HAS changed, migration can copy the binlog file(s) to the new
>> box and use the mysqlbinlog command (specifying the starting
>> coordinates) to bring the new database in sync with the old one.
>>
>> (This scenario is probably oversimplified, ignoring things like DNS
>> caching, TTL, etc., etc. Hopefully it doesn't include too many lies.)
>>
>>
>> How I Plan to Do It.
>> -------------------
>>
>> I've noticed that:
>>
>> 1. The class MYSQL_BIN_LOG seems to be involved in most binlogging
>> 2. MYSQL_BIN_LOG inherits from TC_LOG and MYSQL_LOG. It is created in
>> mysqld.cc at startup.
>>
>> My current partial implementation has:
>>
>> 1. Class DB_BIN_LOG inheriting from MYSQL_BIN_LOG, adding members like
>>  db_name
>>  db_binlog_name
>>  db_index_name
>>  and methods to access them, etc.
>>
>> 2. Singleton class USER_BIN_LOG_MGR. (Should probably be DB_BIN_LOG_MGR),
>>  which maintains a db_binlog_hash of information about each DB being
>>  binlogged. A DB_BIN_LOG_ENTRY (in the hash) contains:
>>    dbname
>>    locked (bool)--has it been locked by FLUSH ... TABLES WITH READ
> LOCK?
>>    binlog file
>>    binlog_pos
>>    is_active
>>    a pointer to a DB_BIN_LOG instance
>>
>>  USER_BINLOG_MGR has a rw lock to protect access to its hash of
>>  currently binlogging db's. (Most accesses will probably be
>>  read-only?)
>>
>> Is it plausible to identify all places where a binlog addition might
>> occur? In those places, if the singleton USER_BIN_LOG_MRG exists, ask
>> it if the current DB is being binlogged. If so, send the request to
>> the DB_BIN_LOG instance.
>>
>>
>> Questions I Have.
>> ----------------
>>
>> 1. I notice that binlog is in the list of engines and plugins
>>  (displayed by SHOW ...). The bottom of log.cc has
>>  "mysql_declare_plugin(binlog)". Do I need one for db_plugin?
>>  (probably not?)
>> 2. I've diagrammed the binary log event taxonomy--the 24 classes that
>>  inherit from Log_event (in log_event.cc/h). That was my main
>>  motivation for wanting to inherit from MYSQL_BIN_LOG.
>> 3. On the other hand, I wonder if inheriting from MYSQL_BIN_LOG might
>>  be overkill for "master only" binlogging--without the
>>  slave/replication part of it.
>> 4. What version should I be working on? (We're currently running
>>  5.1.47.) What should I check out of bzr/launchpad? (Is that still
>>  what we're doing with the changes in ownership?)
>> 5. I see that THD is involved in binlogging. (Several THD methods are
>>  defined in log.cc.) Are there any caveats here?
>> 6. What files should I use? A new h/cc pair? ... or just add to the
>>  hierarchy in log.h/cc?
>>
>> In an ideal world, it would be nice to inherit polymorphically from
>> MYSQL_BIN_LOG, overriding methods that set the names of the binlog
>> file and index file (etc.) then just call the methods and have it do
>> the "right thing." On the other hand, I get the impression that I
>> might be oversimplifying.
>>
>> I welcome your advice!!
>>
>> What I've Implemented So Far (about 13 Prototypes).
>> --------------------------------------------------
>>
>> (PRELIMINARY NOTE:)
>>
>> Our initial plan was to call it "Per-User Binlogging". The user here
>> is a cPanel [control panel] user. cPanel's database naming convention
>> is to have all of a user's database names start with the username,
>> followed by an underscore, followed by a distinctive database
>> name. Thus all the commands above start with
>>
>> <verb> USER ...
>>
>> Originally most of them could end either "DATABASE <db>" or "USER
>> <user>" (with user meaning cPanel user). Thus, in the original plan
>> the first command above was:
>>
>> START USER BINLOG FOR (USER <user> | DATABASE <db>)
>>
>> We have realized that per-database binlogging is all we really need.
>> We imagine that others in our situation might want something similar.
>> Since not everyone uses cPanel, we should forget per[cPanel]-user
>> binlogging, focusing on per-database binlogging.
>>
>> I have several versions of lex and bison/yacc files (sql_yacc.yy,
>> lex.h) to implement the above (and earlier variants). The recent
>> versions avoid increasing the %expect'ed 169 shift/reduce conflicts
>> specified in sql_yacc.yy.
>>
>> (END OF PRELIMINARY NOTE)
>>
>> I currently have 13 different Subversion branches with different
>> (partial) implementations.
>>
>> 1. All have some lex/bison grammer defined. (I'm not completely
>> satisfied with any of them.)
>>
>> 2. Per-db binlogging is enabled by two my.cnf (or command-line)
>> options:
>>
>> user_binlog_dir=/full/directory/path
>> user_binlog_persist_file=user_binlog.info
>>
>> Our servers have fast solid-state drives that we will use for per-db
>> binlogging. We want to specify a separate directory to store all the
>> per-db binlogs
>>
>> The persist_file stores information about the current state of
>> db-binlogging--to survive a restart.
>>
>> Just about every instance of "user_binlog" in this note should
>> probably be changed to "db_binlog" ... (???)
>>
>> 3. I have added cases in sql_parse.cc for
>>
>> SQLCOM_START_USER_BINLOG_DB:
>> SQLCOM_FLUSH_AND_LOCK_DB:
>> SQLCOM_SHOW_USER_BINLOG_STATUS:
>> SQLCOM_UNLOCK_DB:
>> SQLCOM_STOP_USER_BINLOG_DB:
>>
>> (as well as some "USER" variants like SQLCOM_START_USER_BINLOG_USER:,
>> which should probably abandoned.)
>>
>> They should probably be changed to something like:
>>
>> SQLCOM_START_DB_BINLOG:
>> SQLCOM_FLUSH_AND_LOCK_DB:
>> SQLCOM_SHOW_DB_BINLOG_STATUS:
>> SQLCOM_UNLOCK_DB:
>> SQLCOM_STOP_DB_BINLOG:
>>
>> -----------------
>>
>> Now that you've read it, please feel free to throw darts, etc. At this
>> point I'm willing to discard everytning and start over. (In the worst
>> case--if everyone on this list thinks per-db binlogging is a terrible
>> idea--I guess I'll be forced to create our own patch [sigh].)
>>
>> On the other hand, I would love to help contribute to this effort. (If
>> per-db binlogging is already underway, let me know, and I'll likely help.)
>> Thanks in advance for feedback/suggestions/help. Feel free to contact
>> me privately if you want to.
>>
>> Weldon Whipple
>> weldon@stripped
>>
>> --
>> MySQL Internals Mailing List
>> For list archives: http://lists.mysql.com/internals
>> To unsubscribe:    http://lists.mysql.com/internals?unsub=1
>>
>>
>
>
>
> --
> Mark Callaghan
> mdcallag@stripped
>
Thread
Per-db binloggingWeldon Whipple13 Sep
  • Re: Per-db binloggingMARK CALLAGHAN14 Sep
    • Re: Per-db binloggingWeldon Whipple14 Sep
      • Re: Per-db binloggingMARK CALLAGHAN25 Sep
        • Re: Per-db binloggingWeldon Whipple25 Sep
          • Re: Per-db binloggingMARK CALLAGHAN25 Sep
    • Re: [Maria-discuss] Per-db binloggingHenrik Ingo15 Sep
      • Re: [Maria-discuss] Per-db binloggingWeldon Whipple16 Sep
  • Re: Per-db binloggingSergei Golubchik15 Sep
    • Re: Per-db binloggingWeldon Whipple16 Sep
      • Re: Per-db binloggingSergei Golubchik16 Sep
        • Re: Per-db binloggingWeldon Whipple16 Sep
          • Re: Per-db binloggingSergei Golubchik17 Sep
          • Re: Per-db binloggingWeldon Whipple14 Dec