List: Internals
From: Weldon Whipple
Date: September 17 2010 1:04am
Subject: Re: Seeking command syntax feedback (Still Re: Per-db binlogging)
On Thu, Sep 16, 2010 at 6:15 PM, Davi Arnaut <Davi.Arnaut@stripped> wrote:
> On 9/16/10 6:28 PM, Weldon Whipple wrote:

<snip/>

>> I'll try to implement the START, SHOW and STOP commands in 5.5, and
>> see how it works. (On second thought, maybe I'll try to make
>> binlog-do-db a global variable in lieu of the START and STOP commands,
>> since I haven't tried it yet in this project. I guess I'll keep SHOW.)
>>
>> Hmmm... I wonder how a list of binlog-do-db's would work as a global
>> variable. ... And how would I detect that a db's binlogging is to be
>> stopped by examining the binlog-do-db global variable? (Would that db
>> just be missing from the list? I don't feel comfortable using
>> binlog-ignore-db to turn it off. ... If I wanted to subsequently turn
>> on per-db binlogging for that database, I'd have to remove it from the
>> binlog-ignore-db and add it back into binlog-do-db ...)
>
> If there are going to be dozens of different dbs to binlog, I would suggest
> using a table. Otherwise just keep a global string variable which is guarded
> by a mutex. See sys_vars.cc for examples of variables (grep for
> Sys_var_charptr) or use the plugin API.
>
> The system variable class also has hooks for reading and updating the
> variable, which means you can parse the string and store it in a more
> convenient format.. like in a hash table or something like.

Yes, we could end up having dozens of different databases being
binlogged--each destined for migration to another physical server.
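
A minimal sketch of the "global string guarded by a mutex" idea from the
quoted reply, assuming a comma-separated list of database names. This is
standalone C++, not the sys_vars.cc or plugin API; the class name
BinlogDoDbList and its methods are hypothetical. The update() method plays
the role of the update hook that parses the string into a more convenient
form, and in this sketch a database whose binlogging is stopped is simply
absent from the list.

```cpp
#include <cassert>
#include <mutex>
#include <sstream>
#include <string>
#include <unordered_set>

// Hypothetical holder for a binlog-do-db style list (not a MySQL class).
class BinlogDoDbList {
public:
  // Stand-in for an "on update" hook: parse "db1, db2" into a set.
  void update(const std::string &value) {
    std::unordered_set<std::string> parsed;
    std::istringstream in(value);
    std::string name;
    while (std::getline(in, name, ',')) {
      // Trim surrounding spaces and skip empty entries.
      const std::string::size_type first = name.find_first_not_of(' ');
      if (first == std::string::npos) continue;
      const std::string::size_type last = name.find_last_not_of(' ');
      parsed.insert(name.substr(first, last - first + 1));
    }
    // Parse outside the lock, then publish both forms under the mutex.
    std::lock_guard<std::mutex> guard(mutex_);
    raw_ = value;
    dbs_.swap(parsed);
  }

  // True if the database is currently in the list.
  bool is_logged(const std::string &db) const {
    std::lock_guard<std::mutex> guard(mutex_);
    return dbs_.count(db) != 0;
  }

  // Stand-in for a "read" hook: return the string as last set.
  std::string value() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return raw_;
  }

private:
  mutable std::mutex mutex_;
  std::string raw_;
  std::unordered_set<std::string> dbs_;
};
```

With this shape, turning binlogging on or off for one database means
setting a new list with that name added or removed; no separate
ignore list is involved.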

When I came on board at my current company and was assigned to MySQL
nearly two years ago, my first exposure was to Google's userstatv2 and
Percona's related patches. My first assignments involved merging, then
enhancing those patches. They make heavy use of a hash, which is
manifested by a table in INFORMATION_SCHEMA, formatted in sql_show.cc.
Since that time I've implemented quite a few "hacks," many of them
implemented the same way as USER_STATISTICS, CLIENT_STATISTICS,
TABLE_STATISTICS and INDEX_STATISTICS.

I had thought of doing it one more time with per-db binlogging: Using
a hash that contains information about each database that is being
logged. In several of the prototypes, the struct (defined in
structs.h--just like with userstatv2) also contained a reference to
the DB_BINLOG instance (inherited from MYSQL_BIN_LOG). Each instance
included a read/write mutex that was requested/released on entrance
to/exit from each member function (method). (Hmmm... I MIGHT have used
a lock that already existed in MYSQL_BIN_LOG, had I noticed one.)

The HASH was part of a singleton DB_BINLOG_MGR, which handled all
DB_BINLOG requests.
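
For illustration, a rough standalone sketch of that shape: a singleton
manager that owns a hash of per-database entries behind a read/write lock.
It assumes C++17 (std::shared_mutex) rather than the server's own HASH and
lock types, and the names DbBinlogMgr, DbBinlogInfo, start, stop, show and
record_event are hypothetical, not the prototype's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical per-database record kept in the hash.
struct DbBinlogInfo {
  std::uint64_t events_written = 0;
};

// Hypothetical singleton through which every per-db request passes.
class DbBinlogMgr {
public:
  static DbBinlogMgr &instance() {
    static DbBinlogMgr mgr;  // constructed once, on first use
    return mgr;
  }

  // START: returns false if the database is already being logged.
  bool start(const std::string &db) {
    std::unique_lock<std::shared_mutex> lock(mutex_);
    return entries_.emplace(db, DbBinlogInfo{}).second;
  }

  // STOP: returns false if the database was not being logged.
  bool stop(const std::string &db) {
    std::unique_lock<std::shared_mutex> lock(mutex_);
    return entries_.erase(db) == 1;
  }

  // Count one event for a logged database (takes the write lock).
  bool record_event(const std::string &db) {
    std::unique_lock<std::shared_mutex> lock(mutex_);
    auto it = entries_.find(db);
    if (it == entries_.end()) return false;
    ++it->second.events_written;
    return true;
  }

  // SHOW: sorted names of logged databases (takes the read lock).
  std::vector<std::string> show() const {
    std::shared_lock<std::shared_mutex> lock(mutex_);
    std::vector<std::string> names;
    for (const auto &entry : entries_) names.push_back(entry.first);
    std::sort(names.begin(), names.end());
    return names;
  }

private:
  DbBinlogMgr() = default;
  DbBinlogMgr(const DbBinlogMgr &) = delete;
  DbBinlogMgr &operator=(const DbBinlogMgr &) = delete;

  mutable std::shared_mutex mutex_;
  std::unordered_map<std::string, DbBinlogInfo> entries_;
};
```

Every writer serializes on the one manager lock here, which is the kind of
contention the bottleneck question below is about.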

It seems to work logically, but I wondered if the DB_BINLOG_MGR might
be a bottleneck. One of my more recent prototypes had a pointer in THD
that referenced the current instance of DB_BINLOG (that is referenced
in the HASH). I have to admit that I haven't thought it through
completely. I wanted to handle the case where the client issues a STOP
BINLOG request but threads might still be active for the specified
database. Rather than just delete the DB_BINLOG instance (which would
leave THDs pointing to unallocated memory somewhere), I thought of
having a flag that indicates if the instance is active or not. (The
STOP BINLOG request would indicate that the database should be no
longer binlogged.) That still seemed to have problems, so I considered
reference counting, with the DB_BINLOG instances committing suicide
when the reference count reached 0.
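
A small sketch of how the active flag and reference counting could fit
together, using std::shared_ptr as the reference count. It is standalone
C++ rather than server code; DbBinlog, DbBinlogRegistry and their methods
are hypothetical names, and a session's shared_ptr stands in for the THD
pointer. The check in write_event() is not atomic with the write itself,
so a real implementation would still need a lock around the actual write.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical per-database log object.
class DbBinlog {
public:
  explicit DbBinlog(std::string db) : db_(std::move(db)) {}

  bool active() const { return active_.load(); }
  void deactivate() { active_.store(false); }

  // Refuse new events once STOP has marked the instance inactive.
  bool write_event() {
    if (!active()) return false;
    ++events_;
    return true;
  }

private:
  std::string db_;
  std::atomic<bool> active_{true};
  std::atomic<unsigned long> events_{0};
};

// Hypothetical registry; it holds one reference per logged database.
class DbBinlogRegistry {
public:
  // START: create the instance if needed and hand out a reference.
  std::shared_ptr<DbBinlog> start(const std::string &db) {
    std::lock_guard<std::mutex> guard(mutex_);
    std::shared_ptr<DbBinlog> &slot = logs_[db];
    if (!slot) slot = std::make_shared<DbBinlog>(db);
    return slot;
  }

  // Returns nullptr if the database is not being logged.
  std::shared_ptr<DbBinlog> find(const std::string &db) const {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = logs_.find(db);
    return it == logs_.end() ? nullptr : it->second;
  }

  // STOP: mark inactive and drop the registry's reference. Sessions that
  // still hold a reference keep the object alive until they release it.
  bool stop(const std::string &db) {
    std::lock_guard<std::mutex> guard(mutex_);
    auto it = logs_.find(db);
    if (it == logs_.end()) return false;
    it->second->deactivate();
    logs_.erase(it);
    return true;
  }

private:
  mutable std::mutex mutex_;
  std::unordered_map<std::string, std::shared_ptr<DbBinlog>> logs_;
};
```

In this arrangement no session is ever left with a dangling pointer: the
instance is destroyed only when the last holder lets go of it.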

I've even considered never releasing DB_BINLOG instances until mysqld
ends. (That doesn't seem like a good thing.)

Unless someone changes my mind, I'll probably forgo the THD pointer
and see how much of a bottleneck there is. (If there IS a bottleneck,
I'll likely use reference counting.)

I will DEFINITELY study the sys_vars.cc code you mention. (It occurs
to me that I've gotten into a major rut with the hash/sql_show.cc way
of doing things. ...)

Thanks again for the tips! I'll definitely look at them!

>
> Regards,
>
> Davi

Weldon

P.S. I'm certain there must be other ways of making tables in
INFORMATION_SCHEMA. After all, there are MANY tables there, and only a
few of them show up in sql_show.cc. Would it be better not to use the
HASH/sql_show.cc method?