From: Sergei Golubchik Date: November 9 2009 8:37am Subject: Re: how does one know if their storage engine supports replication? List-Archive: http://lists.mysql.com/internals/37466 Message-Id: <20091109083703.GA23957@janus.mylan> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Hi, Zardosht! On Nov 08, Zardosht Kasheff wrote: > > I realize this is a bit of a general question, but I am not sure where > to start looking. > > How does one determine if a storage engine supports replication/binary > logging? > > I see that a storage engine must expose the flags > HA_BINLOG_ROW_CAPABLE and HA_BINLOG_STMT_CAPABLE, but I do not see > anything the storage engine needs to actually do. What is the > contract? For other flags, exposing them implies something that the > storage engine can do. For example, setting > HA_PRIMARY_KEY_IN_READ_INDEX implies that the storage engine must be > able to retrieve the primary key when doing a scan of a secondary > index. > > However, for binary logging, I do not see any functions the storage > engine must implement and I cannot find any contract the storage > engine must fulfill. However, the existence of the flags makes me > think that the storage engine must be doing something. > > Any ideas on what I should read or where to look? In InnoDB handler, for example you can see if (tx_isolation <= ISO_READ_COMMITTED) return int_table_flags; return int_table_flags | HA_BINLOG_STMT_CAPABLE; that is it announces support for statement level binlogging only on high isolation levels. Otherwise, indeed, if: trx1> insert t1 select * from t2 where t2.id > 10; trx2> insert into t2 (id) values (20); trx2> commit; trx1> commit; because transactions are written in binlog at commit time, the second transaction would be written *before* the first one: binlog> insert into t2 (id) values (20); binlog> commit; binlog> insert t1 select * from t2 where t2.id > 10; binlog> commit; which, obviously, produces results different from the original sequence. On higher isolation levels trx2 insert would block until the first transaction commits. As you can see, InnoDB prohibits statement level binlogging on low isolation levels because it may produce incorrect results. I didn't find an example of the engine that doesn't support row-level binlogging, though Regards / Mit vielen Grüßen, Sergei -- __ ___ ___ ____ __ / |/ /_ __/ __/ __ \/ / Sergei Golubchik / /|_/ / // /\ \/ /_/ / /__ Principal Software Engineer/Server Architect /_/ /_/\_, /___/\___\_\___/ Sun Microsystems GmbH, HRB München 161028 <___/ Sonnenallee 1, 85551 Kirchheim-Heimstetten Geschäftsführer: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel Vorsitzender des Aufsichtsrates: Martin Häring