From: Sergei Golubchik
Date: April 14 2012 2:23pm
Subject: Re: question on storage engines, two phase commits, and fsyncing on commit
List-Archive: http://lists.mysql.com/internals/38499
Message-Id: <20120414142337.GA14182@meddwl.elevennetworks.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

Hi, Zardosht!

You're absolutely right! Indeed, a storage engine is expected to sync
both on prepare and on commit. Your understanding is correct.

But we're going to weaken this requirement in MariaDB (not surprisingly :).
We've discussed the solution just about a month ago, and the
corresponding task is "In Progress".

I'll check what exactly is going on when I get back home (it would've
been nice seeing you, I'm sorry you couldn't make it to Santa Clara this
year).

Regards,
Sergei

On Apr 14, Zardosht Kasheff wrote:
> Hello all,
>
> Storage engines that support two-phase commit implement the functions
> handlerton->prepare and handlerton->commit. Clearly, a storage engine
> must fsync its own log after handlerton->prepare so that if we crash,
> it may report the prepared transaction to MySQL via
> handlerton->recover.
>
> My question is this: Do we need to fsync our log after
> handlerton->commit, or can we somehow be guaranteed that if we do not
> fsync, upon a crash, MySQL will have enough information to call
> handlerton->commit_by_xid on what is a prepared transaction in our
> storage engine?
>
> What I do not understand is what needs to happen after
> handlerton->commit. Ideally, we would like to not have to fsync after
> handlerton->commit so that we save an fsync. However, looking at code,
> it seems that an fsync is necessary, at least in the case where there
> is no binary log. In ha_commit_trans, I see:
>
>     error=ha_commit_one_phase(thd, all) ? (cookie ? 2 : 1) : 0;
>     DBUG_EXECUTE_IF("crash_commit_before_unlog", DBUG_SUICIDE(););
>     if (cookie)
>       if(tc_log->unlog(cookie, xid))
>
> If we crash after tc_log->unlog without having our log fsynced, then
> we do not properly recover on a crash.
>
> So, my questions are:
> - Is the same true for when a binary log exists? Do we need to fsync
>   our log on commit?
> - Is my understanding of the non-binary log case correct in that we
>   need to fsync our log on commit?
>
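
The failure mode described in the quoted message can be sketched as a
toy model. This is not MySQL code: ToyEngine, ToyTcLog, Outcome, and
crash_and_recover are hypothetical names introduced only for
illustration. The sketch assumes that the coordinator's log is durable,
that the engine always syncs on prepare, and that an engine commit
record written without an fsync is lost in the crash.

```cpp
#include <cassert>
#include <set>

// Toy model only; every name below is hypothetical, not a MySQL API.
enum class Outcome { Committed, LostCommit };

struct ToyEngine {
    bool sync_on_commit;           // does commit() fsync the engine log?
    bool prepare_durable = false;  // prepare record survives a crash
    bool commit_durable = false;   // commit record survives a crash

    explicit ToyEngine(bool sync) : sync_on_commit(sync) {}

    // Assumption: prepare is always followed by an fsync.
    void prepare() { prepare_durable = true; }
    // Assumption: a commit record that was not fsynced is lost in a crash.
    void commit() { commit_durable = sync_on_commit; }
};

struct ToyTcLog {
    std::set<int> pending;  // xids the coordinator still remembers (durable)
    void log_xid(int xid) { pending.insert(xid); }
    void unlog(int xid) { pending.erase(xid); }
};

// Runs prepare -> log_xid -> commit -> (optional unlog) -> crash -> recovery
// and reports what happens to the transaction.
Outcome crash_and_recover(bool sync_on_commit, bool crash_after_unlog) {
    ToyEngine engine(sync_on_commit);
    ToyTcLog tc;
    const int xid = 1;

    engine.prepare();
    tc.log_xid(xid);
    engine.commit();
    if (crash_after_unlog) tc.unlog(xid);

    // Crash here: only durable state is visible to recovery.
    assert(engine.prepare_durable);
    if (engine.commit_durable) return Outcome::Committed;

    // The engine reports xid as prepared. The coordinator can only ask
    // for a commit by xid if it still remembers the transaction.
    if (tc.pending.count(xid) != 0) return Outcome::Committed;
    return Outcome::LostCommit;  // forgotten by the coordinator, rolled back
}
```

Under these assumptions, crash_and_recover(false, true) returns
Outcome::LostCommit: the engine skipped the fsync on commit, and the
coordinator had already forgotten the transaction when the crash
happened. The other three combinations return Outcome::Committed.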