Hi Libing,
Nice Work.
Please find my review comments below.
Regards,
Luís
STATUS
------
Not Approved. (Requested some changes and clarification).
REQUIRED CHANGES
----------------
RC1. Please, improve the comments. See comments inline.
RC2. Please, make the comments doxygen style compliant.
REQUESTS
--------
R1. Let me see if I got this straight. This is about memory
management. You use the following check you use in
free_nodes_before:
((&(block->nodes[0]) <= node) &&
(&(block->nodes[BLOCK_TRANX_NODES]) >= node))
If a transaction requires more than BLOCK_TRANX_NODES, then
it needs two blocks. If at that time,
free_nodes_before(nodeY) gets called, and we are in the
following scenario:
1. nodeY is logically after nodeX;
2. &nodeY is in the second block;
3. and malloc has assigned the memory for this second
block from some chunk that was freed before and was
lying around in its list of free chunks (this is a
possible implementation of malloc - glibc uses it I
think);
4. and the first block was not taken from the list of
free memory chunks (meaning that it required a call to
brk, for instance).
Wouldn't this mean that for nodeY (which is logically after
nodeX), would be considered before nodeX? Wouldn't this
mess things up?
SUGGESTIONS
-----------
S1. In sql/rpl_handler.cc, maybe we could avoid the changes
regarding the initialization for Trans_param if we have a
default constructor for the structure in sql/replication.h
typedef struct Trans_param {
(...)
Trans_param() : server_id(0),
flags(0), log_file(NULL), log_pos(0) {};
(...)
} Trans_param;
I think that, yet another different approach, with less
changes, would be:
- Trans_param param;
+ Trans_param param= {0};
Have you considered? Your changes are OK, just asking
because in most places three out of four Trans_param members
are initialized and only if FOREACH_OBSERVER is called the
server_id member is initialized... Just feels like a loose
end... Anyway, nothing to be worried about... Your changes
are also fine. I leave it up to your consideration!
DETAILS
-------
See more comments inline.
On Wed, 2010-01-27 at 07:49 +0000, Li-Bing.Song@stripped wrote:
> #At file:///home/anders/work/bzrroot/mysql-5.1-rep-semisync/ based on
> revid:zhenxing.he@stripped
>
> 3127 Li-Bing.Song@stripped 2010-01-27
> BUG#50157 Assertion !active_tranxs_->is_tranx_end_pos(..) in
> ReplSemiSyncMaster::commitTrx
>
> The root cause of the crash is that a TranxNode is freed before it is used.
> A TranxNode is allocated and insertted into the active list each time
> when some log events are written into binlog file and is flushed.
suggestion: each time a log event is written and flushed into
the binlog file.
> The memory for TranxNode is allocted with thd_alloc and will be freed
typo: allocted -> allocated
> after at the end of the statement. The after_commit/after_rollback callback
suggestion: at the end of the statement
> was supposed to be call before the end of each statement and remove the node
> from
to be called
> the active list. However this assumption is not correct in all cases(e.g.
> CREATE
> TEMPORARY ... SELECT), and can cause the memory allocated for TranxNode be
> freed
> before it was removed from the active list. So The TranxNode pointer in the
> active
> list would become a wild pointer and cause the crash.
I would probably improve a bit the problem description... See
my question below.
Question: Is the after_commit/after_rollback hook called at
all in CREATE TEMPORARY ... SELECT ? If so, when and
why does it differ?
> After this patch, The memory for TranxNode will be allocated by my_malloc.
If I understood correctly, the patch introduces a memory manager
for TranxNode which is backed up by my_malloc.
> @ sql/rpl_handler.cc
> params are not initialized.
>
> modified:
> mysql-test/suite/rpl/r/rpl_semi_sync.result
> mysql-test/suite/rpl/t/rpl_semi_sync.test
> plugin/semisync/semisync_master.cc
> plugin/semisync/semisync_master.h
> sql/rpl_handler.cc
[snip]
> === modified file 'plugin/semisync/semisync_master.h'
> --- a/plugin/semisync/semisync_master.h 2009-12-04 01:46:33 +0000
> +++ b/plugin/semisync/semisync_master.h 2010-01-27 07:48:44 +0000
> @@ -20,6 +20,196 @@
>
> #include "semisync.h"
>
> +struct TranxNode {
> + char log_name_[FN_REFLEN];
> + my_off_t log_pos_;
> + struct TranxNode *next_; /* the next node in the sorted list */
> + struct TranxNode *hash_next_; /* the next node during hash collision */
> +};
> +
> +/*
> + TranxNodeAllocator provides memory allocating and freeing methods for
> + TranxNode.
> +
> + BLOCK_TRANX_NODES
> + The number of TranxNodes which are in a block.
Maybe move the definition of Block before this one, so that
BLOCK_TRANX_NODES can be defined on something that is
already known.
> + Block
> + A sequence memory which contain BLOCK_TRANX_NODES TranxNodes.
> + The Block link table is composed of number of Blocks.
> + last_node
> + it always points to the last node in current_block which is in use.
Same with current_block and last_node.
> + current_block
> + Part of its nodes are free and others are used. The Blocks before it are
> + all in use and the blocks after it are all free.
> + allocate_node
> + The pointer of the first free node in current_block is returned to the
> + caller. current_block will move to the next free block when all nodes of
> + it are in use. A new block is allocated and is put into the rear of the
> + Block link table if no block is free. At first, there is no any Block in
> + the Block link table. A Block is allocated and is put into the Block link
> + table;
suggestion: Replace: "At first ..." with "The list starts up empty
(ie, there is no allocated Block)."
> + free_nodes_before(TranxNode *node)
> + All blocks before the block which owneres the node gaving by the caller are
typo: owneres, gaving
> + put into the rear of the Block link table. The block containing
> + the node gaving by the caller is not move to the rear of the Block link
> + table, for some nodes of its are still in use. We only say a block is free
> + if all its node are free. This will waste at most one block, but it is more
> + efficient.
OK, I'm having a hard time to read this. Can you make it simpler?
Can you explain more ? I think I get the picture, but we need to
make this comment more clear.
Suggestion: A block is said to be free if all its nodes are free. As
such, all blocks before the given one ("node") are moved
into the rear of the block link table. The block containing
"node", however, is not. It may still have some other nodes
in use. This will waste at most one block, but it is more
efficient.
> + */
> +#define BLOCK_TRANX_NODES 16
> +class TranxNodeAllocator
> +{
> +public:
> + /*
> + init_nodes
> + How many TranxNodes are usually need.
> + max_reserved_block_num
> + How many blocks are need to cantain the TranxNodes of init_nodes.
typo: cantain -> contain
> + Some free Blocks will be freed if there are too many Blocks in the
> + Block link table.
> + */
> + TranxNodeAllocator(uint init_nodes) :
> + max_reserved_block_num(init_nodes/BLOCK_TRANX_NODES +
> + (init_nodes % BLOCK_TRANX_NODES > 1 ? 2 : 1)),
> + last_node(-1), first_block(NULL), last_block(NULL),
> + current_block(NULL), block_num(0) {}
> +
> + ~TranxNodeAllocator()
> + {
> + Block *block= first_block;
> + while (block != NULL)
> + {
> + Block *next= block->next;
> + free_block(block);
> + block= next;
> + }
> + }
> +
> + TranxNode *allocate_node()
> + {
> + TranxNode *trx_node;
> + Block *block= current_block;
> +
> + if (last_node == BLOCK_TRANX_NODES-1)
> + {
> + current_block= current_block->next;
> + last_node= -1;
> + }
> +
> + if (current_block == NULL && allocate_block())
> + {
> + current_block= block;
> + if (current_block)
> + last_node= BLOCK_TRANX_NODES-1;
> + return NULL;
> + }
> +
> + trx_node= &(current_block->nodes[++last_node]);
> + trx_node->log_name_[0] = '\0';
> + trx_node->log_pos_= 0;
> + trx_node->next_= 0;
> + trx_node->hash_next_= 0;
> + return trx_node;
> + }
> +
> + int free_all_nodes()
> + {
> + current_block= first_block;
> + last_node= -1;
> + free_blocks();
> + return 0;
> + }
> +
> + int free_nodes_before(TranxNode* node)
> + {
> + Block *block;
> + Block *prev_block;
> +
> + block= first_block;
> + while (block != current_block->next)
> + {
Please, add some comments here, to describe the interaction between
the pointers.
> + if (&(block->nodes[0]) <= node &&
> &(block->nodes[BLOCK_TRANX_NODES]) >= node)
> + {
> + if (first_block != block)
> + {
> + last_block->next= first_block;
> + first_block= block;
> + last_block= prev_block;
> + last_block->next= NULL;
> + free_blocks();
> + }
> + return 0;
> + }
> + prev_block= block;
> + block= block->next;
> + }
> +
> + /* Node does not find should never happen */
> + DBUG_ASSERT(0);
> + return 1;
> + }
> +
> +private:
> + uint max_reserved_block_num;
> + int last_node;
> +
> + struct Block {
> + Block *next;
> + TranxNode nodes[BLOCK_TRANX_NODES];
> + };
> + Block *first_block;
> + Block *last_block;
> + Block *current_block;
> + uint block_num;
> +
> + int allocate_block()
> + {
> + Block *block= (Block *)my_malloc(sizeof(Block), MYF(0));
> + if (block)
> + {
> + block->next= NULL;
> +
> + if (first_block == NULL)
> + first_block= block;
> + else
> + last_block->next= block;
> +
> + /* New block always is put at the rear */
> + last_block= block;
> + /* New block always is the current_block */
> + current_block= block;
> + ++block_num;
> + return 0;
> + }
> + return 1;
> + }
> +
> + void free_block(Block *block)
> + {
> + my_free(block, MYF(0));
> + --block_num;
> + }
> +
> + void free_blocks()
> + {
> + if (current_block == NULL || current_block->next == NULL)
> + return;
> +
> + /* One free block is always kept behind the current block */
> + Block *block= current_block->next->next;
> + while (block_num > max_reserved_block_num && block != NULL)
> + {
> + Block *next= block->next;
> + free_block(block);
> + block= next;
> + }
> + current_block->next= block;
Perhaps this ^ is current_block->next->next?
> + if (block == NULL)
> + last_block= current_block;
> + }
> +};
> +
> +
> /**
> This class manages memory for active transaction list.
>
> @@ -31,13 +221,8 @@
> class ActiveTranx
> :public Trace {
> private:
> - struct TranxNode {
> - char log_name_[FN_REFLEN];
> - my_off_t log_pos_;
> - struct TranxNode *next_; /* the next node in the sorted list */
> - struct TranxNode *hash_next_; /* the next node during hash collision */
> - };
>
> + TranxNodeAllocator allocator_;
> /* These two record the active transaction list in sort order. */
> TranxNode *trx_front_, *trx_rear_;
>
> @@ -48,24 +233,22 @@ private:
>
> inline void assert_lock_owner();
>
> - inline TranxNode* alloc_tranx_node();
> -
> inline unsigned int calc_hash(const unsigned char *key,unsigned int length);
> unsigned int get_hash_value(const char *log_file_name, my_off_t log_file_pos);
>
> int compare(const char *log_file_name1, my_off_t log_file_pos1,
> - const TranxNode *node2) {
> + const TranxNode *node2) {
> return compare(log_file_name1, log_file_pos1,
> - node2->log_name_, node2->log_pos_);
> + node2->log_name_, node2->log_pos_);
> }
> int compare(const TranxNode *node1,
> - const char *log_file_name2, my_off_t log_file_pos2) {
> + const char *log_file_name2, my_off_t log_file_pos2) {
> return compare(node1->log_name_, node1->log_pos_,
> - log_file_name2, log_file_pos2);
> + log_file_name2, log_file_pos2);
> }
> int compare(const TranxNode *node1, const TranxNode *node2) {
> return compare(node1->log_name_, node1->log_pos_,
> - node2->log_name_, node2->log_pos_);
> + node2->log_name_, node2->log_pos_);
> }
>
> public:
> @@ -88,7 +271,7 @@ public:
> * 0: success; non-zero: error
> */
> int clear_active_tranx_nodes(const char *log_file_name,
> - my_off_t log_file_pos);
> + my_off_t log_file_pos);
>
> /* Given a position, check to see whether the position is an active
> * transaction's ending position by probing the hash table.
> @@ -99,7 +282,7 @@ public:
> * (file_name, file_position).
> */
> static int compare(const char *log_file_name1, my_off_t log_file_pos1,
> - const char *log_file_name2, my_off_t log_file_pos2);
> + const char *log_file_name2, my_off_t log_file_pos2);
>
> };
>
>
> === modified file 'sql/rpl_handler.cc'
> --- a/sql/rpl_handler.cc 2009-12-04 01:46:33 +0000
> +++ b/sql/rpl_handler.cc 2010-01-27 07:48:44 +0000
> @@ -190,8 +190,8 @@ int Trans_delegate::after_commit(THD *th
> {
> Trans_param param;
> bool is_real_trans= (all || thd->transaction.all.ha_list == 0);
> - if (is_real_trans)
> - param.flags |= TRANS_IS_REAL_TRANS;
> +
> + param.flags = is_real_trans ? TRANS_IS_REAL_TRANS : 0;
See S1. for this ^ .
> Trans_binlog_info *log_info=
> my_pthread_getspecific_ptr(Trans_binlog_info*, RPL_TRANS_BINLOG_INFO);
> @@ -218,8 +218,8 @@ int Trans_delegate::after_rollback(THD *
> {
> Trans_param param;
> bool is_real_trans= (all || thd->transaction.all.ha_list == 0);
> - if (is_real_trans)
> - param.flags |= TRANS_IS_REAL_TRANS;
> +
> + param.flags = is_real_trans ? TRANS_IS_REAL_TRANS : 0;
>
> Trans_binlog_info *log_info=
> my_pthread_getspecific_ptr(Trans_binlog_info*, RPL_TRANS_BINLOG_INFO);
> @@ -228,7 +228,7 @@ int Trans_delegate::after_rollback(THD *
> param.log_pos= log_info ? log_info->log_pos : 0;
>
> int ret= 0;
> - FOREACH_OBSERVER(ret, after_commit, thd, (¶m));
> + FOREACH_OBSERVER(ret, after_rollback, thd, (¶m));
>
> /*
> This is the end of a real transaction or autocommit statement, we
>