From: He Zhenxing
Date: January 18 2011 2:52am
Subject: Re: bzr commit into mysql-5.1-bugteam branch (zhenxing.he:3418)
Bug#42415
List-Archive: http://lists.mysql.com/commits/129010
Message-Id: <1295319157.1630.14.camel@hezx-laptop>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Sven Sandberg wrote:
> Hi Zhenxing,
>
> This is a partial review, let's discuss it more on IRC.
>
> This is a very difficult bug and you did a very detailed and analysis.
> You also implemented the complicated algorithm that solves it.
>
Thank you!
> There is still the the unresolved issue of how we can call fix_fields()
> before decide_logging_format(). I also have three suggestions to improve
> the implementation:
>
> (1) It seems you added a condition that marks the table as ordered
> when there is an ORDER BY on every column. I may be wrong but I think
> this is not correct.
>
I think you're right, case insensitive columns will be a problem.
> (2) I think you have implemented the algorithm I suggested in the bug
> report on 29 Oct. Please see my comment on 3 Nov: there is a more
> efficient algorithm.
>
> (3) This implementation does a lot of allocation. I think it's
> possible to optimize away that by using a more compact graph representation.
>
Agree, will try to work on your suggestions.
>
> On 12/09/2010 05:18 AM, He Zhenxing wrote:
> [...]
> > === modified file 'sql/sql_select.cc'
> > --- a/sql/sql_select.cc 2010-05-27 15:13:53 +0000
> > +++ b/sql/sql_select.cc 2010-12-09 04:18:42 +0000
> [...]
> > @@ -17162,6 +17178,457 @@ bool JOIN::change_result(select_result *
> [...]
> > +/*
> > + And-or graph truth propogation algorithm is used to calculate if the
> > + statement is ordered or not.
> > +
> > + Nodes:
> > + JoinNode - And node, this is the root, successors are a list of
> > + ConstOrderedTableNode.
> > + ConstOrderedTableNode - Or node, have two TableNode as successors,
> > + one for ordered table, and the other for constant table.
> > + TableNode - Or node, have a list of KeyNode and one AllColumnsNode
> > + as successors.
> > + KeyNode - And node, successors are a list of ColumnNode that corresponding
> > + to the fields of the key.
> > + AllColumnsNode - And node, successors are a list of ColumnNode of all
> > + fields of the table.
> > + ColumnNode - Or node, represent a field of the table, it will take a
> > + TableNode as a successor, which means if a table is true
> > + (const or ordered), the all its columns are const or ordered.
> > + Successors to other column nodes will be added mutually if
> > + there is an equation to that field in the condition. The
> > + column nodes for fields that are in the ORDER BY or are
> > + equal to constants are added to the init_nodes of the JoinNode,
> > + which is used as the initial true values of the propogation.
>
> (1) If I understand correctly, you have added the following condition
> (the AllColumnsNode):
>
> "If there are ORDER BY clauses that cover all columns, then the table is
> marked as ordered".
>
> However, I think this is not correct. ORDER BY clauses only make the
> table deterministic if there is a key. For example, a column may be
> ordered using a case-insensitive collation. An example:
>
> CREATE TABLE t1 (a TEXT COLLATE utf8_general_ci);
> CREATE TABLE t2 (a TEXT);
> INSERT INTO t1 VALUES ("a"), ("A");
> INSERT INTO t2 SELECT * FROM t1 LIMIT 1;
>
>
> [...]
> +bool JoinNode::is_ordered()
> +{
> + List nodes(init_nodes);
> + List_iterator it(nodes);
> + Node *node;
> +
> + while ((node= it++))
> + {
> + node->todo--;
> + if (node->todo == 0)
> + {
> + if (node == this)
> + return true;
> + List_iterator pit(node->predecessors);
> + Node *pnode;
> + while ((pnode= pit++))
> + {
> + if (pnode->todo)
> + nodes.push_back(pnode);
> + }
> + }
> + }
> + return false;
> +}
>
> (2) I think the above implements the algorithm from my comment on the
> bug on 29 Oct. Please see the comment from 3 Nov: I think that algorithm
> has better space requirements.
>
>
> (3) This implementation has to allocate two objects for every node in
> the graph (the Node object and a List object). I also think that
> the List class probably allocates one object per element; in that case
> there is also one allocated object per edge in the graph and one for
> each iteration of the algorithm. I think this may add both time and
> space overhead that is significant. I think we can reduce this to only 4
> allocations altogether, and at most 12 bytes per node + 4 bytes per
> edge. This should reduce both time and space requirements. Here's the idea:
>
> - Each node is identified by an integer. The root node is number 0.
>
> - The only data we need for each node is one integer: the "todo"
> field.
>
> - The graph structure can be represented using two integer lists. The
> first list contains all successor lists concatenated into one big
> flat list. The second list is indexed by node numbers and gives for
> each node the offset in the first list where the node's successors
> are stored.
>
> - The set S is represented as a list. Elements are added and removed
> from S in a stack-like way, i.e., pushed to the end when added and
> popped from the end when removed.
>
>
> So the following data types can be used:
>
> /*
> The number of nodes in the graph.
> */
> int n_nodes;
> /*
> The 'todo' array is indexed by node numbers. If N is a node, then
> todo[N] contains the number of successors of N that need to be
> marked deterministic before N is marked true.
> */
> int *todo;
> /*
> Lists the successors of all nodes. This is a big flat list:
> first all successors of node 0, then all successors of node 1,
> etc.
> */
> int *successor;
> /*
> If N is a node, then node_successor[N] contains the index into
> the 'successor' array where the first successor of N is stored.
> */
> int *node_successor;
> /*
> The 'dirty_nodes' array is a stack used during the evaluation
> of the and-or graph. It contains all nodes that have been marked
> as true where the truth value has not yet been propagated to
> the successors. 'n_dirty_nodes' is the number of elements
> in 'dirty_nodes'.
> */
> int n_dirty_nodes;
> int *dirty_nodes;
>
>
> The algorithm can then be implemented like:
>
> while (n_dirty_nodes)
> {
> int node= dirty_nodes[--n_dirty_nodes];
> int i_end= node_successor[node + 1];
> for (int i= node_successor[node]; i < i_end; i++) {
> int succ= successor[i];
> if (todo[succ]) {
> todo[succ]--;
> if (todo[succ] == 0) {
> if (succ == 0) // the root node
> return TRUE;
> dirty_nodes[n_dirty_nodes++]= succ;
> }
> }
> }
> }
> return FALSE;
>
>
> I think these data structures can be initialized as follows (this is
> just a sketch):
>
> 1. Count the number of nodes of each type.
> 2. Allocate todo (n_nodes elements), node_successor (n_nodes+1
> elements), and dirty_nodes (n_nodes elements).
> 3. Count the number of successors for each node. Store this temporary
> number in node_successor.
> 4. Allocate successor.
> 5. Initialize node_successor:
> int ofs= 0;
> for (int i= 0; i <= n_nodes; i++) {
> int n_succ= node_successor[i];
> node_successor[i]= ofs;
> ofs += n_succ;
> }
> 6. Fill successor with elements. During this algorithm, we need to
> store the number of successors of each node that have been added
> to the array. This number can be stored in dirty_nodes.
> 7. Initialize dirty_nodes.
>
> /Sven
>