Thanks for your reply. See comments inline.
He Zhenxing wrote:
>> (S2) I think the "Result" class is not needed, it is not used in an
>> object-like manner: we only create one object instance and call one
>> method. So it works more like a function. I would therefore suggest to
>> simply use functions instead. I think we can use
>> bitmap_is_set/bitmap_union/bitmap_set instead of the wrapper functions.
> The class Result and the wrapper functions are there to make the code
> neat, easier to understand and maintain. Without the class Result, we
> will have to add some global variables (table, const_bitmap,
> ordered_bitmap) or passing the parameters around the functions. Using
> the wrapper function makes the purpose of the code more clear, and also
> we can easily change there implementation (if we choose something better
> than bitmap).
> And if we will support multi-table, we will have multiple instances of
> this class.
OK, fair enough :-)
>> (S4) Even when there is a WHERE pk_col = constant clause, it's possible
>> to have more than one matching rows in some cases. For example, every
>> number has many string representations:
>> CREATE TABLE t1 (a CHAR(10) PRIMARY KEY);
>> CREATE TABLE t2 LIKE t1;
>> INSERT INTO t2 VALUES (''),('0'),('0.'),('.0'),('0.0'),('00');
>> INSERT INTO t1 SELECT * FROM t2 WHERE a = 0 LIMIT 1;
>> The last statement is actually unsafe, any of the six rows could match.
>> I don't know if it would be enough to check that the constant is of the
>> same type as the column. It's probably best to ask the runtime team
>> about that.
> Right, It looks it's not enough to check that they are the same type, I
> found the following case in check_simple_equality().
> if (const_item &&
> field_item->result_type() == const_item->result_type())
> if (field_item->result_type() == STRING_RESULT)
> CHARSET_INFO *cs= ((Field_str*) field_item->field)->charset();
> if ((cs != ((Item_func *) item)->compare_collation()) ||
> !cs->coll->propagate(cs, 0, 0))
> return FALSE;
> I think we should to do the same, for STRINGs, the collation of the
> field should be used for comparison.
>> (S7) A table can be order-deterministic if there is a PK where some
>> components are ordered by ORDER BY and some are unique because of WHERE
>> col = constant clauses. For example, the following is
>> order-deterministic, but the current algorithm marks it as unsafe:
>> CREATE TABLE t1 (a INT, b INT, PRIMARY KEY(a, b));
>> CREATE TABLE t2 LIKE t1;
>> INSERT INTO t2 SELECT * FROM t1 WHERE a = 1 ORDER BY b;
>> See comments inline.
> Yes, I ignore this for simplicity, I think this use case is not very
> important, will handle this in the patch for trunk.
I don't think this is more complicated than the current implementation -
see my suggestion for (S6)+(S7) inline in previous email.
Also, I think there are quite natural cases where this is needed.
Imagine for instance a database of course schedules. There is a table
where each lecture has one row. There is a PK on (course_id,
lecture_time). Then, it seems quite natural to use
WHERE course_id=X ORDER BY lecture_time LIMIT page_size.
>> ==== Other notes ====
>> (S8) I think you use the term "order deterministic" to denote that the
>> rows are in a deterministic order, in all places except
>> is_unique_key_const_ordered. Also, this doesn't check if a key is order
>> deterministic, it checks if a table is order deterministic. So I think
>> we can rename it to is_table_order_deterministic. See comments inline.
> OK, I think I can remove the function, and move the code to