List:Falcon Storage Engine« Previous MessageNext Message »
From:Kevin Lewis Date:October 21 2008 2:56pm
Subject:bug#23692 Solutions
View as plain text  
 >Lars-Erik wrote;
> ---------------------------------------------------------
> 2. What do you plan to accomplish by next time we meet?
> ---------------------------------------------------------
> * Continue to work on, and hopefully have a fix for bug#34479 and bug#23692
>     - This includes chatting/mailing with Kevin about the possible solutions

Lars-Erik, Jim, Ann,

Last Friday in our meeting, Jim mentioned that we should compare char 
and varchar key values as if they have an imaginary trailing blank after 
them.  This will work only if it happens when adding keys to the index 
in addition to searching the index.  Also, it must happen when sorting 
and searching in DeferredIndexes.

For example, if these values are found in the index in the order of what 
is actually stored, they would  be (and currently are) like this;

notation  [this means the character is trimmed for the index]

0x 43
0x 43 [20]
0x 43 09
0x 43 43

Consider;  SELECT hex(a) FROM t1 WHERE a > 0x43;

If we implicitly add a 0x20 after the search key and the indexed key, 
then we might find the 0x43 values, but the next key value will be 43 
09, which is less than the current position.  In other words, it will 
seem like the index is out of order.

But if we sort the keysin the index as if they contained that trailing 
space, then they would look like this;

notation  {this means the character is assumed but missing}

0x 43 09
0x 43 {20}
0x 43 {[20]}
0x 43 43

Now the index keys are in SQL order and can correctly return both
       SELECT hex(a) FROM t1 WHERE a > 0x43;
and   SELECT hex(a) FROM t1 WHERE a < 0x43;

In Falcon, key comparison is done on a byte by byte manner independent 
of what type of key it is.  That is because all key values, whether 
number or string, are converted into a form that can be compared 
byte-wise left to right.  This is a great advantage when comparing 
multi-column indexes of various types because the code is much simpler. 
  The strict field type comparisons happen later when we are searching 
each record.  The solution Jim suggested Friday would necessitate that 
we determine where in a multicolumn key a char or varchar field starts 
and ends in order to know where to implicitly add a space character. 
Unless someone can correct my thinking here, I do not think that implied 
but not stored spaces in indexes is going to work for us.

The only solution to this dilema is to actually add the trailing space 
character to the index.

notation  (this means the character is added for correct sorting)

0x 43 09
0x 43 (20)
0x 43 20
0x 43 43

Note that the trainiling space is added only to the index values.  Not 
the record.

If we do this, I think it must be done as a selectable feature that can 
be turned off for 'better' performance.  And we need to see what that 
performance difference is.

Kevin
Thread
bug#23692 SolutionsKevin Lewis21 Oct
  • RE: bug#23692 SolutionsVladislav Vaintroub21 Oct
    • Re: bug#23692 SolutionsAnn W. Harrison21 Oct
    • Re: bug#23692 SolutionsKevin Lewis21 Oct
      • RE: bug#23692 SolutionsVladislav Vaintroub21 Oct