List: Falcon Storage Engine
From: Ann W. Harrison  Date: December 3 2008 5:45pm
Subject:Re: Please review fox for bug#34479
Expanding on what Kevin Lewis wrote:

>>> Bar wrote:
>>> Sorry, I'm not strong with Falcon internals,
>>> so I don't know why you need to trim trailing minSortChar.
>>> This makes MySQLCollation::compare() work differently from
>>> how collation really works.
>>>
>>> Can you please give some insight for this?
> 
>  > Lars-Erik Bjørk wrote:
>> I think I will pass that one on to somebody else:) Maybe you could
>> explain this briefly, Kevin?
> 
> The Falcon internal encoded record does not store trailing white space. 
>  Jim Starkey has declared many times that he is on a mission to replace 
> the use of char[anything], varchar[anything], etc with just 'string'. 
> Falcon does that internally.  I also see no reason to store what does 
> not matter.

To clarify slightly, Falcon removes trailing spaces from strings; it
does not remove trailing tab characters, which are also often called
"white space".

One problem with removing trailing spaces is that some values that
can appear in strings sort lower than the space character, and the SQL
standard says that in a string comparison the shorter string is to be
padded with spaces to the length of the longer string.  As long
as all strings are space padded to their full length, that doesn't
matter - the comparisons work naturally.

However, when comparing strings of different lengths, the right
answer is less obvious.  Logically it would seem that a two character 
string sorts lower than any three character string that starts with
the same two characters.  Logic, SQL, and the mother of all character
sets, ASCII, aren't a good combination.

The correct order of these strings (in most collations) is...

    ab<0x0>
    ab<tab>
    ab
    aba
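
The ordering above can be checked with a small sketch. This is only an
illustration in Python, assuming a plain byte-wise (ASCII) collation; the
name pad_space_compare is made up for the example and is not a Falcon or
MySQL function.

```python
from functools import cmp_to_key


def pad_space_compare(a: bytes, b: bytes) -> int:
    """Compare byte strings after space-padding the shorter one.

    Hypothetical helper: models the SQL rule described above for a
    simple byte-wise collation only.
    """
    width = max(len(a), len(b))
    pa, pb = a.ljust(width, b" "), b.ljust(width, b" ")
    return (pa > pb) - (pa < pb)


values = [b"aba", b"ab", b"ab\t", b"ab\x00"]

# Plain byte comparison puts the shorter string first.
naive_order = sorted(values)

# Space-padded comparison puts 'ab' after 'ab<0x0>' and 'ab<tab>'.
padded_order = sorted(values, key=cmp_to_key(pad_space_compare))

print(naive_order)
print(padded_order)
```

Under the padded comparison, 'ab' and 'ab' followed by any number of
spaces compare as equal, which is why dropping trailing spaces loses
nothing.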

For a long time, Falcon got that wrong in indexes and considered
'ab' to be less than 'ab<0x0>'.  Recently, Kevin had a brilliant
idea.  When handling a string key value, both for storage and for
comparison, add an imaginary space to it.  There's no reason to
store the space as long as everyone behaves as if it were stored.

So now the values above are treated as if they were:

    ab<0x0><0x20>
    ab<tab><0x20>
    ab<0x20>
    aba<0x20>

which, however ugly, is at least clear.
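
As a rough sketch of the idea (not Falcon's actual code), the imaginary
space can be modelled in Python by building a comparison key. The helper
name index_key is hypothetical, and the sketch assumes a byte-wise
collation. Falcon does not store the space; here it is materialized only
so that an ordinary byte comparison can be used.

```python
def index_key(value: bytes) -> bytes:
    """Hypothetical helper: strip trailing spaces (not tabs), then
    append one imaginary space for comparison purposes."""
    return value.rstrip(b" ") + b" "


values = [b"aba", b"ab", b"ab\t", b"ab\x00"]

# Sort by the key with the imaginary space appended.
ordered = sorted(values, key=index_key)

print(ordered)
```

The sketch only checks the four example values above; it makes no claim
about other inputs or other collations.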

Best,

Ann