Expanding on what Ann wrote:
Ann W. Harrison wrote:
> Expanding on what Kevin Lewis wrote:
>
>>>> Bar wrote:
>>>> Sorry, I'm not strong with Falcon internals,
>>>> so I don't know why you need to trim trailing minSortChar.
>>>> This makes MySQLCollation::compare() work differently from
>>>> how collation really works.
>>>>
>>>> Can you please give some insight for this?
>>
>>> Lars-Erik Bjørk wrote:
>>> I think I will pass that one on to somebody else:) Maybe you could
>>> explain this briefly, Kevin?
>>
>> The Falcon internal encoded record does not store trailing white
>> space. Jim Starkey has declared many times that he is on a mission to
>> replace the use of char[anything], varchar[anything], etc with just
>> 'string'. Falcon does that internally. I also see no reason to store
>> what does not matter.
>
> To clarify slightly, Falcon removes trailing spaces from strings; it
> does not remove trailing tab characters, which are also often called
> "white space".
>
> One problem with removing trailing spaces is that some values that
> can appear in strings sort lower than the space character, and the SQL
> standard says that in a string comparison the shorter string is to be
> padded with spaces to the length of the longer string. As long
> as all strings are space padded to their full length, that doesn't
> matter - the comparisons work naturally.
>
> However, when comparing strings of different lengths, the right
> answer is less obvious. Logically it would seem that a two character
> string sorts lower than any three character string that starts with
> the same two characters. Logic, SQL, and the mother of all character
> sets, ASCII, aren't a good combination.
>
> The correct order of these strings (in most collations) is...
>
> ab<0x0>
> ab<tab>
> ab
> aba
>
> For a long time, Falcon got that wrong in indexes and considered
> 'ab' to be less than 'ab<0x0>'. Recently, Kevin had a brilliant
> idea. When handling a string key value, both for storage and for
> comparison, add an imaginary space to it. There's no reason to
> store the space as long as everyone behaves as if it were stored.
>
> So now the values above are treated as if they were:
>
> ab<0x0><0x20>
> ab<tab><0x20>
> ab<0x20>
> aba<0x20>
>
> which, however ugly, is at least clear.
Well, this is not quite implemented yet. The bug is being worked on by
Lars-Erik and does not have a BETA tag. I think it is:
Bug #23692 Falcon: searches fail if data is 0x00
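
For anyone who wants to play with the ordering, here is a minimal
Python sketch of the idea. It is not Falcon code: the function names
are made up for illustration, and it assumes a plain byte-wise
(binary) collation rather than a real MySQL collation. It only
demonstrates the four example values quoted above; it does not claim
the two orderings agree for every possible string.

```python
# Illustrative sketch only; names are hypothetical, not Falcon's API.
# Assumes byte-wise comparison, i.e. a binary collation.

VALUES = [b"aba", b"ab", b"ab\t", b"ab\x00"]


def naive_key(value: bytes) -> bytes:
    """Key with trailing spaces trimmed and nothing added back."""
    return value.rstrip(b" ")


def imaginary_space_key(value: bytes) -> bytes:
    """Trimmed key that behaves as if one trailing space were stored."""
    return value.rstrip(b" ") + b" "


def pad_space_order(values):
    """Reference order: pad every value with spaces to the longest length."""
    width = max(len(v) for v in values)
    return sorted(values, key=lambda v: v.ljust(width, b" "))


if __name__ == "__main__":
    print("pad space:      ", pad_space_order(VALUES))
    print("naive trim:     ", sorted(VALUES, key=naive_key))
    print("imaginary space:", sorted(VALUES, key=imaginary_space_key))
```

With the naive key, 'ab' sorts before 'ab<0x0>' because a shorter byte
string that is a prefix of a longer one compares lower. With the
imaginary space, the byte compared against <0x0> is 0x20, so
'ab<0x0>' and 'ab<tab>' both sort first, as in the list quoted above.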