List:Falcon Storage Engine« Previous MessageNext Message »
From:Alexander Barkov Date:December 3 2008 10:08am
Subject:Re: Please review fox for bug#34479
View as plain text  
Hi Lars-Erik,

I'm adding Svoj in CC because I know he's fixing
a very similar bug at the moment.

I have taken a look into
http://lists.mysql.com/commits/60380
which I guess is an improved version of
http://lists.mysql.com/commits/60379
originally mentioned in your letter.

I noticed three things:

1.
Sorry, I'm not strong with Falcon internals,
so I don't know why you need to trim trailing minSortChar.
This makes MySQLCollation::compare() work differently from
how collation really works.

Can you please give some insight for this?


2.
Unfortunately, MySQLCollation::computeKeyLength() won't work
in some cases because it scans from the end of the string.

Some Asian character sets, like can have bytes in ASCII range
in the second byte of a multi-byte character.
For example, in SJIS:

#define issjishead(c) ((0x81<=(c) && (c)<=0x9f) || \
                        ((0xe0<=(c)) && (c)<=0xfc))
#define issjistail(c) ((0x40<=(c) && (c)<=0x7e) || \
                        (0x80<=(c) && (c)<=0xfc))

For example, if you scan from the end, you can never guess
what is byte 0x40:
- either a single byte character (@)
- or the second byte of a two-byte character.
The same problem exists for all bytes 0x40..0x7e.

So only scanning from the beginning guarantees ambiguity for
character sets like SJIS.



3.

int falcon_cs_is_binary (void *cs)
{
   return (0 == strcmp(((CHARSET_INFO*) cs)->name, "binary"));
}

You can modify the above code to check to this:

return ((CHARSET_INFO*) cs)->number == 63;




Lars-Erik Bjørk wrote:
> Hi Alexander (bar)!
> 
> I am cc'ing you on this one, in case you have an opinion on the
> matter :)
> 
> /Lars-Erik
> 
> On Tue, 2008-12-02 at 15:18 +0100, Lars-Erik Bjørk wrote:
>> On Tue, 2008-12-02 at 15:08 +0100, Vladislav Vaintroub wrote:
>>> Lars-Erik, while I admit that is it a good job, except for taking slightly
>>> non-MBCS compliant handling of minSortChar.
>>>
>> Ahem, yes :) I forgot this one
>>
>>> But, I do believe  there are already too many workarounds in MySQLCollation
>>> .
>>> The original problems are not documented anywhere  and comments are missing
>>> , as usual. Thus I think for sake of portability, readability and other
>>> abilities a rewrite of this stuff using Mr. MySQL-Unicode( aka Bar) as
>>> consultant would  be good. I bet almost all problems could be fixed using
>>> functions strings library already provides (this stuff might be not well
>>> documented as well, but it works) and there is no reasons to
>>> reinvent/reimplement stuff.
>>>
>> If Bar knows a better, and less complex way using the strings library,
>> then I am all for that! :)
>>
>>> -First, why do we need computeKeyLength at all? What does it work around? 
>>> -Which role does minSortChar play? Why are there attempts to trim it
>>> together with the blanks?
>>> -We don't we use charsetinfo->lengthsp() to trim blanks. Why not?
>>> -We use some ad-hoc technique to trim strings. Why?
>>> -why there are so many if (isBinary) in MySQLCollation?
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: Lars-Erik.Bjork@stripped [mailto:Lars-Erik.Bjork@stripped]
>>>> Sent: Tuesday, December 02, 2008 2:21 PM
>>>> To: falcon@stripped
>>>> Subject: Please review fox for bug#34479
>>>>
>>>> The fix is available at:
>>>>
>>>> http://lists.mysql.com/commits/60379
>>>>
>>>> The function MySQLCollation::computeKeyLength(...) is no longer
>>>> inlined,
>>>> because it would clutter the header file. Also, some of you (Vlad? :))
>>>> may have problems with using a couple of the CHARSET_INFO struct's
>>>> members 'directly', but this is already done elsewhere :)
>>>>
>>>> Please enjoy! ;)
>>>>
>>>> Best regards,
>>>>
>>>> Lars-Erik
>>>>
>>>>
>>>> --
>>>> Falcon Storage Engine Mailing List
>>>> For list archives: http://lists.mysql.com/falcon
>>>> To unsubscribe:    http://lists.mysql.com/falcon?unsub=1
>>>
>>
> 

Thread
Please review fox for bug#34479Lars-Erik Bjørk2 Dec
  • RE: Please review fox for bug#34479Vladislav Vaintroub2 Dec
    • RE: Please review fox for bug#34479Lars-Erik Bjørk2 Dec
      • RE: Please review fox for bug#34479Lars-Erik Bjørk2 Dec
        • Re: Please review fox for bug#34479Alexander Barkov3 Dec
          • Re: Please review fox for bug#34479Lars-Erik Bjørk3 Dec
            • Re: Please review fox for bug#34479Kevin Lewis3 Dec
              • Re: Please review fox for bug#34479Ann W. Harrison3 Dec
    • Re: Please review fox for bug#34479Ann W. Harrison2 Dec
      • RE: Please review fox for bug#34479Vladislav Vaintroub2 Dec
Re: Please review fox for bug#34479Kevin Lewis3 Dec
  • Re: Please review fox for bug#34479Alexander Barkov3 Dec
    • Re: Please review fox for bug#34479Ann W. Harrison3 Dec
      • Re: Please review fox for bug#34479Lars-Erik Bjørk4 Dec
        • RE: Please review fox for bug#34479Vladislav Vaintroub4 Dec
        • Re: Please review fox for bug#34479Lars-Erik Bjørk4 Dec
          • RE: Please review fox for bug#34479Vladislav Vaintroub4 Dec
          • Re: Please review fox for bug#34479Ann W. Harrison4 Dec
        • Re: Please review fox for bug#34479Ann W. Harrison4 Dec
          • Re: Please review fox for bug#34479Lars-Erik Bjørk5 Dec
            • Re: Please review fox for bug#34479Lars-Erik Bjørk5 Dec
        • Re: Please review fox for bug#34479Alexander Barkov8 Dec
          • RE: Please review fox for bug#34479Vladislav Vaintroub8 Dec
            • Re: Please review fox for bug#34479Alexander Barkov8 Dec
            • Re: Please review fox for bug#34479Ann W. Harrison8 Dec
Re: Please review fox for bug#34479Ann W. Harrison4 Dec