List:Internals« Previous MessageNext Message »
From:Sergei Golubchik Date:May 19 2009 2:56pm
Subject:Re: character fields in records_in_range
View as plain text  
Hi, Timothy!

On May 18, Timothy P Clark wrote:
> Sergei Golubchik <serg@stripped> wrote on 05/18/2009 04:34:35 AM:
> > On May 15, Timothy P Clark wrote:
> > > 
> > > We're trying to figure out the best way to determine a key field's
> > > non-padded length in records_in_range().
> > > 
> > > Our storage engine (IBM DB2) stores key values in a format that is
> > > different from MySQL's. One consequence of this is that we need to
> > > be able to determine in handler::records_in_range() the true
> > > length of a padded character field in a key.
> > 
> > Why do you need to do that ?
> The basic answer is because this is how DB2 requires us to construct the 
> structure that we pass to get the estimate required by records_in_range.

I don't understand that. To get the estimates for records_in_range you
need to pass in a range. Passing in a truncated value will generally
result in the incorrect range, won't it ?

> The pad values look like invalid data to the DB2 interface, so we have to 
> pass the length of the valid data.
> Furthermore, since the storage engine often performs character conversion 
> (ASCII<->EBCDIC or UTF8<->UCS2 or Multi-byte<->UCS2) between MySQL
> and 
> DB2, the pad character may end up with an entirely different codepoint 
> when passed to DB2, thus losing the min/max value property that has within 
> MySQL. So even if DB2 accepted the pad character, there is no guarantee 
> that it would have the same semantic as is intended for records_in_range.

If the collation in DB2 significantly different from what MySQL assumes
(which the the only explanation I can come up with for "invalid data"
and "losing the min/max value property"), then I suspect that
records_in_range() for LIKE is the least of your problems :)

You should have problems with ORDER BY and GROUP BY, unless you pretend
that your indexes are not ordered. You'll be able to do no ranges at
all.  Even equalities aren't guaranteed to work, I'm afraid.

Do ORDER BY and, say, BETWEEN ranges work ? If yes - LIKE should work
automatically. If they are not, I think that fixing LIKE is wrong, it's
just patching up the sympthoms.
Regards / Mit vielen Grüßen,

   __  ___     ___ ____  __
  /  |/  /_ __/ __/ __ \/ /   Sergei Golubchik <serg@stripped>
 / /|_/ / // /\ \/ /_/ / /__  Principal Software Engineer/Server Architect
/_/  /_/\_, /___/\___\_\___/  Sun Microsystems GmbH, HRB München 161028
       <___/                  Sonnenallee 1, 85551 Kirchheim-Heimstetten
Geschäftsführer: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel
Vorsitzender des Aufsichtsrates: Martin Häring
character fields in records_in_rangeTimothy P Clark15 May
  • Re: character fields in records_in_rangeSergei Golubchik18 May
    • Re: character fields in records_in_rangeTimothy P Clark18 May
      • Re: character fields in records_in_rangeSergei Golubchik19 May
        • Re: character fields in records_in_rangeTimothy P Clark20 May
          • Re: character fields in records_in_rangeSergei Golubchik20 May
            • Re: character fields in records_in_rangeTimothy P Clark21 May
  • re: character fields in records_in_rangeMichael Widenius19 May
    • re: character fields in records_in_rangeTimothy P Clark20 May
      • Re: re: character fields in records_in_rangeSergei Golubchik20 May
        • Re: re: character fields in records_in_rangeTimothy P Clark20 May
          • Re: re: character fields in records_in_rangeSergei Golubchik21 May
            • Re: re: character fields in records_in_rangeTimothy P Clark21 May