List:Internals« Previous MessageNext Message »
From:Timothy P Clark Date:May 21 2009 4:27pm
Subject:Re: character fields in records_in_range
View as plain text  
Hi Sergei,

Sergei Golubchik <serg@stripped> wrote on 05/20/2009 03:22:11 PM:

> Hi, Timothy!
> 
> On May 20, Timothy P Clark wrote:
> > > 
> > > You should have problems with ORDER BY and GROUP BY, unless you
> > > pretend that your indexes are not ordered. You'll be able to do no
> > > ranges at all.  Even equalities aren't guaranteed to work, I'm
> > > afraid.
> > > 
> > > Do ORDER BY and, say, BETWEEN ranges work ? If yes - LIKE should
> > > work automatically. If they are not, I think that fixing LIKE is
> > > wrong, it's just patching up the sympthoms.
> > Yes, these do work, but that is because the actual I/O operations 
(i.e. 
> > index_read) run through a much different set of interfaces than 
> > records_in_range.
> 
> You should be able to trigger records_in_range() also without LIKE -
> with BETWEEN or inequalities.

OK, right. I misunderstood your question to mean "do queries with 
BETWEEN/ORDER BY/GROUP BY/etc. return correct *results*?" Which they do, 
and for all character sets.

But I think that you're asking whether records_in_range gives reasonable 
*estimates* in those cases with the problematic character sets. I don't 
have a definitive answer, but in general I believe that it should. A basic 
SELECT ... WHERE col BETWEEN "AB" AND "CD" will obtain a good estimate 
from records_in_range(). That's because the key values are padded with 
blanks and contain entirely valid data. On the other hand, LIKE (and the 
equivalent BETWEEN) introduces the padding with min_sort_char and 
max_sort_char, and, as I've indicated, it's those specific values that can 
cause a problem.

However, I think I've got a reasonably simple solution to all of this, and 
it looks to be working with some preliminary tests. Instead of trying to 
calculate the non-padded length of the key and sending a single key into 
DB2, we'll honor the min/max format that MySQL is using. But when iconv 
encounters an invalid character, we'll just fill the remainder of the 
output buffer with the appropriate min/max value for the destination 
character set.
Thread
character fields in records_in_rangeTimothy P Clark15 May
  • Re: character fields in records_in_rangeSergei Golubchik18 May
    • Re: character fields in records_in_rangeTimothy P Clark18 May
      • Re: character fields in records_in_rangeSergei Golubchik19 May
        • Re: character fields in records_in_rangeTimothy P Clark20 May
          • Re: character fields in records_in_rangeSergei Golubchik20 May
            • Re: character fields in records_in_rangeTimothy P Clark21 May
  • re: character fields in records_in_rangeMichael Widenius19 May
    • re: character fields in records_in_rangeTimothy P Clark20 May
      • Re: re: character fields in records_in_rangeSergei Golubchik20 May
        • Re: re: character fields in records_in_rangeTimothy P Clark20 May
          • Re: re: character fields in records_in_rangeSergei Golubchik21 May
            • Re: re: character fields in records_in_rangeTimothy P Clark21 May