List:Falcon Storage Engine« Previous MessageNext Message »
From:Kevin Lewis Date:January 30 2009 7:52pm
Subject:Re: NULLs and stuff
View as plain text  
 > Lars-Erik Bjørk wrote:
> Hi all!
> 
> This week I have been looking at http://bugs.mysql.com/bug.php?id=42208 
> (Falcon's ORDER BY .. LIMIT gives wrong/inconsistent results on NULL 
> values).
> 
> The problem is that NULL values sort together with numeric zero. I have 
> talked this issue over with Vlad, and have tried several different 
> solutions to make it work for both single-field and multi-field indexes. 
> What seems to work, is for Index::makeKey(Field *field, Value *value, 
> int segment, IndexKey *indexKey) to return a zero length key for NULL 
> values, and to prepend 0x00 to all keys starting with 0x00 (to make NULL 
> sort before the empty (string), and to make the empty string, now 0x00, 
> sort before the previous 0x00, now 0x0000, etc, etc)

This seems to work for single segment keys.  These extended key values 
would be written into the index page, so we can add a new 
ODS_MINOR_VERSION4 and change;

- #define ODS_MINOR_VERSION	ODS_MINOR_VERSION3
+ #define ODS_MINOR_VERSION	ODS_MINOR_VERSION4

Then in Dbb::open(), this code would prevent this new engine from 
accessing any previously created tablespace.

	if (header.odsVersion > ODS_VERSION ||
		(header.odsVersion == ODS_VERSION && header.odsMinorVersion > 
ODS_MINOR_VERSION))
		throw SQLError (VERSION_ERROR, "Falcon on disk structure version %d.%d 
is not supported by version %d.%d server",
						header.odsVersion, header.odsMinorVersion, ODS_VERSION, 
ODS_MINOR_VERSION);

But the solution above does not address how null values are stored and 
compared in multiu-segmented keys.  The early segments are all padded 
out to the nearest RUN length so that NULLs and zero-length values look 
like 0x00, which also look like 0x0000 and 0x000000 and 0x00000000 and 
0x0000000000.  They all sort together.  There is nothing in the sorted 
RUN-length encoding which indicates the length of an early segment, is 
there?  This is an encoding which is sortable byte-by-byte while being 
agnostic of data type.

Someone said last week that there is a limit in MySQL of 32 key 
segments.  Our 1 byte 'segment number' at the start of each RUN is a 
one-base counter.  Maybe there is a way to combine 5 bits for 
segmentNumber (0 - 31) with 3 bits of actual field length within the RUN...


> Being in this area, I picked up http://bugs.mysql.com/bug.php?id=23692 
> again, which we have discussed on a previous meeting. The idea Kevin 
> proposed here (if I remember correctly) was to append 0x20 to all 
> upper-bound search keys ending in >= 0x20 (Kevin, please correct me if 
> my memory is corrupt). This means passing an extra argument all the way 
> from StorageTable::setIndexBounds down to Index::makeKey, because we 
> don't want to store the extra byte in the regular index keys.
> 
> How does the rest of you feel about these solutions? I get this hacky 
> feeling:) We have to make sure there is always room for a 0x00 and a 
> 0x20 in the key, meaning that the actual key can no longer be 
> MAX_PHYSICAL_KEY_LENGTH long, but two less, as well as [prep,app]ending 
> 0x00 and 0x20 being pretty magic.
> 
> Is this something that you are comfortable with? I don't have any 
> alternative solutions:)

We do not even have to change the ODS version to add an extra 0x20 to 
all string fields stored.  This would give the correct sorting, if the 
appending of the 0x20 is done for every string field that goes through 
makeKey.  But it would also create larger indexes since Falcon uses 
prefix compression only, not postfix compression (or whatever it would 
be called).

I think the idea that we are ruminating about is whether we can get away 
with adding the extra 0x20 only at comparison time, or immediately 
before that.  To do this, we need to know the field type.  But any 
search value or index value that comes out of makeKey is type agnostic. 
  It would be a big and difficult change, in my mind, if we tried to 
carry the field type metadata past makeKey into the world of falcon key 
comparison and index storage!

I do not know any other alternative to actually storing the extra 0x20 
in the index page.  A search key comes in from the server, goes through 
makeKey, and then is ready to compare to the contents of an index page 
or deferredIndex.  And it has now become field-type agnostic.

Kevin
Thread
NULLs and stuffLars-Erik Bjørk30 Jan
  • RE: NULLs and stuffVladislav Vaintroub30 Jan
    • Re: NULLs and stuffAnn W. Harrison30 Jan
  • Re: NULLs and stuffPhilip Stoev30 Jan
    • RE: NULLs and stuffVladislav Vaintroub30 Jan
      • RE: NULLs and stuffVladislav Vaintroub30 Jan
        • RE: NULLs and stuffHakan Kuecuekyilmaz30 Jan
          • RE: NULLs and stuffVladislav Vaintroub30 Jan
            • Re: NULLs and stuffJim Starkey1 Feb
      • Re: NULLs and stuffJim Starkey1 Feb
  • Re: NULLs and stuffKevin Lewis30 Jan
  • Re: NULLs and stuffAnn W. Harrison30 Jan
  • Re: NULLs and stuffJim Starkey1 Feb
    • Re: NULLs and stuffLars-Erik Bjørk2 Feb
      • Re: NULLs and stuffJim Starkey2 Feb
        • RE: NULLs and stuffVladislav Vaintroub2 Feb
          • Re: NULLs and stuffJim Starkey2 Feb
            • RE: NULLs and stuffVladislav Vaintroub2 Feb