List:Falcon Storage Engine« Previous MessageNext Message »
From:Vladislav Vaintroub Date:February 18 2009 3:43pm
Subject:RE: Patch for bug#42208
View as plain text  

> -----Original Message-----
> From: Kevin.Lewis@stripped [mailto:Kevin.Lewis@stripped]
> Sent: Wednesday, February 18, 2009 3:50 PM
> To: Vladislav Vaintroub
> Cc: 'Lars-Erik Bjørk'; 'Jim Starkey'; 'FalconDev'
> Subject: Re: Patch for bug#42208
> 
> Vlad,
> 
> I guess I did not understand the full implementation you described
> below.  Are you proposing a new way of storing  multi-segmented keys?
> Is this new method still searchable and comparable byte-wise left to
> right without knowledge of the key type?
> 
> Maybe you can explain this idea a little more thoroughly.

Ok, I will try with an example.

Again, fields are A and B and we're constructing multisegment key

I will use (stringA, stringB) notation for tuple of A and B 

Example 1:

 ('a','b')  would transform to   a 0 b
 ('ab','c')                 to   a b 0 c
 ('a', 'bc')                to   a 0 b c

Natural sorting order            ('a','b') < ('a', 'bc') < ('ab','c') 
corresponds to key sorting order  a 0 b    < a 0 b c     < a b 0 c

Example 2:

Now with special values 0 and 1 that translate to 0x0100 and 0x0101
correspondingly
     
('a\0b' , '\1') => a 1 0 b 0 1 1
('a', '\0b\1')  => a 0 1 0 b 1 1
Also here the natural order is preserved.

Example 3:

('a', NULL, 'b') => a 0 0 b   ( a  0<zero length key>0 b )
('a', 'b', NULL) => a 0 b 0   (last 0s can be stripped for compression)
Also here natural order is preserved.

Also note, that it is simple  to parse key and restore individual values.
Read byte-by-byte, skip leading 0x01 (it denotes 2 byte sequence) until you
get 0.

Parsing first  part of  a 1 0 b 0 1 1 
Would be 
- Read 'a'
- Skip 1
- Read 0
- Read 'b'
- Stop (found 0)


Please ask if you need more examples or I could not explain the encoding
clearly.


Vlad.







> Vladislav Vaintroub wrote:
> > Anyone wants to comment on that?
> > The change would be cheap and fixes at least current multisegment
> padding
> > problem, where 0x00==0x0000==0x000000...
> >
> >> -----Original Message-----
> >> From: Vladislav.Vaintroub@stripped
> [mailto:Vladislav.Vaintroub@stripped]
> >> On Behalf Of Vladislav Vaintroub
> >> Sent: Wednesday, February 18, 2009 12:45 AM
> >> To: Kevin.Lewis@stripped; 'Lars-Erik Bjørk'
> >> Cc: 'Jim Starkey'; 'FalconDev'
> >> Subject: RE: Patch for bug#42208
> >>
> >> Guys, why we're still on alpha stage, maybe we could fix
> multisegments
> >> too?
> >>
> >> Many (i.e 7 or 8) years  ago we used following schema for
> multisegment
> >> keys
> >> without padding
> >>
> >> Suppose we have keys A and B and want to make a multisegment key out
> of
> >> it.
> >>
> >> The resulting key would be
> >> f(A) 0x00 f(B)
> >>
> >> 0x00 serves as separator and f() is a transformation that converts
> >>
> >>   0x00=>0x01 0x00
> >>   0x01=>0x01 0x01
> >>
> >> Any other byte remains unchanged.0x00s at the end can be compressed,
> so
> >> we
> >> get efficient key is there are only/many NULLs.
> >>
> >> I do not think the schema is much more complicated than RUN length
> and
> >> padding.
> >>
> >> Vlad
> >>
> >> And while we're on it why not to fix integer representation;)
> Doubles
> >> are
> >> strange creation by mathematician, exact longlongs would be really
> >> nice,
> >> not?
> >>
> >>
> >>
> >>
> >>> -----Original Message-----
> >>> From: Kevin.Lewis@stripped [mailto:Kevin.Lewis@stripped]
> >>> Sent: Tuesday, February 17, 2009 11:55 PM
> >>> To: Lars-Erik Bjørk
> >>> Cc: Jim Starkey; Vladislav Vaintroub; 'FalconDev'
> >>> Subject: Re: Patch for bug#42208
> >>>
> >>> According to the blog link forwearded by Mark, Oracle customers don
> >> not
> >>> like that zero length strings (which ar equal to each other) are
> >>> automatically converted to NULLs.  Both suggestions take care of
> that
> >>> in
> >>> Falcon.  So this is the most inportant thing; make a zero length
> >> string
> >>> equal to 0x00 length 1.
> >>>
> >>> The question is whether to keep adding 0x00 to other lengths of
> >> binary
> >>> zero strings.  Jim says it does not matter to anyone but QA that
> 0x00
> >>> and 0x0000 sort separately.  And Vlad points out that even if we
> did
> >>> this for single field keys, it would not sort them differently for
> >>> multisegment key since we always pad them to a RUN length.  I think
> >>> that
> >>> if it does not cause any extra difficulties or comlexity in the
> code,
> >>> why not keep QA happy for single segment keys.
> >>>
> >>> And I still am unclear why this little change in index order should
> >>> cause us to change the ODS format while still in the alpha stage.
> >> What
> >>> is the downside of a new engine that starts converting zero length
> >>> strings into 0x00?  New entries will be added to the index after
> the
> >>> NULLS.  Older zero length strings would be mixed up with the NULLS
> >> and
> >>> may not be found for direct searched until the index is rebuilt.
> We
> >>> can
> >>> document that as a bug fix in the index, which it is. Nobodies
> >> critical
> >>> data is depending on us finding all zero length strings.
> >>>
> >>> Kevin
> >>>
> >>> Lars-Erik Bjørk wrote:
> >>>> Ok, so we probably don't want to do the caching after all then?
> >> Does
> >>>> anyone else have an opinion on how to proceed on this? Do we agree
> >> on
> >>>> any best approach?
> >>>>
> >>>> /Lars-Erik
> >>>>
> >>>> Jim Starkey wrote:
> >>>>> Vladislav Vaintroub wrote:
> >>>>>> Hi Lars-Erik,
> >>>>>> I wonder if adding 0x00 to the (binary) string values that
> >> already
> >>> start
> >>>>>> with 0x00 would not be less works that modifying index
> walker
> >> etc.
> >>> This
> >>>>>> looks like huge amount of work you have done (good) but I
> wonder
> >> if
> >>>>>> there is
> >>>>>> a good reason for it. Assuming (binary) strings that start
> with
> >>> 0x00 are
> >>>>>> really seldom, prepending 0x00 to a key after a check is
> not
> >> going
> >>> to
> >>>>>> be an
> >>>>>> expensive operation. And that makes NULL *really* different
> from
> >>>>>> other index
> >>>>>> values. And that allows maybe in some distant future
> index-only
> >>>>>> access, so
> >>>>>> you can answer "is null/is not null" without extra accessing
> the
> >>>>>> record and
> >>>>>> this is a real performance advantage.
> >>>>>>
> >>>>>>
> >>>>> Why do you want to do that?  Is the following sufficient:
> >>>>>
> >>>>>   1. A null is represented as either a zero length key or a
> >> missing
> >>>>>      segment in a multi-segment key.  This collates lowest.
> >>>>>   2. A zero length binary key is represented by a single byte
> of
> >>> zero.
> >>>>>   3. A binary key with a single zero byte is indistinquishable
> >> from
> >>> a
> >>>>>      zero length (but non-null) key
> >>>>>   4. A binary key with a leading zero byte and a subsequent
> non-
> >> zero
> >>>>>      byte will collate about #2 and #3.
> >>>>>
> >>>>> I don't think we really care about the ordering of a non-null,
> >> zero
> >>>>> length key and and all zero binary key.  I don't think anyone
> else
> >>>>> should, either.
> >>>>>
> >>>>
> >>
> >> --
> >> Falcon Storage Engine Mailing List
> >> For list archives: http://lists.mysql.com/falcon
> >> To unsubscribe:    http://lists.mysql.com/falcon?unsub=1
> >
> >
> >
> > --
> > Falcon Storage Engine Mailing List
> > For list archives: http://lists.mysql.com/falcon
> > To unsubscribe:
> http://lists.mysql.com/falcon?unsub=1
> >
> >

Thread
Patch for bug#42208Lars-Erik Bjørk16 Feb
  • RE: Patch for bug#42208Vladislav Vaintroub16 Feb
    • RE: Patch for bug#42208Vladislav Vaintroub16 Feb
      • Re: Patch for bug#42208Jim Starkey16 Feb
        • RE: Patch for bug#42208Vladislav Vaintroub16 Feb
    • Re: Patch for bug#42208Jim Starkey16 Feb
      • RE: Patch for bug#42208Vladislav Vaintroub16 Feb
      • Re: Patch for bug#42208Lars-Erik Bjørk17 Feb
        • Re: Patch for bug#42208Kevin Lewis17 Feb
          • Re: Patch for bug#42208Ann W. Harrison18 Feb
            • Re: Patch for bug#42208Ann W. Harrison18 Feb
              • Re: Patch for bug#42208Ann W. Harrison18 Feb
                • Re: Patch for bug#42208Kevin Lewis18 Feb
                  • Re: Patch for bug#42208Ann W. Harrison18 Feb
                    • Re: Patch for bug#42208Kevin Lewis18 Feb
          • RE: Patch for bug#42208Vladislav Vaintroub18 Feb
            • RE: Patch for bug#42208Vladislav Vaintroub18 Feb
              • Re: Patch for bug#42208Kevin Lewis18 Feb
                • RE: Patch for bug#42208Vladislav Vaintroub18 Feb
                  • RE: Patch for bug#42208Vladislav Vaintroub18 Feb
              • Re: Patch for bug#42208Jim Starkey18 Feb
                • RE: Patch for bug#42208Vladislav Vaintroub18 Feb
                  • Re: Patch for bug#42208Jim Starkey18 Feb
                    • RE: Patch for bug#42208Vladislav Vaintroub18 Feb
                      • RE: Patch for bug#42208Vladislav Vaintroub18 Feb
            • Re: Patch for bug#42208Ann W. Harrison18 Feb
              • RE: Patch for bug#42208Vladislav Vaintroub18 Feb
                • Re: Patch for bug#42208Jim Starkey18 Feb
                  • Re: Patch for bug#42208Ann W. Harrison18 Feb
                    • Re: Patch for bug#42208Ann W. Harrison18 Feb
                • Re: Patch for bug#42208Ann W. Harrison18 Feb
                  • RE: Patch for bug#42208Vladislav Vaintroub18 Feb
                    • Re: Patch for bug#42208Ann W. Harrison18 Feb
                    • Re: Patch for bug#42208Kevin Lewis18 Feb
                      • Re: Patch for bug#42208Ann W. Harrison18 Feb
      • Re: Patch for bug#42208MARK CALLAGHAN17 Feb
  • Re: Patch for bug#42208Jim Starkey16 Feb