Anyone wants to comment on that?
The change would be cheap and fixes at least current multisegment padding
problem, where 0x00==0x0000==0x000000...
> -----Original Message-----
> From: Vladislav.Vaintroub@stripped [mailto:Vladislav.Vaintroub@stripped]
> On Behalf Of Vladislav Vaintroub
> Sent: Wednesday, February 18, 2009 12:45 AM
> To: Kevin.Lewis@stripped; 'Lars-Erik Bjørk'
> Cc: 'Jim Starkey'; 'FalconDev'
> Subject: RE: Patch for bug#42208
>
> Guys, why we're still on alpha stage, maybe we could fix multisegments
> too?
>
> Many (i.e 7 or 8) years ago we used following schema for multisegment
> keys
> without padding
>
> Suppose we have keys A and B and want to make a multisegment key out of
> it.
>
> The resulting key would be
> f(A) 0x00 f(B)
>
> 0x00 serves as separator and f() is a transformation that converts
>
> 0x00=>0x01 0x00
> 0x01=>0x01 0x01
>
> Any other byte remains unchanged.0x00s at the end can be compressed, so
> we
> get efficient key is there are only/many NULLs.
>
> I do not think the schema is much more complicated than RUN length and
> padding.
>
> Vlad
>
> And while we're on it why not to fix integer representation;) Doubles
> are
> strange creation by mathematician, exact longlongs would be really
> nice,
> not?
>
>
>
>
> > -----Original Message-----
> > From: Kevin.Lewis@stripped [mailto:Kevin.Lewis@stripped]
> > Sent: Tuesday, February 17, 2009 11:55 PM
> > To: Lars-Erik Bjørk
> > Cc: Jim Starkey; Vladislav Vaintroub; 'FalconDev'
> > Subject: Re: Patch for bug#42208
> >
> > According to the blog link forwearded by Mark, Oracle customers don
> not
> > like that zero length strings (which ar equal to each other) are
> > automatically converted to NULLs. Both suggestions take care of that
> > in
> > Falcon. So this is the most inportant thing; make a zero length
> string
> > equal to 0x00 length 1.
> >
> > The question is whether to keep adding 0x00 to other lengths of
> binary
> > zero strings. Jim says it does not matter to anyone but QA that 0x00
> > and 0x0000 sort separately. And Vlad points out that even if we did
> > this for single field keys, it would not sort them differently for
> > multisegment key since we always pad them to a RUN length. I think
> > that
> > if it does not cause any extra difficulties or comlexity in the code,
> > why not keep QA happy for single segment keys.
> >
> > And I still am unclear why this little change in index order should
> > cause us to change the ODS format while still in the alpha stage.
> What
> > is the downside of a new engine that starts converting zero length
> > strings into 0x00? New entries will be added to the index after the
> > NULLS. Older zero length strings would be mixed up with the NULLS
> and
> > may not be found for direct searched until the index is rebuilt. We
> > can
> > document that as a bug fix in the index, which it is. Nobodies
> critical
> > data is depending on us finding all zero length strings.
> >
> > Kevin
> >
> > Lars-Erik Bjørk wrote:
> > > Ok, so we probably don't want to do the caching after all then?
> Does
> > > anyone else have an opinion on how to proceed on this? Do we agree
> on
> > > any best approach?
> > >
> > > /Lars-Erik
> > >
> > > Jim Starkey wrote:
> > >> Vladislav Vaintroub wrote:
> > >>> Hi Lars-Erik,
> > >>> I wonder if adding 0x00 to the (binary) string values that
> already
> > start
> > >>> with 0x00 would not be less works that modifying index walker
> etc.
> > This
> > >>> looks like huge amount of work you have done (good) but I wonder
> if
> > >>> there is
> > >>> a good reason for it. Assuming (binary) strings that start with
> > 0x00 are
> > >>> really seldom, prepending 0x00 to a key after a check is not
> going
> > to
> > >>> be an
> > >>> expensive operation. And that makes NULL *really* different from
> > >>> other index
> > >>> values. And that allows maybe in some distant future index-only
> > >>> access, so
> > >>> you can answer "is null/is not null" without extra accessing the
> > >>> record and
> > >>> this is a real performance advantage.
> > >>>
> > >>>
> > >> Why do you want to do that? Is the following sufficient:
> > >>
> > >> 1. A null is represented as either a zero length key or a
> missing
> > >> segment in a multi-segment key. This collates lowest.
> > >> 2. A zero length binary key is represented by a single byte of
> > zero.
> > >> 3. A binary key with a single zero byte is indistinquishable
> from
> > a
> > >> zero length (but non-null) key
> > >> 4. A binary key with a leading zero byte and a subsequent non-
> zero
> > >> byte will collate about #2 and #3.
> > >>
> > >> I don't think we really care about the ordering of a non-null,
> zero
> > >> length key and and all zero binary key. I don't think anyone else
> > >> should, either.
> > >>
> > >
> > >
>
>
> --
> Falcon Storage Engine Mailing List
> For list archives: http://lists.mysql.com/falcon
> To unsubscribe: http://lists.mysql.com/falcon?unsub=1