List:Falcon Storage Engine« Previous MessageNext Message »
From:Vladislav Vaintroub Date:May 14 2009 4:42pm
Subject:RE: Some more on IndexWalkers
View as plain text  
Lars-Erik,
Bucket-end == first entry on the next page is created during index split (at
least splitIndexIndexPageEnd would do this , not sure about
splitIndexPageMiddle). But if you the "split value" (first index entry on
the next page), this "bucket end == first entry on the next page" no longer
holds.


> -----Original Message-----
> From: Lars-Erik.Bjork@stripped [mailto:Lars-Erik.Bjork@stripped]
> Sent: Thursday, May 14, 2009 5:31 PM
> To: Kevin Lewis
> Cc: FalconDev
> Subject: Re: Some more on IndexWalkers
> 
> Hi Kevin!
> 
> In the run that I looked at now, we miss 76 rows
> 
> We search several pages with no valid entries. This are
> the last few pages and entries we search before we hit the
> page with valid entries. The BUCKET_END node is not shown
> here since it is not passes to getValidatedRecord, which is where
> I have my debug printouts.
> 
> ...
> Looking at record id 509 on page 173 -> key mismatch
> Looking at record id 510 on page 173 -> key mismatch
> Looking at record id 511 on page 173 -> key mismatch
> 
> Repositioning the indexwalker to page 141
> Looking at record id 0 on page 141 -> key mismatch
> 
> Repositioning the indexwalker to page 169
> (empty)
> 
> Repositioning the indexwalker to page 201
> Looking at record id 0 on page 201 -> key mismatch
> Looking at record id 1 on page 201 -> key mismatch
> ...
> Then we have a total of 76 key mismatches like this on page 201
> before we start returning valid entries
> 
> 
> In page 173 all entries have key
> (gdb) x/bx key
> 0x30f2180:      0xc0
> (gdb) x/bx key+1
> 0x30f2181:      0x55
> (gdb) x/bx key+2
> 0x30f2182:      0xc0
> 
> The empty page, 169, which is the last page before the page
> with valid entries, only contains BUCKET_END entry which has
> the key value
> 
> (gdb) x/bx node.key
> 0x7f1238973f43: 0xc0
> (gdb) x/bx node.key+1
> 0x7f1238973f44: 0x56
> (gdb) x/bx node.key+3
> 0x7f1238973f45: 0x40
> 
> which in this case is not expanded since it is the only node
> in the page.
> 
> The first node on page 201, which is a valid node has the key
> value
> 
> (gdb) x/bx node.key
> 0x7f1238973f43: 0xc0
> (gdb) x/bx node.key+1
> 0x7f1238973f44: 0x57
> (gdb) x/bx node.key+2
> 0x7f1238973f45: 0x40
> 
> which is different from the BUCKET_END node in the previous page (page
> 169)
> 
> I am not sure if this scheme of having the last entry in the page
> equal to the first entry in the next page is also working with empty
> pages
> (pages only containing BUCKET_END)?
> 
> Have a nice day :)
> 
> /Lars-Erik
> 
> 
> 
> On May 14, 2009, at 3:25 PM, Kevin Lewis wrote:
> 
> > Lars-Erik,
> >
> > So the first node on the page is out of order?  The unexpanded nodes
> > after that would match if they were expanded?  How can that be?
> >
> > Lars-Erik Bjørk wrote:
> >> Hi Ann, and the crew!
> >> After pushing http://lists.mysql.com/commits/73707 for bug #43630
> >> 'A SELECT using ORDER BY and LIMIT sometimes returns no rows'
> >> I have been looking at the remaining cases where we are returned too
> >> few, or no, rows when doing a LIMIT query. It seems that when we
> >> find a node
> >> with a recordNumber >= 0 as the first node on a page, we forget to
> >> expand
> >> the key. I believed that this was deliberate, as the key should
> >> already be
> >> expanded, being identical to the last entry on the previous page.
> >> Looking at this
> >> in the debugger however, this does not seem to be the case.
> >> When we start searching the first page that actually contains valid
> >> entries,
> >> IndexWalker::keyLength and IndexWalker::key has the following
> values:
> >> (gdb) p keyLength
> >> $4 = 2
> >> (gdb) x/bt key
> >> 0x7fab4cfaf810: 10111111
> >> (gdb) x/bt key+1
> >> 0x7fab4cfaf811: 11110000
> >> whereas the index node we are looking at has the following:
> >> (gdb) p node
> >> $2 = {node = 0x7fab4cfa7750, nextNode = 0x7fab4cfa7755, offset = 0,
> >> length = 1, key = 0x7fab4cfa7754 "\300\001", number = 329}
> >> (gdb) p node.length
> >> $3 = 1
> >> (gdb) x/bt node.key
> >> 0x7fab4cfa7754: 11000000
> >> The keys and keylengths do not match. The recordKey created from the
> >> record when comparing in IndexWalker::getValidatedRecord, however,
> >> looks
> >> the same as the node:
> >> (gdb) p recordKey.keyLength
> >> $5 = 1
> >> (gdb) x/bt recordKey.key
> >> 0x7fab3fcb88e8: 11000000
> >> Since all the fields have the same value, expanding the key for the
> >> following
> >> nodes is a no-op, because the offset will be 1 and the length 0.
> >> Therefore we
> >> miss a lot of rows until we reach a node with offset 0 and length
> >> 1. This is
> >> maybe a supernode, since it seems to be fully expanded?
> >> If I add the following line of code:
> >> int32 WalkIndex::getNextNode(void)
> >> {
> >>   for (;; first = true)
> >>       {
> >>       if (first)
> >>           {
> >>           first = false;
> >>           recordNumber = node.getNumber();
> >>           if (recordNumber >= 0)
> >>               {
> >>                 node.expandKey(&indexKey);  // <==== This line
> >>                 return recordNumber;
> >>                 }
> >> we only experience some cases where we miss 1 row, opposed to
> >> often missing a lot. We still have the other problem with too many
> >> rows though :)
> >> Is it correct that we miss to expand the key here, or is it
> >> something else that is fishy?
> >> (I know there is fishy stuff here, but hopefully not regarding this
> >> particular issue :) )
> >> For the curious, a little information about the nodes searched is
> >> presented below:
> >> Page = page number
> >> dup = number of entries reported as duplicates
> >> nrec = number of entries that have no matching record
> >> ncan = number of entries that had no visible records
> >> nmis = number of entries who's key does not match with the key
> >> generated from the record
> >> nret = number of entries returned as valid entries
> >> This is presented in search order
> >> Page:137,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 35,        nret:0
> >> Page:138,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1000,      nret:0
> >> Page:181,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 0,         nret:0         (empty)
> >> Page:175,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 206,       nret:465               (valid)
> >> Page:136,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 515,       nret:0
> >> Page:219,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:147,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:174,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 2,         nret:329               (valid)
> >> Page:199,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 0,         nret:0         (empty)
> >> Page:173,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:214,        dup:1,  nrec:0,         ncan:0,         nmis:
> >> 451,       nret:0
> >> Page:149,        dup:1,  nrec:0,         ncan:0,         nmis:
> >> 459,       nret:0
> >> Page:139,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:156,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 0,         nret:0         (empty)
> >> Page:185,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:159,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:229,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 0,         nret:0         (empty)
> >> Page:164,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:165,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:186,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:191,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 0,         nret:0         (empty)
> >> Page:228,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 91,        nret:0
> >> Page:230,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 0,         nret:0         (empty)
> >> Page:234,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:235,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 0,         nret:0         (empty)
> >> Page:158,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:160,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:155,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 0,         nret:0         (empty)
> >> Page:151,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:162,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:236,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 0,         nret:0         (empty)
> >> Page:237,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:161,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 23,        nret:0
> >> Page:143,        dup:1,  nrec:0,         ncan:0,         nmis:
> >> 506,       nret:0
> >> Page:140,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 1,         nret:0
> >> Page:142,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 24,        nret:0
> >> Page:141,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 0,         nret:0         (empty)
> >> Page:218,        dup:0,  nrec:0,         ncan:0,         nmis:
> >> 0,         nret:0         (empty)
> >> Number of empty pages: 11
> >> Number of pages with vaild entries: 2
> >> Number of pages processed: 38
> >> This query returned 794 rows, while 1000 rows were expected
> >> /Lars-Erik
> >
> > --
> > Falcon Storage Engine Mailing List
> > For list archives: http://lists.mysql.com/falcon
> > To unsubscribe:    http://lists.mysql.com/falcon?unsub=1
> erik.bjork@stripped
> >
> 
> 
> --
> Falcon Storage Engine Mailing List
> For list archives: http://lists.mysql.com/falcon
> To unsubscribe:    http://lists.mysql.com/falcon?unsub=1


Thread
Some more on IndexWalkersLars-Erik Bjørk14 May
  • Re: Some more on IndexWalkersKevin Lewis14 May
    • Re: Some more on IndexWalkersLars-Erik Bjørk14 May
      • RE: Some more on IndexWalkersVladislav Vaintroub14 May
  • Re: Some more on IndexWalkersAnn W. Harrison15 May
    • Re: Some more on IndexWalkersLars-Erik Bjørk18 May
    • Re: Some more on IndexWalkersLars-Erik Bjørk18 May
      • Re: Some more on IndexWalkersKevin Lewis18 May
        • Re: Some more on IndexWalkersLars-Erik Bjørk18 May
      • Re: Some more on IndexWalkersJim Starkey18 May
Re: Some more on IndexWalkersLars-Erik Bjørk14 May