Lars-Erik,
Bucket-end == first entry on the next page is created during index split (at
least splitIndexIndexPageEnd would do this , not sure about
splitIndexPageMiddle). But if you the "split value" (first index entry on
the next page), this "bucket end == first entry on the next page" no longer
holds.
> -----Original Message-----
> From: Lars-Erik.Bjork@stripped [mailto:Lars-Erik.Bjork@stripped]
> Sent: Thursday, May 14, 2009 5:31 PM
> To: Kevin Lewis
> Cc: FalconDev
> Subject: Re: Some more on IndexWalkers
>
> Hi Kevin!
>
> In the run that I looked at now, we miss 76 rows
>
> We search several pages with no valid entries. This are
> the last few pages and entries we search before we hit the
> page with valid entries. The BUCKET_END node is not shown
> here since it is not passes to getValidatedRecord, which is where
> I have my debug printouts.
>
> ...
> Looking at record id 509 on page 173 -> key mismatch
> Looking at record id 510 on page 173 -> key mismatch
> Looking at record id 511 on page 173 -> key mismatch
>
> Repositioning the indexwalker to page 141
> Looking at record id 0 on page 141 -> key mismatch
>
> Repositioning the indexwalker to page 169
> (empty)
>
> Repositioning the indexwalker to page 201
> Looking at record id 0 on page 201 -> key mismatch
> Looking at record id 1 on page 201 -> key mismatch
> ...
> Then we have a total of 76 key mismatches like this on page 201
> before we start returning valid entries
>
>
> In page 173 all entries have key
> (gdb) x/bx key
> 0x30f2180: 0xc0
> (gdb) x/bx key+1
> 0x30f2181: 0x55
> (gdb) x/bx key+2
> 0x30f2182: 0xc0
>
> The empty page, 169, which is the last page before the page
> with valid entries, only contains BUCKET_END entry which has
> the key value
>
> (gdb) x/bx node.key
> 0x7f1238973f43: 0xc0
> (gdb) x/bx node.key+1
> 0x7f1238973f44: 0x56
> (gdb) x/bx node.key+3
> 0x7f1238973f45: 0x40
>
> which in this case is not expanded since it is the only node
> in the page.
>
> The first node on page 201, which is a valid node has the key
> value
>
> (gdb) x/bx node.key
> 0x7f1238973f43: 0xc0
> (gdb) x/bx node.key+1
> 0x7f1238973f44: 0x57
> (gdb) x/bx node.key+2
> 0x7f1238973f45: 0x40
>
> which is different from the BUCKET_END node in the previous page (page
> 169)
>
> I am not sure if this scheme of having the last entry in the page
> equal to the first entry in the next page is also working with empty
> pages
> (pages only containing BUCKET_END)?
>
> Have a nice day :)
>
> /Lars-Erik
>
>
>
> On May 14, 2009, at 3:25 PM, Kevin Lewis wrote:
>
> > Lars-Erik,
> >
> > So the first node on the page is out of order? The unexpanded nodes
> > after that would match if they were expanded? How can that be?
> >
> > Lars-Erik Bjørk wrote:
> >> Hi Ann, and the crew!
> >> After pushing http://lists.mysql.com/commits/73707 for bug #43630
> >> 'A SELECT using ORDER BY and LIMIT sometimes returns no rows'
> >> I have been looking at the remaining cases where we are returned too
> >> few, or no, rows when doing a LIMIT query. It seems that when we
> >> find a node
> >> with a recordNumber >= 0 as the first node on a page, we forget to
> >> expand
> >> the key. I believed that this was deliberate, as the key should
> >> already be
> >> expanded, being identical to the last entry on the previous page.
> >> Looking at this
> >> in the debugger however, this does not seem to be the case.
> >> When we start searching the first page that actually contains valid
> >> entries,
> >> IndexWalker::keyLength and IndexWalker::key has the following
> values:
> >> (gdb) p keyLength
> >> $4 = 2
> >> (gdb) x/bt key
> >> 0x7fab4cfaf810: 10111111
> >> (gdb) x/bt key+1
> >> 0x7fab4cfaf811: 11110000
> >> whereas the index node we are looking at has the following:
> >> (gdb) p node
> >> $2 = {node = 0x7fab4cfa7750, nextNode = 0x7fab4cfa7755, offset = 0,
> >> length = 1, key = 0x7fab4cfa7754 "\300\001", number = 329}
> >> (gdb) p node.length
> >> $3 = 1
> >> (gdb) x/bt node.key
> >> 0x7fab4cfa7754: 11000000
> >> The keys and keylengths do not match. The recordKey created from the
> >> record when comparing in IndexWalker::getValidatedRecord, however,
> >> looks
> >> the same as the node:
> >> (gdb) p recordKey.keyLength
> >> $5 = 1
> >> (gdb) x/bt recordKey.key
> >> 0x7fab3fcb88e8: 11000000
> >> Since all the fields have the same value, expanding the key for the
> >> following
> >> nodes is a no-op, because the offset will be 1 and the length 0.
> >> Therefore we
> >> miss a lot of rows until we reach a node with offset 0 and length
> >> 1. This is
> >> maybe a supernode, since it seems to be fully expanded?
> >> If I add the following line of code:
> >> int32 WalkIndex::getNextNode(void)
> >> {
> >> for (;; first = true)
> >> {
> >> if (first)
> >> {
> >> first = false;
> >> recordNumber = node.getNumber();
> >> if (recordNumber >= 0)
> >> {
> >> node.expandKey(&indexKey); // <==== This line
> >> return recordNumber;
> >> }
> >> we only experience some cases where we miss 1 row, opposed to
> >> often missing a lot. We still have the other problem with too many
> >> rows though :)
> >> Is it correct that we miss to expand the key here, or is it
> >> something else that is fishy?
> >> (I know there is fishy stuff here, but hopefully not regarding this
> >> particular issue :) )
> >> For the curious, a little information about the nodes searched is
> >> presented below:
> >> Page = page number
> >> dup = number of entries reported as duplicates
> >> nrec = number of entries that have no matching record
> >> ncan = number of entries that had no visible records
> >> nmis = number of entries who's key does not match with the key
> >> generated from the record
> >> nret = number of entries returned as valid entries
> >> This is presented in search order
> >> Page:137, dup:0, nrec:0, ncan:0, nmis:
> >> 35, nret:0
> >> Page:138, dup:0, nrec:0, ncan:0, nmis:
> >> 1000, nret:0
> >> Page:181, dup:0, nrec:0, ncan:0, nmis:
> >> 0, nret:0 (empty)
> >> Page:175, dup:0, nrec:0, ncan:0, nmis:
> >> 206, nret:465 (valid)
> >> Page:136, dup:0, nrec:0, ncan:0, nmis:
> >> 515, nret:0
> >> Page:219, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:147, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:174, dup:0, nrec:0, ncan:0, nmis:
> >> 2, nret:329 (valid)
> >> Page:199, dup:0, nrec:0, ncan:0, nmis:
> >> 0, nret:0 (empty)
> >> Page:173, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:214, dup:1, nrec:0, ncan:0, nmis:
> >> 451, nret:0
> >> Page:149, dup:1, nrec:0, ncan:0, nmis:
> >> 459, nret:0
> >> Page:139, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:156, dup:0, nrec:0, ncan:0, nmis:
> >> 0, nret:0 (empty)
> >> Page:185, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:159, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:229, dup:0, nrec:0, ncan:0, nmis:
> >> 0, nret:0 (empty)
> >> Page:164, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:165, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:186, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:191, dup:0, nrec:0, ncan:0, nmis:
> >> 0, nret:0 (empty)
> >> Page:228, dup:0, nrec:0, ncan:0, nmis:
> >> 91, nret:0
> >> Page:230, dup:0, nrec:0, ncan:0, nmis:
> >> 0, nret:0 (empty)
> >> Page:234, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:235, dup:0, nrec:0, ncan:0, nmis:
> >> 0, nret:0 (empty)
> >> Page:158, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:160, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:155, dup:0, nrec:0, ncan:0, nmis:
> >> 0, nret:0 (empty)
> >> Page:151, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:162, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:236, dup:0, nrec:0, ncan:0, nmis:
> >> 0, nret:0 (empty)
> >> Page:237, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:161, dup:0, nrec:0, ncan:0, nmis:
> >> 23, nret:0
> >> Page:143, dup:1, nrec:0, ncan:0, nmis:
> >> 506, nret:0
> >> Page:140, dup:0, nrec:0, ncan:0, nmis:
> >> 1, nret:0
> >> Page:142, dup:0, nrec:0, ncan:0, nmis:
> >> 24, nret:0
> >> Page:141, dup:0, nrec:0, ncan:0, nmis:
> >> 0, nret:0 (empty)
> >> Page:218, dup:0, nrec:0, ncan:0, nmis:
> >> 0, nret:0 (empty)
> >> Number of empty pages: 11
> >> Number of pages with vaild entries: 2
> >> Number of pages processed: 38
> >> This query returned 794 rows, while 1000 rows were expected
> >> /Lars-Erik
> >
> > --
> > Falcon Storage Engine Mailing List
> > For list archives: http://lists.mysql.com/falcon
> > To unsubscribe: http://lists.mysql.com/falcon?unsub=1
> erik.bjork@stripped
> >
>
>
> --
> Falcon Storage Engine Mailing List
> For list archives: http://lists.mysql.com/falcon
> To unsubscribe: http://lists.mysql.com/falcon?unsub=1