From: Lars-Erik Bjørk
Date: May 14 2009 3:30pm
Subject: Re: Some more on IndexWalkers
List-Archive: http://lists.mysql.com/falcon/735
Message-Id:
MIME-Version: 1.0
Content-Type: text/plain; delsp=yes; format=flowed; charset=ISO-8859-1
Content-Transfer-Encoding: QUOTED-PRINTABLE
Hi Kevin!
In the run that I looked at now, we miss 76 rows
We search several pages with no valid entries. This are
the last few pages and entries we search before we hit the
page with valid entries. The BUCKET_END node is not shown
here since it is not passes to getValidatedRecord, which is where
I have my debug printouts.
=2E..
Looking at record id 509 on page 173 -> key mismatch
Looking at record id 510 on page 173 -> key mismatch
Looking at record id 511 on page 173 -> key mismatch
Repositioning the indexwalker to page 141
Looking at record id 0 on page 141 -> key mismatch
Repositioning the indexwalker to page 169
(empty)
Repositioning the indexwalker to page 201
Looking at record id 0 on page 201 -> key mismatch
Looking at record id 1 on page 201 -> key mismatch
=2E..
Then we have a total of 76 key mismatches like this on page 201
before we start returning valid entries
In page 173 all entries have key
(gdb) x/bx key
0x30f2180: 0xc0
(gdb) x/bx key+1
0x30f2181: 0x55
(gdb) x/bx key+2
0x30f2182: 0xc0
The empty page, 169, which is the last page before the page
with valid entries, only contains BUCKET_END entry which has
the key value
(gdb) x/bx node.key
0x7f1238973f43: 0xc0
(gdb) x/bx node.key+1
0x7f1238973f44: 0x56
(gdb) x/bx node.key+3
0x7f1238973f45: 0x40
which in this case is not expanded since it is the only node
in the page.
The first node on page 201, which is a valid node has the key
value
(gdb) x/bx node.key
0x7f1238973f43: 0xc0
(gdb) x/bx node.key+1
0x7f1238973f44: 0x57
(gdb) x/bx node.key+2
0x7f1238973f45: 0x40
which is different from the BUCKET_END node in the previous page (pag=
e =20
169)
I am not sure if this scheme of having the last entry in the page
equal to the first entry in the next page is also working with empty =
=20
pages
(pages only containing BUCKET_END)?
Have a nice day :)
/Lars-Erik
On May 14, 2009, at 3:25 PM, Kevin Lewis wrote:
> Lars-Erik,
>
> So the first node on the page is out of order? The unexpanded node=
s =20
> after that would match if they were expanded? How can that be?
>
> Lars-Erik Bj=F8rk wrote:
>> Hi Ann, and the crew!
>> After pushing http://lists.mysql.com/commits/73707 for bug #43630
>> 'A SELECT using ORDER BY and LIMIT sometimes returns no rows'
>> I have been looking at the remaining cases where we are returned t=
oo
>> few, or no, rows when doing a LIMIT query. It seems that when we =
=20
>> find a node
>> with a recordNumber >=3D 0 as the first node on a page, we forget =
to =20
>> expand
>> the key. I believed that this was deliberate, as the key should =
=20
>> already be
>> expanded, being identical to the last entry on the previous page. =
=20
>> Looking at this
>> in the debugger however, this does not seem to be the case.
>> When we start searching the first page that actually contains vali=
d =20
>> entries,
>> IndexWalker::keyLength and IndexWalker::key has the following valu=
es:
>> (gdb) p keyLength
>> $4 =3D 2
>> (gdb) x/bt key
>> 0x7fab4cfaf810: 10111111
>> (gdb) x/bt key+1
>> 0x7fab4cfaf811: 11110000
>> whereas the index node we are looking at has the following:
>> (gdb) p node
>> $2 =3D {node =3D 0x7fab4cfa7750, nextNode =3D 0x7fab4cfa7755, offs=
et =3D 0, =20
>> length =3D 1, key =3D 0x7fab4cfa7754 "\300\001", number =3D 329}
>> (gdb) p node.length
>> $3 =3D 1
>> (gdb) x/bt node.key
>> 0x7fab4cfa7754: 11000000
>> The keys and keylengths do not match. The recordKey created from t=
he
>> record when comparing in IndexWalker::getValidatedRecord, however,=
=20
>> looks
>> the same as the node:
>> (gdb) p recordKey.keyLength
>> $5 =3D 1
>> (gdb) x/bt recordKey.key
>> 0x7fab3fcb88e8: 11000000
>> Since all the fields have the same value, expanding the key for th=
e =20
>> following
>> nodes is a no-op, because the offset will be 1 and the length 0. =
=20
>> Therefore we
>> miss a lot of rows until we reach a node with offset 0 and length =
=20
>> 1. This is
>> maybe a supernode, since it seems to be fully expanded?
>> If I add the following line of code:
>> int32 WalkIndex::getNextNode(void)
>> {
>> for (;; first =3D true)
>> {
>> if (first)
>> {
>> first =3D false;
>> recordNumber =3D node.getNumber();
>> if (recordNumber >=3D 0)
>> {
>> node.expandKey(&indexKey); // <=3D=3D=3D=3D This =
line
>> return recordNumber;
>> }
>> we only experience some cases where we miss 1 row, opposed to
>> often missing a lot. We still have the other problem with too many=
=20
>> rows though :)
>> Is it correct that we miss to expand the key here, or is it =20
>> something else that is fishy?
>> (I know there is fishy stuff here, but hopefully not regarding thi=
s =20
>> particular issue :) )
>> For the curious, a little information about the nodes searched is =
=20
>> presented below:
>> Page =3D page number
>> dup =3D number of entries reported as duplicates
>> nrec =3D number of entries that have no matching record
>> ncan =3D number of entries that had no visible records
>> nmis =3D number of entries who's key does not match with the key =
=20
>> generated from the record
>> nret =3D number of entries returned as valid entries
>> This is presented in search order
>> Page:137, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 35, nret:0
>> Page:138, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1000, nret:0
>> Page:181, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 0, nret:0 (empty)
>> Page:175, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 206, nret:465 (valid)
>> Page:136, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 515, nret:0
>> Page:219, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:147, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:174, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 2, nret:329 (valid)
>> Page:199, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 0, nret:0 (empty)
>> Page:173, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:214, dup:1, nrec:0, ncan:0, nmis:=
=20
>> 451, nret:0
>> Page:149, dup:1, nrec:0, ncan:0, nmis:=
=20
>> 459, nret:0
>> Page:139, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:156, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 0, nret:0 (empty)
>> Page:185, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:159, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:229, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 0, nret:0 (empty)
>> Page:164, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:165, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:186, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:191, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 0, nret:0 (empty)
>> Page:228, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 91, nret:0
>> Page:230, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 0, nret:0 (empty)
>> Page:234, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:235, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 0, nret:0 (empty)
>> Page:158, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:160, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:155, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 0, nret:0 (empty)
>> Page:151, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:162, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:236, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 0, nret:0 (empty)
>> Page:237, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:161, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 23, nret:0
>> Page:143, dup:1, nrec:0, ncan:0, nmis:=
=20
>> 506, nret:0
>> Page:140, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 1, nret:0
>> Page:142, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 24, nret:0
>> Page:141, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 0, nret:0 (empty)
>> Page:218, dup:0, nrec:0, ncan:0, nmis:=
=20
>> 0, nret:0 (empty)
>> Number of empty pages: 11
>> Number of pages with vaild entries: 2
>> Number of pages processed: 38
>> This query returned 794 rows, while 1000 rows were expected
>> /Lars-Erik
>
> --=20
> Falcon Storage Engine Mailing List
> For list archives: http://lists.mysql.com/falcon
> To unsubscribe: http://lists.mysql.com/falcon?unsub=3Dlars-erik.=
bjork@stripped
>