List:Falcon Storage Engine« Previous MessageNext Message »
From:Lars-Erik Bjørk Date:May 5 2009 10:29am
Subject:Re: Corrupt indexes?
View as plain text  
Lars-Erik Bjørk wrote:
> Hi Ann!
> Thank you for the detailed response
> I will have to reproduce the bug one more time to get answers to some 
> of your
> questions. I did some testing with rebuilding the index yesterday, so the
> corrupted database is no longer corrupt :)
> I have in-line answers to some of your questions, though
> Ann W. Harrison wrote:
>> Lars-Erik,
>>> Sometimes we get too many rows returned when doing LIMIT queries. 
>>> F.ex my previous RQG run says:
>>> SELECT * FROM `E` WHERE `int_key` < 5 returns different result when 
>>> executed with predicate 'ORDER BY `int_key` LIMIT 1073741824' (1000 
>>> vs. 1459 rows)
>>> The tables has 1000 rows.
>> That's not nice.
>>> In IndexWalker::getValidatedRecord, I have added debug printouts, 
>>> the recordNumber and the pageNumber whenever we return a  valid 
>>> record. For the previous example, there are plenty of index pages 
>>> used, but only the three first of them contains valid nodes.
>> That's interesting too.  If I understand correctly, only three pages
>> contain entries for the current content of the table.  What do the other
>> pages contain?  Old versions of records?  Garbage?  Nothing?

Other, not older, sorry :)

> So far I am not sure what the older pages contain. The debug printouts 
> I have
> is when we start a LIMIT search (IndexRootPage::positionIndex), when 
> we move
> on to the next index page (IndexRootPage::repositionIndex), and when 
> we return
> a valid record number from IndexWalker::getValidatedRecord. So the only
> information I have about the 'empty' pages right now is that I see 
> that we
> reposition to them, and then move along to the next page.
> Just fyi:
> A change that I have in my sandbox, is that if we come across 
> BUCKET_END as the
> first entry in a page, we move on to the next page instead of stopping 
> the search.
>> If there are three "leaf nodes" - level 0 pages - there should be one
>> mostly empty level 1 page for a total of four pages, unless the table
>> was once much larger, or had key values with a ranges that varied with
>> time.  If the key values are evenly distributed over time, then there
>> could be empty pages at the end, assuming that the key values got
>> smaller (lower value, not just shorter) over time.  Can you print out
>> an unused page to see what's in it?
> As soon as the bug is reproduced, the database remains in this state, and
> I can run the query multiple times and experience the same results every
> time. Getting the information you asked for should be doable. I will 
> respond
> to that in an additional mail when I have dug a little deeper.
>>> The three first pages are page 154, 136, 213 (in that order).
>>> All the nodes returned as valid from pages 154 and 213 match with 
>>> the duplicate rows returned. Page 136 contains valid nodes for all 
>>> 1000 records in the table.
>> One page contains all the legitimate values, and it's the middle of
>> the three.  Hmmm. Doesn't exactly sound like a failed split, but it
>> could be.  Are the entries all ordered correctly within the page?
>> Is there any overlap between page 154 and 213?  I'm assuming that
>> 154 and 213 contain about 500 entries.  Is page 136 full?
> What I forgot to mention yesterday, is that the value of the indexed 
> field
> is set to the same value for every record. Within the page, the nodes 
> are however
> orderer according to record number. The are no overlap of the entries 
> in page 154
> and 213. Page 154 has about 400 'valid' entries, whereas page 213 has 
> about 25. How
> many 'unvalid' entries they have, I do not know. I don't know at the 
> present time if
> page 136 is full.
>>> This leads me to believe that our index is fishy. Some of the nodes 
>>> have two entries in the index, and these two entries are not even 
>>> adjacent ...
>> Nasty.
>>> I assume that if this is the case, we would not notice it during a 
>>> regular index search (not using LIMIT), because this would only 
>>> result in the same bit in the bitmap being set twice, during the 
>>> scan phase. I tried to test this by adding the following to 
>>> IndexRootPage::scanIndex
>>>            ASSERT (!bitmap->isSet(number));  // Added this
>>>            bitmap->set (number);
>>> And this asserted during the test.
>> Good test.
>>> Does this mean anything? Or are there valid conditions where this 
>>> assert may happen? I think this looks suspect.
>> No there are no valid cases that I know of where a single value appears
>> twice with the same record number.  The record number is part of the
>> upper level key (as you know) to distinguish duplicate values from each
>> other - reduces the index garbage collection cost.  But in this case,
>> as Kevin pointed out, garbage collection isn't going to work.
> This seems correct, as the database is left in this corrupted state. 
> It is not a transient
> bug (unless the occurrence window is VERY big :)).
> I will get back to you with more detailed answers later today.
>> Interesting indeed.
>> Best regards,
>> Ann
> Best,
> Lars-Erik

Corrupt indexes?Lars-Erik Bjørk4 May
  • Re: Corrupt indexes?Kevin Lewis4 May
  • Re: Corrupt indexes?Ann W. Harrison4 May
    • Re: Corrupt indexes?Lars-Erik Bjørk5 May
      • Re: Corrupt indexes?Lars-Erik Bjørk5 May
        • Re: Corrupt indexes?Lars-Erik Bjørk5 May
      • Re: Corrupt indexes?Ann W. Harrison5 May