List:General Discussion« Previous MessageNext Message »
From:Johnny Withers Date:May 7 2010 3:13pm
Subject:Re: Where to index - over 15m records and growing
View as plain text  
You could be running into this:

http://dev.mysql.com/doc/refman/5.0/en/how-to-avoid-table-scan.html



On Fri, May 7, 2010 at 10:05 AM, Peter Brawley
<peter.brawley@stripped>wrote:

> something tells me I need to rethink this yes.
>>
>
> If you were to add a computed column yearmonth, you could write WHERE
> yearmonth=201004.
>
> PB
>
> -----
>
> Chris Knipe wrote:
>
>> On Fri, May 7, 2010 at 10:42 AM, Rob Wultsch <wultsch@stripped> wrote:
>>
>>
>>
>>> Second thing:
>>> How selective is
>>> "WHERE IVAOData.TrackerTime >=UNIX_TIMESTAMP('2010-04-01 00:00:00')
>>>       AND IVAOData.TrackerTime <=UNIX_TIMESTAMP('2010-04-30 23:59:50') "
>>>
>>> Test by running
>>> SELECT COUNT(*)
>>> FROM  IVAOData
>>> WHERE IVAOData.TrackerTime >=UNIX_TIMESTAMP('2010-04-01 00:00:00')
>>>       AND IVAOData.TrackerTime <=UNIX_TIMESTAMP('2010-04-30 23:59:50')
>>>
>>> If this is a large proportion of the row count then you are probably
>>> in store for pain. It sounds like you are matching half the table. Big
>>> (intermediate) result sets often end in pain.
>>>
>>>
>>>
>>
>> At this stage, you are correct.  We have roughly 2 months worth of data in
>> the table and are selecting about half (one months worth), thus about 50%.
>> With 6 months worth of data in the table and selecting one months worth of
>> data, that's roughly 16% of the data - but it will still be a bulk large
>> result... Hmmm, something tells me I need to rethink this yes.
>>
>>
>>
>>
>>
>>> Third thing:
>>> My (rather sleepy) gut thinks your best bet is a a composite index on
>>> the table IVAOData on the columns TrackerTime and FlightID. This will
>>> make all access to the table in this query hit a covering index.
>>>
>>>
>>>
>>
>> Took over 12 hours to create the index on TrackerTime, and you're right -
>> I
>> should have seen and realised this.  I will drop the index on TrackerTime
>> and re-create it using both colums as I should have done in the first
>> place.
>>
>>
>>
>>
>>
>>> Fourth thing:
>>> What do you intend to ask the database with this query. This query
>>> smells overly broad
>>>
>>>
>>
>>
>> The idea is to get a count of the number of entries from Dep to Des during
>> the last month.  I.E.  How many records are there where Dep and Des are
>> the
>> same during the last month.  With some changes to the application that
>> captures the data in the first place, I should be able to drop the need
>> for
>> this query completely.  You have made me think a bit here and you're
>> right.
>> This is not the way to do it.
>>
>> I'll rethink this a bit more and come up with something better.
>>
>> PS - Started the query before my first email was even posted, it's still
>> running... 3948 Seconds the last time I checked...
>>
>>
>>
>>  ------------------------------------------------------------------------
>>
>>
>> No virus found in this incoming message.
>> Checked by AVG - www.avg.com Version: 8.5.437 / Virus Database:
>> 271.1.1/2859 - Release Date: 05/07/10 06:26:00
>>
>>
>>
>


-- 
-----------------------------
Johnny Withers
601.209.4985
johnny@stripped

Thread
Where to index - over 15m records and growingChris Knipe7 May
  • Re: Where to index - over 15m records and growingAnirudh Sundar7 May
    • Re: Where to index - over 15m records and growingChris Knipe7 May
      • Re: Where to index - over 15m records and growingRob Wultsch7 May
        • Re: Where to index - over 15m records and growingChris Knipe7 May
          • Re: Where to index - over 15m records and growingPeter Brawley7 May
            • Re: Where to index - over 15m records and growingJohnny Withers7 May
      • Re: Where to index - over 15m records and growingAnirudh Sundar8 May