List:General Discussion« Previous MessageNext Message »
From:Jan Steinman Date:September 20 2012 3:59am
Subject:Re: Making Myself Crazy
View as plain text  
Thanks for your help, Rick!

Interspersed are some questions and rationales for you to shoot down... :-)

> From: Rick James <rjames@stripped>
> 
> s_product_sales_log has no PRIMARY KEY.  All InnoDB tables 'should' have an explicit
> PK.

This table really has no identifying information. There could be two identical, valid
rows, if the same person sold the same amount of the same product to the same other
person on the same day.

All the foreign keys were indexed. Is there something I don't understand about something a
PK field does? If an individual record cannot be uniquely identified by its information,
is there really any need for a primary key?

None the less, I added field "ID" as an unsigned autoincrement INT and made it PK.

> INT(5) is not what you think.  INT is always a 32-bit, 4-byte quantity, regardless of
> the number.

> 
> Use TINYINT UNSIGNED, SMALLINT UNSIGNED, etc. wherever reasonable.

Understood. I make all my keys UINT even when they could be smaller, because I've gotten
into some gnarly consistency problems. with foreign keys.

>> KEY `is_value_added` (`is_value_added`),
> A single-column INDEX on a flag, ENUM, and other low-cardinality field, is almost
> always useless.

Why is that? Surely, even a flag separates the record space into two?

> Performance issues...
>> WHERE  YEAR(sales.`Date`) = '{{{1}}}'
> won't use
>> KEY `Date` (`Date`),
> because the column (Date) is hidden in a function.  A workaround:
>   WHERE `Date` >= '{{{1}}}-01-01'
>     AND `Date` <  '{{{1}}}-01-01' + INTERVAL 1 YEAR

Thanks! Good catch.

> JOINing two subqueries -- There is no way to index either of them, so the JOIN will
> have to do a table scan of one temp table for every row of the other temp table.
> (The alternative is to CREATE TEMPORARY TABLE... with an index, for one of the
> subqueries.)

But I made sure the subqueries were the smallest possible sets -- essentially, the domain
of s_profit_centre, which only has 12 records.

I had the entire thing coded up into one massive JOIN of everything, and it took 30
minutes to run! By LEFT JOINing down to a dozen or fewer records, it seems to run in
reasonable time, even though it's two subqueries that are not indexed.

> It would probably be better to move the mt.tot!=0 test inside:
> 
>                   GROUP BY  `Profit Centre`
>           ) mt ON mt.pcid = tt.pcid
>           WHERE  mt.tot != 0
>   ) xx
> -->
>                   GROUP BY  `Profit Centre`
>                   HAVING tot != 0                   -- added
>           ) mt ON mt.pcid = tt.pcid
>           -- removed:   WHERE  mt.tot != 0
>   ) xx
> 
> That would make mt have fewer rows, hence that unindexed JOIN could run faster.

My first attempt to do that produced an error. And again, both the subqueries will have 12
or fewer records, so I'm wondering if this really helps anything.

Thanks for your help!

[clip]

----------------
:::: People see what they have been conditioned to see; they refuse to see what they don't
expect to see. -- Merle P. Martin
:::: Jan Steinman, EcoReality Co-op ::::




Thread
Re: Making Myself CrazyJan Steinman20 Sep
  • RE: Making Myself CrazyRick James20 Sep