From: Jan Steinman Date: September 20 2012 3:59am Subject: Re: Making Myself Crazy List-Archive: http://lists.mysql.com/mysql/228206 Message-Id: <377BFE80-9550-4A50-9FC9-A12CFF7ACFA4@bytesmiths.com> MIME-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Thanks for your help, Rick! Interspersed are some questions and rationales for you to shoot down... = :-) > From: Rick James >=20 > s_product_sales_log has no PRIMARY KEY. All InnoDB tables 'should' = have an explicit PK. This table really has no identifying information. There could be two = identical, valid rows, if the same person sold the same amount of the = same product to the same other person on the same day. All the foreign keys were indexed. Is there something I don't understand = about something a PK field does? If an individual record cannot be = uniquely identified by its information, is there really any need for a = primary key? None the less, I added field "ID" as an unsigned autoincrement INT and = made it PK. > INT(5) is not what you think. INT is always a 32-bit, 4-byte = quantity, regardless of the number. >=20 > Use TINYINT UNSIGNED, SMALLINT UNSIGNED, etc. wherever reasonable. Understood. I make all my keys UINT even when they could be smaller, = because I've gotten into some gnarly consistency problems. with foreign = keys. >> KEY `is_value_added` (`is_value_added`), > A single-column INDEX on a flag, ENUM, and other low-cardinality = field, is almost always useless. Why is that? Surely, even a flag separates the record space into two? > Performance issues... >> WHERE YEAR(sales.`Date`) =3D '{{{1}}}' > won't use >> KEY `Date` (`Date`), > because the column (Date) is hidden in a function. A workaround: > WHERE `Date` >=3D '{{{1}}}-01-01' > AND `Date` < '{{{1}}}-01-01' + INTERVAL 1 YEAR Thanks! Good catch. > JOINing two subqueries -- There is no way to index either of them, so = the JOIN will have to do a table scan of one temp table for every row of = the other temp table. > (The alternative is to CREATE TEMPORARY TABLE... with an index, for = one of the subqueries.) But I made sure the subqueries were the smallest possible sets -- = essentially, the domain of s_profit_centre, which only has 12 records. I had the entire thing coded up into one massive JOIN of everything, and = it took 30 minutes to run! By LEFT JOINing down to a dozen or fewer = records, it seems to run in reasonable time, even though it's two = subqueries that are not indexed. > It would probably be better to move the mt.tot!=3D0 test inside: >=20 > GROUP BY `Profit Centre` > ) mt ON mt.pcid =3D tt.pcid > WHERE mt.tot !=3D 0 > ) xx > --> > GROUP BY `Profit Centre` > HAVING tot !=3D 0 -- added > ) mt ON mt.pcid =3D tt.pcid > -- removed: WHERE mt.tot !=3D 0 > ) xx >=20 > That would make mt have fewer rows, hence that unindexed JOIN could = run faster. My first attempt to do that produced an error. And again, both the = subqueries will have 12 or fewer records, so I'm wondering if this = really helps anything. Thanks for your help! [clip] ---------------- :::: People see what they have been conditioned to see; they refuse to = see what they don't expect to see. -- Merle P. Martin :::: Jan Steinman, EcoReality Co-op ::::