From: Rick James Date: September 19 2012 9:19pm Subject: RE: Making myself crazy... List-Archive: http://lists.mysql.com/mysql/228202 Message-Id: <2E7DD7ADE53B044C8C8BCD9C5829E1EB148BB5E04F@SP2-EX07VS01.ds.corp.yahoo.com> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable Other comments: s_product_sales_log has no PRIMARY KEY. All InnoDB tables 'should' have an= explicit PK. INT(5) is not what you think. INT is always a 32-bit, 4-byte quantity, reg= ardless of the number. Use TINYINT UNSIGNED, SMALLINT UNSIGNED, etc. wherever reasonable. > KEY `is_value_added` (`is_value_added`), A single-column INDEX on a flag, ENUM, and other low-cardinality field, is = almost always useless. Slightly more readable: SELECT pc AS `Profit Centre`, CONCAT('$', FORMAT(prp, 2)) AS Paid, CONCAT('$', FORMAT(prt, 2)) AS Trade, CONCAT('$', FORMAT(prtot, 2)) AS `Both` FROM ( SELECT tt.`Profit Centre` AS pc, mt.Paid / tt.Hours AS prp, mt.Trade / tt.Hours AS prt, mt.tot / tt.Hours AS prtot FROM ( SELECT SUM(TIMESTAMPDIFF(MINUTE, tl.`in`, tl.`out`)/60) = AS Hours, pc.ID AS pcid, pc.name AS `Profit Centre` FROM s_timelog tl LEFT JOIN s_project proj ON tl.project_id =3D proj.id LEFT JOIN s_profit_centre pc ON proj.profit_centre =3D= pc.ID WHERE proj.private =3D 'no' AND YEAR(tl.`in`) =3D '{{{1}}}' GROUP BY `Profit Centre`) tt LEFT JOIN ( SELECT SUM(case when sales.Type NOT IN('barter', 'work t= rade') then `Total` else 0 end) AS Paid, SUM(case when sales.Type IN('barter', 'work trade')= then `Total` else 0 end) AS Trade, SUM(`Total`) AS tot, pc.ID AS pcid, pc.name AS `Pro= fit Centre` FROM s_product_sales_log sales LEFT JOIN s_product prod ON sales.Product =3D prod.ID LEFT JOIN s_profit_centre pc ON prod.profit_centre =3D= pc.ID WHERE YEAR(sales.`Date`) =3D '{{{1}}}' GROUP BY `Profit Centre` ) mt ON mt.pcid =3D tt.pcid WHERE mt.tot !=3D 0 ) xx Performance issues... > WHERE YEAR(sales.`Date`) =3D '{{{1}}}' won't use > KEY `Date` (`Date`), because the column (Date) is hidden in a function. A workaround: WHERE `Date` >=3D '{{{1}}}-01-01' AND `Date` < '{{{1}}}-01-01' + INTERVAL 1 YEAR JOINing two subqueries -- There is no way to index either of them, so the J= OIN will have to do a table scan of one temp table for every row of the oth= er temp table. (The alternative is to CREATE TEMPORARY TABLE... with an index, for one of = the subqueries.) It would probably be better to move the mt.tot!=3D0 test inside: GROUP BY `Profit Centre` ) mt ON mt.pcid =3D tt.pcid WHERE mt.tot !=3D 0 ) xx --> GROUP BY `Profit Centre` HAVING tot !=3D 0 -- added ) mt ON mt.pcid =3D tt.pcid -- removed: WHERE mt.tot !=3D 0 ) xx That would make mt have fewer rows, hence that unindexed JOIN could run fas= ter. > -----Original Message----- > From: Jan Steinman [mailto:Jan@stripped] > Sent: Tuesday, September 18, 2012 9:26 PM > To: Rick James > Cc: mysql@stripped > Subject: Re: Making myself crazy... > > Thanks for your help, Rick. > > The solution that seems to be working was to have both the FROM and > LEFT JOIN be grouped subqueries. I realize this is probably not the > best way of doing things, but it seems fast enough at present. > > Here is the query, followed by the table definitions, resulting in the > following data: > http://www.EcoReality.org/wiki/2012_gross_productivity > > -- (The following query is from a template, in which "{{{1}}}" is > replaced by a year, such as "2012".) SELECT > pc AS `Profit Centre`, > CONCAT('$', FORMAT(prp, 2)) AS Paid, > CONCAT('$', FORMAT(prt, 2)) AS Trade, > CONCAT('$', FORMAT(prtot, 2)) AS `Both` FROM (SELECT > tt.`Profit Centre` AS pc, > mt.Paid / tt.Hours AS prp, > mt.Trade / tt.Hours AS prt, > mt.tot / tt.Hours AS prtot > FROM > (SELECT > SUM(TIMESTAMPDIFF(MINUTE, tl.`in`, tl.`out`)/60) AS Hours, > pc.ID AS pcid, > pc.name AS `Profit Centre` > FROM s_timelog tl > LEFT JOIN > s_project proj ON tl.project_id =3D proj.id LEFT JOIN > s_profit_centre pc ON proj.profit_centre =3D pc.ID WHERE proj.private = =3D > 'no' AND YEAR(tl.`in`) =3D '{{{1}}}' > GROUP BY `Profit Centre`) tt > LEFT JOIN > (SELECT > SUM(case when sales.Type NOT IN('barter', 'work trade') then > `Total` else 0 end) AS Paid, > SUM(case when sales.Type IN('barter', 'work trade') then `Total` > else 0 end) AS Trade, > SUM(`Total`) AS tot, > pc.ID AS pcid, > pc.name AS `Profit Centre` > FROM s_product_sales_log sales > LEFT JOIN > s_product prod ON sales.Product =3D prod.ID > LEFT JOIN > s_profit_centre pc ON prod.profit_centre =3D pc.ID > WHERE YEAR(sales.`Date`) =3D '{{{1}}}' > GROUP BY `Profit Centre`) mt > ON mt.pcid =3D tt.pcid > WHERE mt.tot !=3D 0) xx > > --"s_timelog" and "s_product_sales_log" are the two "ends" of the > query, each related by an intermediate table to the "centre" of the > query, "s_profit_centre": > -- s_timelog.project_id --> s_project.id, s_project.profit_centre --> > s_profit_centre.ID > -- s_product_sales_log.Product --> s_product.ID, > s_product.profit_centre --> s_profit_centre.ID > > ---------------- > > -- 18,739 rows, growing by ~100 records monthly CREATE TABLE > `s_timelog` ( > `id` int(10) unsigned NOT NULL auto_increment, > `mw_user_id` int(11) unsigned NOT NULL default '2', > `in` datetime NOT NULL, > `out` datetime NOT NULL, > `project_id` int(11) unsigned NOT NULL, > `activity_id` int(11) unsigned NOT NULL, > `accomplishment` varchar(255) character set utf8 NOT NULL, > `Paid` enum('yes','no') character set utf8 NOT NULL default 'no' > COMMENT 'hours that are compensated', > `Class_B_qualified` enum('yes','no') character set utf8 NOT NULL > default 'no', > `updated` timestamp NULL default CURRENT_TIMESTAMP on update > CURRENT_TIMESTAMP, > PRIMARY KEY (`id`), > KEY `mw_user_id` (`mw_user_id`), > KEY `in` (`in`), > KEY `project_id` (`project_id`), > KEY `activity_id` (`activity_id`), > KEY `out` (`out`) > ) ENGINE=3DInnoDB DEFAULT CHARSET=3Dutf8 COLLATE=3Dutf8_bin > COMMENT=3D'EcoReality time log.' AUTO_INCREMENT=3D19462 ; > -- > -- RELATIONS FOR TABLE `s_timelog`: > -- `user_id` > -- `mw_user` -> `user_id` > -- `mw_user_id` > -- `mw_user` -> `user_id` > -- `project_id` > -- `s_project` -> `id` > -- `activity_id` > -- `s_activity` -> `id` > > ---------------- > > -- 143 rows, growing by one or two a month CREATE TABLE IF NOT EXISTS > `s_project` ( > `id` int(6) unsigned NOT NULL auto_increment, > `profit_centre` int(10) unsigned NOT NULL default '0', > `profit_centre_amortized` int(3) unsigned default '0' COMMENT 'for > capital projects', > `name` varchar(62) NOT NULL, > `URL` varchar(255) default NULL, > `need` varchar(255) NOT NULL, > `description` text NOT NULL, > `super_id` int(11) unsigned NOT NULL default '0', > `type` enum('capital','operations') NOT NULL default 'operations', > `duration` enum('fixed','ongoing') NOT NULL default 'fixed', > `status` enum('abandoned','agreed','cancelled','completed','on > hold','ongoing','requested') NOT NULL default 'requested', > `start` date NOT NULL, > `end` date NOT NULL, > `completed` date NOT NULL, > `depends_on_id` int(6) unsigned NOT NULL default '0', > `budget_expense` decimal(7,2) NOT NULL, > `budget_effort` decimal(7,2) NOT NULL, > `effort_units` enum('minutes','hours','days','weeks','months') NOT > NULL default 'hours', > `steward_id` int(5) unsigned NOT NULL default '0', > `private` enum('yes','no') NOT NULL default 'no', > `updated` timestamp NULL default CURRENT_TIMESTAMP on update > CURRENT_TIMESTAMP, > `notes` text NOT NULL, > PRIMARY KEY (`id`), > UNIQUE KEY `name` (`name`), > KEY `duration` (`duration`), > KEY `status` (`status`), > KEY `super_id` (`super_id`,`depends_on_id`,`steward_id`), > KEY `depends_on_id` (`depends_on_id`,`steward_id`), > KEY `steward_id` (`steward_id`), > KEY `profit_centre` (`profit_centre`), > KEY `profit_centre_amortized` (`profit_centre_amortized`) > ) ENGINE=3DInnoDB DEFAULT CHARSET=3Dutf8 COMMENT=3D'EcoReality (and othe= r) > projects that require resources.' AUTO_INCREMENT=3D145 ; > > -- > -- RELATIONS FOR TABLE `s_project`: > -- `steward` > -- `mw_user` -> `user_id` > -- `super_id` > -- `s_project` -> `id` > -- `depends_on_id` > -- `s_project` -> `id` > -- `steward_id` > -- `mw_user` -> `user_id` > -- `profit_centre` > -- `s_profit_centre` -> `ID` > -- `profit_centre_amortized` > -- `s_profit_centre` -> `ID` > > ---------------- > > -- 12 records, growing by one or two a year CREATE TABLE IF NOT EXISTS > `s_profit_centre` ( > `ID` int(3) unsigned NOT NULL, > `name` varchar(255) NOT NULL, > `description` text NOT NULL, > `steward` int(5) unsigned NOT NULL, > `updated` timestamp NOT NULL default CURRENT_TIMESTAMP on update > CURRENT_TIMESTAMP, > `notes` text NOT NULL, > PRIMARY KEY (`ID`), > KEY `steward` (`steward`) > ) ENGINE=3DInnoDB DEFAULT CHARSET=3Dutf8; > > -- > -- RELATIONS FOR TABLE `s_profit_centre`: > -- `steward` > -- `mw_user` -> `user_id` > > > ---------------- > > -- 3,090 rows, growing by a few hundred monthly CREATE TABLE > `s_product_sales_log` ( > `Venue` enum('EcoReality grounds','farm gate','delivered','Tuesday > market','roadside','Saturday market','store','subscription','Yellow > House kitchen') character set utf8 NOT NULL default 'farm gate', > `Seller` int(11) unsigned NOT NULL default '212', > `Who` int(11) unsigned NOT NULL default '0', > `Date` date NOT NULL, > `Quantity` decimal(8,3) NOT NULL, > `Unit` > enum('bale','box','bunch','dozen','each','grams','kilograms','liters',' > ounces','pints','pounds','quarts') character set utf8 collate utf8_bin > NOT NULL default 'each', > `Total` decimal(8,2) NOT NULL, > `Type` enum('barter','cash','check','invoice','PayPal','work trade') > character set utf8 NOT NULL default 'cash', > `Product` int(11) unsigned NOT NULL, > KEY `Product` (`Product`), > KEY `Date` (`Date`), > KEY `Seller` (`Seller`), > KEY `Who_2` (`Who`) > ) ENGINE=3DInnoDB DEFAULT CHARSET=3Dlatin1; > > -- > -- RELATIONS FOR TABLE `s_product_sales_log`: > -- `Seller` > -- `mw_user` -> `user_id` > -- `Who` > -- `mw_user` -> `user_id` > -- `Product` > -- `s_product` -> `ID` > > ---------------- > > -- 186 records, growing by ~1 monthly > CREATE TABLE IF NOT EXISTS `s_product` ( > `ID` int(10) unsigned NOT NULL auto_increment, > `super` int(11) unsigned default NULL COMMENT 'generalization', > `name` varchar(31) character set utf8 NOT NULL, > `units` > enum('kilograms','grams','pounds','ounces','liters','each','cords','bal > es') character set utf8 NOT NULL default 'kilograms' COMMENT 'preferred > unit', > `profit_centre` int(3) unsigned NOT NULL default '3', > `tax_qualified` enum('yes','no') character set utf8 NOT NULL default > 'yes', > `tax_livestock_born` enum('yes','no') character set utf8 NOT NULL > default 'no', > `tax_poultry_egg` enum('yes','no') character set utf8 NOT NULL > default 'no', > `is_value_added` enum('yes','no') character set utf8 NOT NULL default > 'no', > `plant_ID` int(5) unsigned default NULL, > `description` varchar(255) character set utf8 NOT NULL, > PRIMARY KEY (`ID`), > UNIQUE KEY `Name` (`name`), > KEY `Description` (`description`), > KEY `is_value_added` (`is_value_added`), > KEY `profit_centre` (`profit_centre`), > KEY `super` (`super`), > KEY `plant_ID` (`plant_ID`) > ) ENGINE=3DInnoDB DEFAULT CHARSET=3Dutf8 COLLATE=3Dutf8_bin COMMENT=3D'l= ist of > EcoReality farm products' AUTO_INCREMENT=3D186 ; > > -- > -- RELATIONS FOR TABLE `s_product`: > -- `super` > -- `s_product` -> `ID` > -- `profit_centre` > -- `s_profit_centre` -> `ID` > -- `plant_ID` > -- `s_plants` -> `ID` > > > On 17 Sep 12, at 16:12, Rick James wrote: > > > If the subquery-version is not too slow, live with it. > > > > If necessary, make your non-grouped SELECT a subquery and apply GROUP > BY outside. Thus: > > > > SELECT ... FROM ( SELECT non-grouped... ) GROUP BY ... > > > > Could you provide that; we might be able to simplify it. > > Also provide SHOW CREATE TABLE for each table. > > How many rows in each table? (approx) > > > >> -----Original Message----- > >> From: Jan Steinman [mailto:Jan@stripped] > >> Sent: Sunday, September 16, 2012 3:45 PM > >> To: mysql@stripped > >> Subject: Making myself crazy... > >> > >> I'm having trouble figuring out how to make a query. It seems > simple, > >> but it's making me crazy right now. Please point out where my > >> thinking is addled... > >> > >> I have the following (simplified) table structure: > >> > >> s.timelog --> s.projects --> s.profit_centres > >> > >> s.product.sales --> s.products --> s.profit_centres > >> > >> (The arrows refer to many-to-one relationships: each Timelog record > >> refers to a single Project, which is in a single Profit Centre. Each > >> record has an opaque ID referred to by records in the table to its > >> left.) > >> > >> What I want to do is figure out productivity: sales per hour worked > >> per Profit Centre. > >> > >> I can do this at a gross level -- without grouping by Profit Centres > >> -- with a subquery: simply sum up the Lales and divide by the sum of > >> the labour (Timelog.out - Timelog.in). But I suspect even this can > be > >> done without a subquery. > >> > >> But needing to do two levels of indirection has stymied me: how can > I > >> group $/hour by Profit Centre? > >> > >> You can see the SQL here if you wish: > >> http://www.ecoreality.org/wiki/Template:Annual_gross_productivity > >> _for > >> > >> Then click on the "Source" link to see the code. (You won't be able > >> to change it without logging in.) > >> > >> I'm using CASE to pivot tables to separate out monetary versus > >> barter/trade income. > >> > >> Thanks for any help offered! > >> > >> ---------------- > >> :::: Respond with love and compassion, rather than punishment and > >> retaliation, and an angry person will be disarmed. Trade the need to > >> be right for a loving relationship. It is a trade you - and everyone > >> around you - will benefit from. -- Dean Van Leuven > >> :::: Jan Steinman, EcoReality Co-op :::: > >> > >> > >> > >> > >> > >> -- > >> MySQL General Mailing List > >> For list archives: http://lists.mysql.com/mysql > >> To unsubscribe: http://lists.mysql.com/mysql > > > > ---------------- > :::: She can't sing in more languages than most people can't speak in. > -- JWS > :::: Jan Steinman, EcoReality Co-op :::: > > >