List:General Discussion« Previous MessageNext Message »
From:hsv Date:August 22 2012 2:07am
Subject:Re: help with correlated subquery
View as plain text  
>>>> 2012/08/21 16:35 -0600, Larry Martell >>>>
I am trying to write a query that selects from both a correlated
subquery and a table in the main query, and I'm having a lot of
trouble getting the proper row count. I'm sure this is very simple,
and I'm just missing it. I'll try and present a simple example. For
this example, there are 27 rows, organized like this:


mysql> select count(*), target_name_id, ep, wafer_id from data_cst
where target_name_id = 44 group by target_name_id, ep, wafer_id;
+----------+----------------+------+----------+
| count(*) | target_name_id | ep   | wafer_id |
+----------+----------------+------+----------+
|        6 |             44 | 1,1  |       16 |
|        3 |             44 | 1,1  |       17 |
|        6 |             44 | 1,2  |       16 |
|        3 |             44 | 1,2  |       17 |
|        6 |             44 | 1,3  |       16 |
|        3 |             44 | 1,3  |       17 |
+----------+----------------+------+----------+
6 rows in set (0.00 sec)

I need to get an average of a column grouped by target_name_id, ep as
well as the average of the averages grouped by target_name_id, ep,
wafer_id, and I also need the count of the rows in the target_name_id,
ep group. My query is getting the correct averages, but incorrect row
counts:

mysql> select count(*), target_name_id, ep, avg(bottom), avg(averages)
from (select avg(bottom) as averages, target_name_id as t, ep as e
from data_cst where target_name_id = 44 group by target_name_id, ep,
wafer_id) x, data_cst where target_name_id = t and ep = e group by
target_name_id, ep;
+----------+----------------+------+-------------+-----------------+
| count(*) | target_name_id | ep   | avg(bottom) | avg(averages)   |
+----------+----------------+------+-------------+-----------------+
|       18 |             44 | 1,1  |  21.8056667 |  21.85458330000 |
|       18 |             44 | 1,2  | 121.7984444 | 121.83983335000 |
|       18 |             44 | 1,3  | 349.7634444 | 349.75016665000 |
+----------+----------------+------+-------------+-----------------+
3 rows in set (0.01 sec)

The count for each row should be 9. What do I need in my count() to be
counting the right thing?
<<<<<<<<
Your trouble lys in the joining; in effect, you are joining a row with wafer_id 16 with a
row with wafer_id 17, and also a row with wafer_id 17 with a row with wafer_id 16.

A further advantage to using the now standard form of joining, as Rick James bids you do,
is that one can add further conditions to it:

select count(*), target_name_id, ep, avg(bottom), avg(averages)
from (select avg(bottom) as averages, target_name_id, ep
	from data_cst
	where target_name_id = 44
	group by target_name_id, ep, wafer_id) x
JOIN data_cst
ON target_name_id = x.target_name_id and ep = x.ep and wafer_id < x.wafer_id
group by target_name_id, ep

The inequality, maybe, will give you what you want.

Thread
help with correlated subqueryLarry Martell21 Aug
  • RE: help with correlated subqueryRick James21 Aug
    • Re: help with correlated subqueryLarry Martell22 Aug
  • Re: help with correlated subqueryhsv22 Aug
    • Re: help with correlated subqueryLarry Martell22 Aug
      • RE: help with correlated subqueryRick James22 Aug
      • Re: help with correlated subqueryLarry Martell23 Aug
        • Re: help with correlated subqueryLarry Martell23 Aug
Re: help with correlated subqueryLarry Martell22 Aug
RE: help with correlated subqueryMartin Gainty22 Aug
  • Re: help with correlated subqueryShawn Green22 Aug
    • Re: JOIN, JOIN, JOINhsv23 Aug