Oh, that's an entirely different problem than the one I understood from your
Your new explanation is clearer but I'm still not sure I follow all of it.
The formatting/wrapping of the example makes things a bit fuzzy too. Am I
right in assuming that on the first row of the result, the OperatorID is
2661, the QAID is 5334 and the NTID is JulieAnt? If so, is JulieAnt the
reviewer? Also are the QAID and the NTID different ways of saying the same
thing, such as the fact that JulieAnt's employee number is 5334, or is it
just a coincidence that they are the same on both rows?
I'm assuming that a Batch has two different operators if one shift starts
the batch but the batch isn't finished by the time the shift ends so a
second operator finishes the Batch. Will you ever have cases where a Batch
takes so long that it takes more than two shifts to finish it? For instance,
would you ever see a case where the company gets an order for a billion
grapple grommets and it takes 9 shifts to finish it? I'm trying to figure
out if you also need to worry about batches that are split over more than
Does the Batch value uniquely identify a particular Batch or is it the
combination of Batch and KeyDate?
Ok, assuming for the moment that Batch alone uniquely identifies a
particular Batch, and that you care about Batches that are split across two
*or more* shifts, you need a query like this to find those Batches:
select Batch, count(*) as count
group by Batch
having count >= 2;
That will return the Batch numbers alone.
Then, you simply do a second Select to get the other properties of the Batch
that you care about. That query would look very much like the one in your
example except that you would do it like so:
SELECT ID, Batch, KeyDate, OperatorID, QAID, NTID
WHERE Batch in ('439584414', '123456', '999444');
In other words, you'd put a list consisting of all of the values returned
from the one that got the Batch numbers into the IN clause.
Now, that could be a little tedious if there were a lot of 'split' batches
and you may also run into an issue if the number of of values in the IN
clause exceeded MySQL's limit, whatever that is. (Or maybe MySQL has a limit
on the maximum *length* of the IN clause; if it does, that could bite you
too.) To get around that, you could just repeat the select with the IN
clause for subsets of the results from the first query or even just do your
original query once for each of the values returned by the counting query.
A much better alternative would be to use a subquery but that is only an
option if you are running MySQL 4.1 or later. That would let you combine
both queries together into a single big query. Are you running V4.1 or
later? If you are, I could take a stab at the combined query....
----- Original Message -----
From: "Scott Hamm" <linuxgold@stripped>
To: "Rhino" <rhino1@stripped>
Cc: "Mysql" <mysql@stripped>
Sent: Tuesday, April 05, 2005 11:14 AM
Subject: Re: Duplicated records
> Sorry for the confusion. I don't mean a duplicated records, but to
> find duplicated Batch where two different operators worked on a single
> batch (one started off, then another one to finish the batch) and a
> single reviewer to review a batch. I need a list that lists duplicated
> Batches with different operators that worked on that batch.
> For example: (for clarification)
> SELECT ID, Batch, KeyDate, OperatorID, QAID, NTID
> FROM QA
> WHERE Batch='439584414';
> ID Batch KeyDate OperatorID
> QAID NTID
> 90577 439584414 2004-10-03 00:00:00 2661 5334 JulieAnt
> 90575 439584414 2004-10-03 00:00:00 5657 5334 JulieAnt
> On Apr 5, 2005 10:54 AM, Rhino <rhino1@stripped> wrote:
> > ----- Original Message -----
> > From: "Scott Hamm" <linuxgold@stripped>
> > To: "Mysql" <mysql@stripped>
> > Sent: Tuesday, April 05, 2005 10:05 AM
> > Subject: Duplicated records
> > > Here is my novice question:
> > > I need to find duplicates in "Batch" category when I issued this
> > >
> > > SELECT * FROM QA WHERE Batch=439584414;
> > >
> > > Result:
> > > 90577
> > > 1 2661 5334 JulieAnt 25 5 5 439584414 2004-10-03 00:00:00 2004-10-03
> > > 00:00:00 0 90575
> > > 1 5657 5334 JulieAnt 25 5 5 439584414 2004-10-03 00:00:00 2004-10-03
> > 00:00:00 0
> > >
> > > How do I issue a query that finds duplicated Batch number?
> > >
> > First of all, I think you made a mistake when putting this data into
> > note, a mistake which is compounded by the wrapping that the email
> > did. I *think* you meant to display two rows with the '90577' value at
> > end of the second row, not before the first row. This makes the example
> > bit confusing and hard to follow.
> > The two rows you provide in your example are not duplicates of each
> > Even assuming that the '90577' actually belongs at the end of the second
> > rather than the start of the first row, the two rows are different: the
> > first row ends in '90575', not '90577' and the second value in each row
> > also different: '2661' and '5657'. Therefore, you're not really trying
> > find duplicate records because a duplicate row would be one that is
> > identical in every column, not just identical in several columns. I hope
> > that doesn't sound like I am splitting hairs; maybe I am. ;-)
> > Anyway, the standard method for finding duplicates of rows goes like
> > and can be adapted for your situation. Let's say that you have a table
> > called Foo with columns foo1, foo2, and foo3 and you want to find all of
> > duplicate rows in the table, i.e. any row whose foo1, foo2, and foo3
> > is identical to the foo1, foo2, and foo3 value of another table. You
> > write this query:
> > select foo1, foo2, foo3, count(*) as num
> > from Foo1
> > group by foo1, foo2, foo3
> > having num > 1
> > order by num;
> > Your result might look like this:
> > foo1 foo2 foo3 num
> > Daffy D Duck 1
> > Bugs B Bunny 1
> > Elmer A Fudd 3
> > This result means that Daffy D Duck and Bugs B Bunny each occur once in
> > table but Elmer A Fudd occurs three times.
> > In the case you mentioned in your email, where the rows aren't complete
> > duplicates of one another, just write the query so that you omit the
> > which are allowed to be different. In your case, that would appear to be
> > but the second and last columns.
> > However, I think if you get to the point of having duplicate records in
> > table when you don't want them, finding them is like shutting the barn
> > after the horse is already gone. I think you need to reconsider your
> > specifically your primary key, and change that primary key to *PREVENT*
> > duplicate keys in the first place.
> > For instance, in the case of the Foo1 table, I can prevent duplicate
> > choosing a good primary key; in this case I would choose the combination
> > all three columns in the table; that would ensure that I never stored
> > than 1 person named Elmer A Fudd. [Actually, that's not a great example!
> > might be perfectly valid to have two people named Elmer A Fudd - or John
> > Smith - so I probably need to add additional columns to my table to
> > that the rows are unique; something like Social Security Number which is
> > (supposed to be) unique would be ideal for this purpose.]
> > Rhino
> > --
> > No virus found in this outgoing message.
> > Checked by AVG Anti-Virus.
> > Version: 7.0.308 / Virus Database: 266.9.2 - Release Date: 05/04/2005
> Power to people, Linux is here.
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe: http://lists.mysql.com/mysql?unsub=1
> No virus found in this incoming message.
> Checked by AVG Anti-Virus.
> Version: 7.0.308 / Virus Database: 266.9.2 - Release Date: 05/04/2005
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
Version: 7.0.308 / Virus Database: 266.9.2 - Release Date: 05/04/2005