MySQL Lists are EOL. Please join:

List:General Discussion« Previous MessageNext Message »
From:AM Thomas Date:February 16 2005 10:20pm
Subject:Re: select where multiple joined records match
View as plain text  
Hi there,

I'll answer your questions below, but I should tell you that it looks like  
even three or four "joins" seem to make this too slow.  Theoretically, the  
user could select up to 15 or so GoalNumber values, 12 for Grade, and  
about 20 possibilities for Subject - clearly it wouldn't be useful to  
specify that many items in a search, but that's the theoretical maximum; 4  
or 5 values isn't unreasonable, though.

Four "AND"ed goal numbers plus a subject and a grade slowed the search (on  
the shared commercial web host I'm using) into the 3 minute range, and  
that's with a regular join, not a left join.  This is the SELECT that took  
about 3 minutes (3 trials, simplifying slightly each time, simplest given  
here):

select r.id from resources as r
   join resource_goals as g0 on (r.id=g0.ResourceID)
   join resource_goals as g1 on (r.id=g1.ResourceID and g1.GoalNumber=1)
   join resource_goals as g2 on (r.id=g2.ResourceID and g2.GoalNumber=2)
   join resource_goals as g3 on (r.id=g3.ResourceID and g3.GoalNumber=3)
   join resource_goals as g4 on (r.id=g4.ResourceID and g4.GoalNumber=4)
where ((g0.Subject='Social_Studies')
   and (g0.Grade='4th'))
group by r.id;

The fastest time was 2 min 48 sec.
Last time (simplest query) was 3 min 2 sec.


I'm really running out of time on this project, so I just went ahead and  
made the user interface such that users can only select one subject,  
grade, and/or goal number at a time.  It's probably a sound decision from  
a usability perspective, so I'm not too sad.  If I decide to make this  
work in the future, I'd probably have to just do a SELECT for each "AND"ed  
field, get the list of resource id's for each SELECT, then find the  
intersection of the lists in Perl.

If I could speed this up with some kind of indexing, I'd love to know  
about it.

The GROUP BY phrase is because I wanted just one row per resource.  It  
seemed like I'd get a row for each condition/resource (didn't test it with  
the final ).  I'm actually doing SELECT * FROM... in my code, and not  
using the resource_goals information in my output (that's a separate view  
at present, generated by different Perl code).

It seems to work fine without the NOT NULL parts, you're right.  I was  
wondering about that, but was sleepy enough at the time that I didn't  
trust my thinking.

Sorry about not including my table defs :-(  .  I guess I was just hoping  
for a general approach, and didn't realize that anyone would be interested  
enough to read all that detail and provide and exact solution for me.  Of  
course, now I realize that it would have simplified our discussion.   
Anyway, late but not never, and for help to whoever finds this in the list  
archives someday, here are my table defs (you're right - the goal number  
is a TINYINT):

(Below is an abridged version of the resources table ; it also contains  
about 60 more TINYINT fields which are essentially used as booleans, some  
of which I hope to eliminate.  Yes, I could have used SET or something,  
but I didn't for various reasons.)

CREATE TABLE resources (
   id INT UNSIGNED PRIMARY KEY,
   Title TEXT,
   ResourceType_THJHArticle TINYINT,
   ResourceType_NIEArticle TINYINT,
   DataEntryName TINYTEXT,
   Date DATETIME,
   Notes TEXT,
   Made_Keywords TEXT);

CREATE TABLE  resource_goals (
   goal_id INT UNSIGNED PRIMARY KEY,
   ResourceID INT,
   Grade TINYTEXT,
   Subject TINYTEXT,
   GoalNumber TINYINT,
   NumericGrade TINYINT);


Thanks a bunch for your help; I'm finding this more interesting than I  
thought I would.




On Wed, 16 Feb 2005 11:08:20 -0500, Michael Stassen  
<Michael.Stassen@stripped> wrote:

>
> AM Thomas wrote:
>
>> Guarded exclamations of success!
>>  This seems to be working for me so far - thank you!  Here's an actual   
>> example of a working query on my database; the field names are a  
>> little  different (sorry, I was trying to make my earlier example more  
>> readable).   The main change, though, is that I did plain 'join'  
>> instead of 'left  join', which seems to make it much faster.  It was  
>> pretty slow at first.
>
> Yes, LEFT JOIN does extra work, and it wasn't needed here.
>
>> mysql> select r.id, ga.Grade, ga.GoalNumber, gb.GoalNumber from  
>> resources  as r
>>        join resource_goals as ga on r.id=ga.ResourceID
>>        and ga.Grade='4th' and ga.GoalNumber='1'
>>        join resource_goals as gb on r.id=gb.ResourceID
>>        and gb.Grade='4th' and gb.GoalNumber='2'
>>        where ga.goal_id IS NOT NULL
>>        and gb.goal_id IS NOT NULL
>>        group by r.id;
>
> There are some strange things here, I think.
>
> * You've never shown us your table definitions, but I would have  
> expected GoalNumber to be an integer, not a string.  If so, you  
> shouldn't quote the numbers you compare it to.
>
> * I think it unlikely that you have rows with values in Grade,  
> GoalNumber, and ResourceID which have NULL for goal_id.  Isn't goal_id  
> the primary key?   If I'm right, you don't need your WHERE clause.
>
> * Why have you added "GROUP BY r.id"?  Was that an attempt to fix  
> something?   If we've got the query right, there should be no need for  
> grouping.  If you don't get the result you want without the GROUP BY,  
> then you should let us know, because that would mean we've missed  
> something.
>
> Putting those together, I'd expect
>
>    SELECT r.id, ga.Grade, ga.GoalNumber, gb.GoalNumber
>    FROM resources AS r
>    JOIN resource_goals AS ga
>       ON r.id = ga.ResourceID
>      AND ga.Grade = '4th'
>      AND ga.GoalNumber = 1
>    JOIN resource_goals AS gb
>       ON r.id = gb.ResourceID
>      AND gb.Grade = '4th'
>      AND gb.GoalNumber = 2
>
> to do the job.  Does it?
>
> Michael
>
>



-- 
Virtue of the Small / (919) 929-8687
Thread
select where multiple joined records matchAM Thomas13 Feb
  • Re: select where multiple joined records matchPeter Brawley13 Feb
    • Re: select where multiple joined records matchMichael Stassen13 Feb
  • Re: select where multiple joined records matchMichael Stassen13 Feb
    • more complexity (was: select where multiple joined records match)AM Thomas14 Feb
  • RE: select where multiple joined records matchGordon14 Feb
    • Re: select where multiple joined records matchAM Thomas14 Feb
      • Re: select where multiple joined records matchJeremy Cole14 Feb
        • Re: select where multiple joined records matchAM Thomas15 Feb
          • Re: select where multiple joined records matchMichael Stassen16 Feb
            • Re: select where multiple joined records matchAM Thomas16 Feb
              • Re: select where multiple joined records matchMathew Ray17 Feb
              • Re: select where multiple joined records matchMichael Stassen21 Feb
                • Re: select where multiple joined records matchAM Thomas21 Feb
        • Re: select where multiple joined records matchMichael Stassen16 Feb
Re: more complexity (was: select where multiple joined records match)Harald Fuchs14 Feb