List:General Discussion« Previous MessageNext Message »
From:Peter Brawley Date:March 28 2006 9:07pm
Subject:Re: Customer Recommendation Query
View as plain text  
Brian,

>Let's say a member completes action 'abc'. We want to query the
>member_actions table for all members that have also completed action 'abc'.
>We then want to determine what the top 3 other actions are that were
>completed by members who have completed action 'abc', while making sure that
>only actions that have not already been completed by this member are
>considered.

Ordered groupwise quotas. For a subquery-free two-table example see 
http://www.artfulsoftware.com/queries.php#18.

PB

-----

Brian Erickson wrote:
> We are looking for some help with queries that will accomplish a similar
> feature to what Amazon does. When you purchase a product, Amazon looks at
> all other people who have purchased that product, and then looks at all of
> the OTHER products those people have purchased, and uses that data to
> suggest related products to you. That's essentially what we are trying to
> do.
>
> We have 3 tables: members, actions, and member_actions. The 'members' table
> tracks all of our customers, the 'actions' table tracks all of the different
> actions each member can complete, and the 'member_actions' table is the weak
> entity link that tracks which actions each member have completed.
>
> Let's say a member completes action 'abc'. We want to query the
> member_actions table for all members that have also completed action 'abc'.
> We then want to determine what the top 3 other actions are that were
> completed by members who have completed action 'abc', while making sure that
> only actions that have not already been completed by this member are
> considered.
>
> We are using MySQL version 3.23. There are approximately 500 unique rows in
> the 'actions' table and 2,000,000 rows in the member_actions table, with
> 3,000+ actions being recorded at any given time.
>
> Is it possible to achieve this functionality with one/few queries? The
> statistics above may be important because if a query takes too long to
> execute, the server may not be physically capable of executing that query
> 3,000+ times simultaneously.
>
> Another option we have considered is to create a separate table called
> 'correlation' with two fields: action and correlated_action. We would then
> populate this table in a batch process following the pseudocode below.
>
> SELECT DISTINCT(action) FROM member_actions Loop
>     SELECT DISTINCT(member) FROM member_actions WHERE action = x
>     Loop
>         SELECT DISTINCT(action) FROM member_actions WHERE member = y AND
> action <> x
>         Loop
>             INSERT INTO correlation (action, correlated_action) VALUES (x,
> z)
>
> Then we could easily query this table to find correlated actions like so:
>
> SELECT DISTINCT(correlated_action), COUNT(*) AS count FROM correlation WHERE
> action = x GROUP BY correlated_action ORDER BY count DESC
>
> This would not solve the issue of only returning actions that the member had
> not already completed, but that could probably be accomplished by simply
> joining the correlation table back to the member_actions table.
>
> So, our question is whether or not this is feasible with a one/few query
> approach, or if this is something that should be accomplished with something
> similar to the approach above? Can anyone provide a good start for us?
>
>   
> ------------------------------------------------------------------------
>
> No virus found in this incoming message.
> Checked by AVG Free Edition.
> Version: 7.1.385 / Virus Database: 268.3.2/294 - Release Date: 3/27/2006
>   

Attachment: [text/html]
No virus found in this outgoing message. Checked by AVG Free Edition. Version: 7.1.385 / Virus Database: 268.3.2/294 - Release Date: 3/27/2006
Thread
Customer Recommendation QueryBrian Erickson28 Mar
  • Re: Customer Recommendation Querymysql28 Mar
  • Re: Customer Recommendation QueryPeter Brawley28 Mar
  • Re: Customer Recommendation QueryGabriel PREDA29 Mar