List:General Discussion« Previous MessageNext Message »
From:Rob Date:June 12 2002 5:34pm
Subject:Re: Tables in MySQL
View as plain text  
Putting the efficiency issues aside to the extent we already have, I don't
see why you can't just use a simple join table for this purpose. You do have
the minor complication of symmetry, but with a simple ordering operation
against your original insertion join even this can be overcome.
Let's write some SQL:

create table product ( id int not null primary key );
create table order ( id int not null primary key );
create table order_item ( order_id int not null, product_id int not null );
create table product_co_order ( product1_id not null,product2_id not null );

insert into table product_co_order ( product1_id, product2_id )
select item1.product_id, item2.product_id
from order_item as item1, order_item as item2
where item1.order_id = item2.order_id
and item1.product_id != item2.product_id;

select max(product2_id) from product_co_order
where product1_id=[currentProductID]
group by product2_id
order by count(*)
limit 5;

In terms of the schema you'd probably either add a quantity field to
order_item as well as a uniqueness constraint on (order_id,product_id), or
you'd turn the '' into ' distinct', since you
probably want to weight by order instead of by quantity, since quantities
aren't really comparable across product lines. (The fact that people buy
notebooks three at a time doesn't mean notebooks relate more heavily to all
your other products...)

There are some improvements that can be made to asymptotic efficiencies
here, however. Right now the product_co_order table requires vast storage,
and the final select requires a group...order...limit over a potentially
very large result set, which might not be optimized as well as it could be-
there's a good chance the DB will spend a lot of time screwing around with
rows that won't end up in your result set. (A small optimization would be to
attempt to eliminate the symmetry in the the product_co_order table by
changing the product_id equality test in the insert to a strict comparison
(less than or greater than) and then adding some garbage to the final select
to look for a product id in either field and returning the 'other' one in
the result set, but we can do a lot better than that.)

If you'd like to keep tertiary information on the order relationships (like
how much time between the orders) you can keep the product_co_order table
permanently, but if not them you can just make it a temporary table and turn
the '' into a 'create temporary'. Either way,
you also keep a table with caches the total number of co_orders, with a name
somehow better than the only one I can think of right now:

create table product_relationships ( 
  product1_id int not null,
  product2_id not null,
  number_of_orders int not null

(And add a unique index on (product1_id,product2_id))
You'd then find relationships in the current order:

create temporary table temp_relationships
select max(product1_id) as product1_id,
       max(product2_id) as product2_id,
       count(*) as number_of_orders
from product_co_order
where product1_id=[currentProductID]
group by product1_id, product2_id;

(Although, again, just setting number_of_orders to 1 might be more

Then you'd update the main table to add in these figures, which requires a
bit of sleight-of-hand in MySQL:

lock tables product_relationships;

create temporary table rows_to_replace
select temp_relationships.product1_id, temp_relationships.product2_id,
       temp_relationships.number_of_orders +
from temp_relationships left join product_relationships
on temp_relationships.product1_id = product_relationships.product1_id
and temp_relationships.product2_id = product_relationships.product2_id;

replace into product_relationships select * from rows_to_replace;

unlock tables;

and now you get the efficient "what-else-did-others-buy" query:

select product2_id from product_relationships
where product1_id=[currentProductID]
order by number_of_orders
limit 5;

In a more robust environment you could probably turn this into a simple
"update product_relationships set number_of_orders=number_of_orders+1" with
a spiffy subquery, or even updating the number_of_orders by arbitrary
amounts with co-queries, and use multiple sub-queries to put all the work in
a single mammoth SQL command that doesn't even need locking, but in MySQL
the best you can do is trim down one of two of the temporary tables.

Hope the help outweighs the added confusion.


On 12/6/02 at 5:34 pm, <webmaster@stripped> wrote:

> Hi,
> im currently developing a Webshop system. One of the new features of it
> should be a "Costumers that bought this product also bought...." feature -
> concerning this im currently unsure how to implement it - (LAMPS) - my
> current thought is following:
> Costumer A buys Products with NO: 12, 13 , 25 -> system says OK, looks for
> Tables 12,13,25 -> finds nothing creates table 12, inserts 13 and 25 and
> sets sold of each to 1, then creates table 13 and 25 and inserts like it
> in table 12
> now cosumter B buys products 13,12,19 -> system says OK, looks for tables
> 13,12,19 and finds only 12 created, adds 19 into table 12 and updates sold
> from 13 in table 12 -> then does this with table 13 and finally creates
> table 19 (because new) and inserts like in Cosumter A's way...
> now the question: is MySQL aware of being with over 1500 tables ??? - is
> there a better way or more efficent way to do the same ?
> Hope sb. knows a trick :)
> Yours Sincerely
> Korbinian Bachl
> ---------------------------------------------------------------------
> Before posting, please check:
>   (the manual)
>           (the list archive)
> To request this thread, e-mail <mysql-thread111733@stripped>
> To unsubscribe, e-mail <>
> Trouble unsubscribing? Try:

Re: Tables in MySQLRob12 Jun