MySQL Lists are EOL. Please join:

List:General Discussion« Previous MessageNext Message »
From:SGreen Date:July 22 2004 4:06pm
Subject:Re: Database design questions
View as plain text  
Big post = long response. Thank you for your patience.

When it comes to DB design I like to first identify the basic objects: 
List owners (the people who manage a group), subscribers (the people in 
the group), and the groups themselves. I think you equate a list owner to 
a "User" of your service:

CREATE TABLE user (
        ID int auto_increment not null
        , FullName varchar(25) not null
        , Email varchar(255) not null
        , Password varchar(20) not null
        ..... additional fields for business purposes
        , KEY (ID)
        , PRIMARY KEY (...)
)

For your Primary key I would pick either Email or if you have a different 
Login field, use that.

You have a "subscriber", this is someone on one of the mailing lists. This 
person may or may not be one of your users. Those are separate identities.

CREATE TABLE subscriber (
        ID int auto_increment not null
        , EmailAlias varchar(255) not null
        , EmailAddress varchar(255) not null
        , DoNotMail tinyint not null default 0
        ... other fields as needed
        , KEY(ID)
        , PRIMARY KEY (EmailAddress)
)

The "alias" portion of your email address is "Scott Haneda". The address 
portion is "lists@stripped". The primary key prevents the same address 
from being listed more than once. If you want each user to have their own 
independent subscriber list you would need a table like:

CREATE TABLE subscriber (
        ID int auto_increment not null
        , user_ID int not null
        , EmailAlias varchar(255) not null
        , EmailAddress varchar(255) not null
        , DoNotMail tinyint not null default=0
        ... other fields as needed
        , KEY(ID)
        , KEY(user_Id)
        , PRIMARY KEY (EmailAddress, user_ID)
)

This would let multiple users choose to include someone in one of their 
lists independently of any other user's list.

You have a "group":

CREATE TABLE group (
        ID int auto_increment not null
        , user_ID int not null
        , Name varchar(32) not null
        , KEY(ID)
        , KEY(user_ID, ID)
        , PRIMARY KEY (Name, user_ID)
)

The compound key (user_ID, ID) will let the engine collect a list of all 
of the IDs for a user without actually reading the table,it will get the 
data directly from the index. The primary key prevents any user from 
creating the two groups with the same names but allows another user to 
re-use that name.

When a subscriber is added to a group, that is a "subscription". They are 
subscribing to a mailing list:

CREATE TABLE subscription (
        subscriber_ID int not null
        , group_ID int not null
        , Unsubscribe tinyint not null default 0
        , KEY (subscriber_Id, group_ID, unsubscribe)
        , KEY (group_ID, subscriber_Id, unsubscribe)
        , PRIMARY KEY(group_ID, subscriber_ID)
)

The primary key will prevent the same subscriber from ending up in the 
same group more than once. The other compound keys (subscriber_Id, 
group_ID, unsubscribe) and (group_ID, subscriber_Id, unsubscribe) are 
designed to answer the two most critical questions you have:
        a) list all of the active/inactive subscribers for a group
        b) list all of the groups in which a subscriber is active/inactive

Yes, it duplicates your data storage but since this is the "central" table 
to your entire application, I firmly believe that the performance boost 
will far outweigh the need for extra disk space.

Now, you asked how to bulk insert subscribers (based on uploaded lists) so 
that you preserve the "Unsubscribe" and "DoNotMail" flags. It seems that a 
simple INSERT IGNORE to the correct table will take care of that. I assume 
that you preload your lists in to a temporary table.

INSERT IGNORE subscriber (EmailAlias, EmailAddress, ... other business 
fields ...)
SELECT Name, Address, ... other business values ...
FROM tmpTableNameHere

The primary key prevents you from adding the same address twice so you 
will not change the DoNotMail value if one already exists.

Adding subscribers to groups in bulk means you also need check the 
DoNotMail value so that people who don't want mail, won't get any. 
Assuming the same temporary table and that the subscribers are being added 
to group 9.

INSERT IGNORE subscription (subscriber_ID, group_ID, Unsubscribe)
SELECT s.ID, 9, s.DoNotMail
FROM subscriber s
INNER JOIN tmpTableNameHere tmp
        ON tmp.Email = s.EmailAddress

This adds "blocked" subscribers to a group but pre-sets those that do not 
want mail as "Unsubscribed". The primary key will prevent overwriting any 
subscriber/group combination that already exists.

When anyone wants to change their status to DoNotMail, you will need to do 
only 1 update:

UPDATE subscriber s 
        INNER JOIN subscription sub
        ON s.id = sub.subscriber_ID
SET s.DoNotMail =1, sub.Unsubscribe = 1
WHERE EmailAddress='user@stripped'

Both flags are now set.

Does this help at all?  I think you were confusing your users with your 
list subscribers and making this design much harder to manage than it 
needed to be.

Yours,
Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine






Scott Haneda <lists@stripped> wrote on 07/21/2004 09:36:04 PM:

> I am stuck on how to design a set of tables.  I hope someone can help 
me.  I
> am using mysql 4.0.18-standard.
> 
> A little background, this is a mailing list manager, it support multiple
> account holders.  User_id is the account holders id, this is unique auto 
inc
> and carried across all table to make sure I am only working on one
> particular account holders data.  Email addresses can be "grouped" for
> organizational purposes only, so you could have groups of "friends",
> "family", "enemies".
> 
> My trouble comes in with the dealing of unsubscribes and bounces.  If I
> recipient ever wants to unsubscribe I want to mark them so across all
> groups, but I want to keep the email address on file marked that way so 
that
> they will never ever be emailed again.  Mainly this is for bounces, if 
the
> email address bounces, it is a dead address, I don't ever want to deal 
with
> that address again.
> 
> My first effort at this is with 2 tables, groups and addresses.  If I 
want
> to get all addresses from a group for a particular account, I simply 
SELECT
> email_address from addresses where group = "family" and user_id = '123';
> Inserting unique rows in this scenario is simple as well, but it allows 
for
> a email address to exist in more than one group at a time.  In this
> scenario, when I add more addresses in batch, I use INSERT IGNORE, and 
since
> I have a unique index on the address and group, if the email address 
already
> exists in that group, it maintains the unsub status and happily ignores 
that
> insert, otherwise, it is not unique and it will insert it.
> 
> This is flawed in it is now impossible to globally across all groups 
have a
> email address that can no longer be inserted and always marked as
> unsubscribed.  I could update addresses set status = 0 where user_id = 
'123'
> which would in fact mark all those email addresses unsubscribed, but if 
a
> new group were created, a account holder could potentially get the
> unsubscribed address into the new group.  One thought, would be on the
> creation of a new group, to select all the unsubscribed addresses into 
that
> group, but that gets messy and replication of data is starting to 
happen.
> 
> I think I need to drop the "groups" idea somewhat.  If every address was
> unique within a account holder/user_id, this is a solved problem, but I 
am
> at a loss as to how to still group the addresses both cosmetically as 
well
> as for the purposes of sending emails to to them.  If the group_id was
> treated as a list of id's in a field, I think this would solve my 
problem,
> but I am not able to figure out if mysql has a field type to do what I 
want,
> the "set" does not seem to be it.  If the email address was tied to 
groups
> with multiple id's in one field, this should work:
> User_id     email_address       group       status
> 123         foo@stripped     1,4,6,8     1
> 123         bar@stripped     2,6,8,9     1
> 
> I think I can then sort out my groups with WHERE int in group.  But this
> opens me up to one other issue, I have a upload system where new 
addresses
> are uploaded and imported in with the building up of insert statements. 
In
> short, if the email address does not exist, I need to insert it, that 
part
> is simlpe, the account holder chooses a group they want to upload into 
and
> it would get inserted as SET email_address = x@stripped, group = '5', or
> whatever the group is.  The trouble comes in the conditions that need to 
be
> thought of when a email address already exists, IF status = 0 ignore 
that
> address, we don't want to insert, IF status = 1, since it is already 
there,
> I only want to append/update the group id's to have the additional group
> set.  Of course, there is chance the group id is the same due to account
> holder mis management of addressees, so that needs to be taken into 
account
> as well.
> 
> The other problem is I only want to INSERT the addresses, I do not want 
to
> do any selecting ahead of time, this is too much overhead.  Account 
holders
> will be sloppy, they will add new addresses to their internal address 
list,
> then they will submit the whole batch to me again, I need to only add 
the
> new ones, so these lists will get large, to select that many times is 
just
> too much.
> 
> Any idea on how to best approach this, either with my ideas or a totally 
new
> method are appreciated.
> -- 
> -------------------------------------------------------------
> Scott Haneda                                Tel: 415.898.2602
> http://www.newgeo.com                       Fax: 313.557.5052
> scott@stripped                            Novato, CA U.S.A.
> 
> 
> 
> -- 
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/mysql?unsub=1
> 

Thread
Database design questionsScott Haneda22 Jul
  • Re: Database design questionsSGreen22 Jul