List:General Discussion« Previous MessageNext Message »
From:SGreen Date:October 19 2004 4:21pm
Subject:Re: many fields or many tables? (Understanding DB design)
View as plain text  
Timothy Luoma <lists@stripped> wrote on 10/19/2004 11:11:12 AM:

> On Oct 19, 2004, at 10:17 AM, SGreen@stripped wrote:
> > You have already gone a long way to describing your table structure by 

> > describing your data elements and their relationships to each other. 
> >  Let me try to summarize you descriptions and see if I can show you 
> > how to translate your text descriptions into table descriptions.
> Ok, I'm going to just go through and make sure I'm following your 
> translation (I've never been good at foreign languages, and SQL is 
> apparently no different ;-)
> > 1. There are things called "projects".
> Yes... (FPP and WW) although of course all of the projects fall under 
> one meta-project (TiM), but I guess that's taken care of by the fact 
> that this DB will contain only information about that one meta-project.
> > 2. Some projects have "sub-projects." (I will assume that there is at 
> > most 1 parent project per sub-project)
> Yes. (I am thinking here of FPP1 and FPP2 and FPP3).  So far WW has 
> only one "sub-project" (WW1) but we expect there will be more 
> eventually.

But in reality, the only difference in a project and a sub-project is that 
there is a "parent" to a sub-project. That's why I didn't differentiate 
between them later on.

> > 3. Some projects contain groups.
> I would have said (to use your wording) some "sub-projects" contain 
> "groups"... otherwise I think I'm confused what the difference is 
> between a sub-project and a group.
> I'm thinking of it this way:
> fpp --> fpp1    --> fpp11
>             --> fpp12
>             --> fpp13
>             --> fpp14
>    --> fpp2   --> fpp21
>             --> fpp22
>             --> fpp23
> are you saying that I ought to be thinking of it this way
> fpp --> fpp1
>    --> fpp11
>    --> fpp12
>    --> fpp13
>    --> fpp14
>    --> fpp2
>    --> fpp21
>    --> fpp22
>    --> fpp23

No, you had it right the first time. Remember, the only difference between 
a project and a sub-project is whether or not it has a parent project. It 
has nothing to do with how it's stored in the database.

> > 4. All projects contain people.
> Yes.
> > 5. Some people assigned to projects also belong to one or more groups.
> Here's where it starts to get fuzzy.
> You have to be in one of FPP1 or FPP2 or FPP3 or WW1 (we'll call it WW1 
> even though there isn't a WW2 yet).
> You can be in FPP1 (or FPP2 or FPP3) and WW1.
> If you are in FPP11 you can't be in FPP12 or FPP13 or FPP14 or FPP2 or 
> FPP21 or FPP22 or FPP23.  Those final groups are exclusive and required 
> (if you are in FPP1 then you must be in one and only one of FPP11 or 
> FPP12 or FPP13 or FPP14).

This type of mutual exclusion is currently beyond the realm of MySQL DDL. 
You will have to enforce that business rule in your application code, not 
with the database design.

> I've been thinking about it like students at a university (DB) which 
> has several colleges (projects), and each college has graduates of a 
> particular "class" (sub-project).

I agree but I see it more as a school(university) containing other schools 
(colleges) with each school having zero or more classes (freshman, 
sophomore, etc). 

In this design the top-most school is the university, it has no parent 
schools. The next tier schools are colleges, each of them is the child to 
a university. 

A university-type school has no child "classes". Each college-type school 
as 4 "classes": freshman, sophomore, junior, senior.

You have one "master" project, FPP. There are "child" projects: FPP1 and 
FPP2. Both FPP1 and FPP2 are children to FPP.  FPP1 is the parent to the 
following groups: FPP11,FPP12,FPP13,FPP14. FPP2 is the parent to the 
following groups: FPP21,FPP22,FPP23

However, a "project" is a "project" and a "group" is still a "group". The 
only difference between those and sub-projects and sub-groups are whether 
or not they get "parent"s.

> The university has no current students or alumni who aren't from a 
> particular college and a particular year.
> Some classes (sub-projects) are broken down into further groups...
> > We need to look at #4 before we get to #3. Because a project must have 

> > people, that implies that there is a "person" thing in your system. 
> > Create a table to store information about a "person"
> >
> > CREATE TABLE person (
> >         id int auto_increment primary key,
> >         First Name varchar(20) not null,
> >         Last Name varchar(20) not null,
> > ... more person-related fields and indexes ...
> > )
> Now should that table have all the information about persons (name, 
> address, email.....) ?

This is the "tip" of the "person" iceberg. You can create as complex a 
structure as you need in order to completely document a "person" in your 
system. Analyze your need for "person" information by describing them just 
as you described your project planning needs. However, all of the 
information can be summarized by that one ID from this one table, 

> That was what I was originally thinking needed to be spread out into 
> separate tables.

You probably will. Start with a text description like we did here and your 
"person" tables will probably describe themselves.

> > Now, #4 also states that each project can have 0 or more people 
> > assigned to it. What it didn't say, but should have, was whether each 
> > person can be assigned to more than one project. I know that most 
> > people do work on more than one project at a time or will be assigned 
> > to a new project after the old one is over. You have a many-to-many 
> > relationship between your people and your projects
> It is possible that someone could be in only one project (ever) or that 
> they might be involved in more than one.  We want to leave the door 
> open for the 2nd option, although it will probably be more rare that 
> someone is in more than 1 (however, we already have some and will no 
> doubt have others).

That's why I called it a many-to-many. Each person can belong to many 
projects, each project is composed of many people.

> > By declaring that the COMBINATION of the values person_id and 
> > project_id must be UNIQUE, you guarantee that nobody is assigned to 
> > the same project more than once.
> Ah, so I would assume I could do the same for sub-sub-projects (FPP11, 
> FPP12, etc)?

Yep, just set the parent of a project to be the child of another project 
and you have created a sub-sub-project (grandchild).

> > Looking at #3...I get the impression from the description that each 
> > group is specific to a single project and that each project can have 
> > zero or more groups (a project could just have people that aren't in 
> > any groups or no groups at all).
> The projects (FPP and WW) will always have sub-projects.  People have 
> to belong to at least one project (FPP) and only one sub-project of 
> that project (FPP1 or FPP2 or FPP3).  Further division beyond that is 
> possible *and* if possible it is mandatory that everyone is in one and 
> only one.

Once again, that rule about "belonging to only one of ..." is something 
you will have to enforce in your application code. The database can't do 
it all.

> > You may have noticed that while we have related a "person assigned to 
> > a project" to a "project group" however there is nothing in our data 
> > definitions that will prevent you from assigning a person assigned to 
> > one project to a group assigned to a different project. That bit if 
> > business rule enforcement must come from your application. The 
> > database can do a lot but it won't do everything.
> Ah, so I could theoretically mis-assign someone from WW1 to FPP11.

Yes, you could. That is what the warning was all about.

> > Does this help you get started?
> YES!  Many thanks for your help and your time.  I wasn't even really 
> thinking about the person-to-project aspect as much as the 
> person-to-their-information aspect.
> TjL
> ps - is it common that I found it more helpful to take a pen and a 
> piece of blank paper to outline this than try to do this on the 
> computer?  That doesn't happen to me often... of course it's much 
> harder to share via email :-)

Yes, it is normal to make a paper model of your entity relationships 
BEFORE working on the actual data structures. I encourage everyone 
starting with relational databases to go through this "paper model" stage. 
Not only does it help to organize any thoughts and ideas but it will help 
when it comes time to translate those object relationships into SQL 
structures. How it gets done(text description, ERD, UML, ...) doesn't 
matter. What is important is that you take the time to think it through at 
least once before you start your database development.

Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine
many fields or many tables? (Understanding DB design)Timothy Luoma18 Oct
  • Re: many fields or many tables? (Understanding DB design)SGreen18 Oct
    • Re: many fields or many tables? (Understanding DB design)Brian18 Oct
    • Re: many fields or many tables? (Understanding DB design)Timothy Luoma18 Oct
      • Re: many fields or many tables? (Understanding DB design)Brian19 Oct
      • Re: many fields or many tables? (Understanding DB design)SGreen19 Oct
        • Re: many fields or many tables? (Understanding DB design)Timothy Luoma19 Oct
          • Re: many fields or many tables? (Understanding DB design)SGreen19 Oct