List:General Discussion« Previous MessageNext Message »
From:Siegfried Heintze Date:July 25 2005 11:42pm
Subject:RE: Alternatives to performing join on normalized joins?
View as plain text  
Shawn (and anyone else who will listen):

 

I'm already running out of RAM (actually, virtual memory/page file space)
just trying to display all the job titles without even joining them with
anything. I have to use a LIKE clause to just get a portion of them.

 

So, I could:

(1)     Have multiple database connections going concurrently where the
first one joins everything except the keywords. As I'm iterating thru the
first result set with the fetch function, I could get a list of keyword
foreign keys for each row with a second database connect and store this in a
second result set. Is this a common approach? Are secondary database
connections cheap?

(2)     I could try to store the first join in a hashmap first and then
iterate but I've already demonstrated that the hashmap takes too much
memory.

(3)     I could create a new column of type string for each job title. This
would contain a comma separated list of integer foreign keys for the
keywords. This is the non-normalized option and you discouraged this
approach.

(4)     I could have a very wide result set. Let assume I have a jobtitle
(joined with a job posting and company) with 26 keywords. That means 26 rows
in the result set are identical except the keyword foreign key (fk) column.
I have to then insert the logic to detect the fact that everything except
the keyword fk column is identical. Are you advocating this approach? It
seems like it requires a lot of computer space and computer time and (worst
of all) my time. I believe this is the classical approach, however.

 

Which would you choose?

 

Thanks,

 

Siegfried

 

  _____  

From: SGreen@stripped [mailto:SGreen@stripped] 
Sent: Monday, July 25, 2005 9:31 AM
To: Siegfried Heintze
Cc: mysql@stripped
Subject: Re: Alternatives to performing join on normalized joins?

 



"Siegfried Heintze" <siegfried@stripped> wrote on 07/24/2005 11:35:36 AM:

> I have a large number of job titles (40K). Each job title has multiple
> keywords making a one-to-many parent-child relationship.
> 
> If I join job title with company name, address, company url, company city,
> job name, job location, job url (etc...) I have a mighty wide result set
> that will be repeated for each keyword.
> 
> What I have done in the past (in a different, much smaller, application)
is
> perform a join of everything except the keyword and store everything in a
> hashmap. 
> 
> Then I iterate thru each wide row in the hashmap and perform a separate
> SELECT statement foreach row in this hashmap to fetch the multiple
keywords.
> 
> Whew! That would be a lot of RAM (and paging) for this application.
> 
> Are there any other more efficient approaches?
> 
> Thanks,
> Siegfried
> 
> 

There are two major classes of efficiency when dealing with any RDBMS: time
efficiency (faster results), space efficiency (stored data takes less room
on the disk). Which one are you worried about? 

If it were me, I would start with all of the data normalized: 
        * a Companies table (name, address, url, city, etc) 
        * a Job Titles table (a list of names) 
        * a Keywords table (a list of words used to describe Job Titles) 
        * a JobPosting table ( Relates Companies to Job Titles. Should also
be used to track things like dateposted, dateclosed, salary offered, etc.) 
        * a Postings_Keywords table (matches a Posting to multiple
Keywords). 

I would only denormalize if testing showed a dramatic improvement in
performance by doing so. I would think that the Job Title to Keyword
relationship would be different between Companies. One company posting for a
"Programmer" may want VB while another wants PHP and PERL. By associating
the Keywords with a Posting (and not just the Job Title), you can make that
list Company-specific. 


Shawn Green
Database Administrator
Unimin Corporation - Spruce Pine 







Thread
missing ibd files on upgrade of mysql from 4.1.9 to 4.1.12aJason Pyeron4 Jul
  • Re: missing ibd files on upgrade of mysql from 4.1.9 to 4.1.12aGleb Paharenko4 Jul
    • Re: missing ibd files on upgrade of mysql from 4.1.9 to 4.1.12aJason Pyeron4 Jul
      • Re: missing ibd files on upgrade of mysql from 4.1.9 to 4.1.12aGleb Paharenko5 Jul
        • Re: missing ibd files on upgrade of mysql from 4.1.9 to 4.1.12aJason Pyeron5 Jul
        • Re: missing ibd files on upgrade of mysql from 4.1.9 to 4.1.12aJason Pyeron5 Jul
      • Documentation on Like clauseSiegfried Heintze22 Jul
        • Re: Documentation on Like clausePaul DuBois23 Jul
          • How to use Like Clause in Perl? Works fine in MySQL control center!Siegfried Heintze23 Jul
            • Re: How to use Like Clause in Perl? Works fine in MySQLcontrol center!Frank Bax23 Jul
              • Alternatives to performing join on normalized joins?Siegfried Heintze24 Jul
                • Re: Alternatives to performing join on normalized joins?SGreen25 Jul
                  • RE: Alternatives to performing join on normalized joins?Siegfried Heintze26 Jul
                    • How to select first 1000 records like MySQL Control Center 0.9?Siegfried Heintze26 Jul
                      • Re: How to select first 1000 records like MySQL Control Center 0.9?Peter Brawley26 Jul
            • Re: How to use Like Clause in Perl? Works fine in MySQL control center!Jeremiah Gowdy25 Jul
RE: How to use Like Clause in Perl? Works fine in MySQL control center!John Trammell25 Jul
  • Where did my disk space go?Siegfried Heintze4 Aug
    • Re: Where did my disk space go?Sebastian4 Aug
      • RE: Where did my disk space go?Siegfried Heintze4 Aug
  • Re: How to use Like Clause in Perl? Works fine in MySQL control center!Nuno Pereira8 Aug
RE: Where did my disk space go?SST - Adelaide)4 Aug
  • RE: Where did my disk space go?Siegfried Heintze4 Aug
    • RE: Where did my disk space go?David Logan4 Aug
      • How to delete log files? Was: RE: Where did my disk space go?Siegfried Heintze5 Aug
        • Re: How to delete log files? Was: RE: Where did my disk space go?Gleb Paharenko5 Aug
RE: How to delete log files? Was: RE: Where did my disk space go?SST - Adelaide)5 Aug
  • RE: How to delete log files? Was: RE: Where did my disk space go?Ron Thomas5 Aug
RE: How to use Like Clause in Perl? Works fine in MySQL control center!John Trammell8 Aug