List:General Discussion« Previous MessageNext Message »
From:Ilya Kazakevich Date:February 10 2010 8:30pm
Subject:RE: how things get messed up
View as plain text  
Hello John, 

There was a joke in russian PHP club: "why strore images in database? Are
you going to have a full text search on them?"

IMHO:
Storing files in DB is probably bad idea.
Here are some advantages of storing files on filesystem:
1) Theoretically it takes more time to load data from database then to load
it directly from filesystem if data is huge (we are not speaking about
databases, installed on raw disks)
2) while storing files on fs, you can access them via web server and ftp and
network fs and what-ever-you-like. Almost all servers for your platform
supports filesystem, but not database.
Yes, its possible to create script to access data from db via web, but you
have to create it and support HTTP cache (modified-sience, e-tag and so on)
and you also need to store modification date in db (to make this cache work)
and so on. If you store files as files -- you have filesystem and servers to
access them. If you use db as filesystem -- you should create servers by
yourself. Looks like invention of wheel.
3) Tables with BLOB are not so flexible, i.e. HEAP tables do not support
such fields. So, in some situation you'll have "filesort" that you cannt
fix.
4) You can not easly do "SELECT *" from table with 10.000 rows if each row
contains field with 200Kb pdf file. 

And there is only one disadvantage: consistency. When deleting user -- all
her data could be deleted automatically using foreign keys, but you have to
do it in your app if files are stored externally.

If somebody knows more reasons to store files in DB -- post it here, please.
It would be interesting. Thanks.

Ilya.

-----Original Message-----
From: John G. Heim [mailto:jheim@stripped] 
Sent: Wednesday, February 10, 2010 10:09 PM
To: mysql@stripped
Subject: how things get messed up

About 5 years ago, I was asked to write a php app for my department. The app
keeps track of graduate school applicants to my department at the
university. The main data elements are the scores each professor gives to
each applicant. There are only about 400 applicants each year so even with
all the personal data, scores, transcripts,  etc for each student, it's not
much. for the first 2 years, it was under a meg of data. Well, then the
selection committee asked me to add something so that if a student e-mailed
the department a document, say a paper he'd written or a photo of himself,
or whatever, it could be tacked on to the info they saw about him while
grading the applicant.

So I said, "Well, there is only going to be maybe 10 or 20 of those a year. 
And even if all 400 applicants submit a PDF of a paper they'd written, it
would be only 400 docs. 4,000 after 10 years. Yeah, lets just create a
documents table in the database and store them in mysql."

For the first 2 years, only 2 students sent in documents to attach to their
application. I figured I'd wasted my time. Then the next year, the graduate
school changed their  web application form to allow students to upload
documents. "Fine," I said, "My worst case scenario has already come true. 
But, well, this is why you plan for the worst case."

Then they started taking letters of recommendation as PDF documents. In
fact, they started requiring PDF docs. Each student has 3 to 6 letters of
recommendation. All in all, I figure we're at about 100 times as many docs
in our database as I originally expected and about 10x my worst case
scenario.

I should either be fired or shot.  Maybe fired *then* shot. Actually, its
not as bad as all that. I can pretty easily write a perl script to export
the docs to files and access them via a network mounted filesystem. After
all, saving myself 5 hours of work 5 years ago is worth what? -- maybe
10hours today? It is amazing how often quick & dirty turns out just being
dirty in the end.




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:
http://lists.mysql.com/mysql?unsub=1

Thread
how things get messed upJohn G. Heim10 Feb
  • Re: how things get messed upBuford Tannen10 Feb
  • RE: how things get messed upIlya Kazakevich10 Feb
    • Re: how things get messed upCui Shijun11 Feb
      • RE: how things get messed upJerry Schwartz11 Feb
        • Re: how things get messed upJohan De Meersman12 Feb
    • Re: how things get messed upJoerg Bruehe11 Feb
    • Re: how things get messed upJochem van Dieten11 Feb
  • Re: how things get messed upMartijn Tonies11 Feb
    • Re: how things get messed upDavid Giragosian11 Feb
  • RE: how things get messed upJerry Schwartz11 Feb
  • Re: how things get messed upMartijn Tonies11 Feb
    • RE: how things get messed upJerry Schwartz11 Feb
    • Re: how things get messed upAnn W. Harrison12 Feb
  • Re: how things get messed upMartijn Tonies12 Feb
    • Re: how things get messed upJohan De Meersman12 Feb
  • Re: how things get messed upMartijn Tonies12 Feb
    • Re: how things get messed upJohan De Meersman12 Feb
      • Re: how things get messed upVikram A12 Feb
        • Re: how things get messed upJohan De Meersman12 Feb
          • Re: how things get messed upVikram A12 Feb
            • RE: how things get messed upJerry Schwartz16 Feb
              • RE: how things get messed upMartin Gainty16 Feb
                • Re: how things get messed upBarry Leslie17 Feb
              • Re: how things get messed upVikram A18 Feb
                • RE: how things get messed upJerry Schwartz18 Feb
                  • Re: how things get messed upJohan De Meersman18 Feb
  • Re: how things get messed upMartijn Tonies15 Feb
  • Re: how things get messed upMartijn Tonies15 Feb
    • Re: how things get messed upJohan De Meersman16 Feb
      • Re: how things get messed upPaul McCullagh17 Feb
  • Re: how things get messed upMartijn Tonies15 Feb
    • Re: how things get messed upAnn W. Harrison15 Feb
      • Re: how things get messed upJohan De Meersman16 Feb
      • RE: how things get messed upJerry Schwartz16 Feb
  • Re: how things get messed upMartijn Tonies18 Feb
RE: how things get messed upJerry Schwartz22 Feb
  • Re: how things get messed upVikram A24 Feb