On 04/17/2007 04:18 AM, Fabian Köhler wrote:
> Thanks for all the input. The problem i have with this idea:
>
>
>> Really? Wow, my opinion is that you're trying to do in one table what
>> you should do in two. Have a questions table, and an answers table. The
>> answers table would have a column specifying which question they belong
>> to. i.e.
>>
>> QUESTIONS
>> int autoincrement questionID
>> question
>>
>> ANSWERS
>> int autoincrement answerID
>> int questionID
>> answer
>>
>
> is, that
> 1) i must define the column with the value of the answer as a TEXT to
> cover all possible answers, altough a SMALLINT might fit it better. but
> i can live with that
>
Not a big deal. Any performance increase will be marginal, and the ease
of having a standard column type in my opinion outweighs any performance
decrease.
> 2) i think it's slow when i have lots of data. i'm currently not at home
> to do performance tests, but let's assume i have 2000 questions, then i
> collect by user input 2000 answers to every question by 10000 different
> users (a realistic scenario for our application) and store it in a table
> like this:
>
>
> ANSWERS
> int autoincrement answerID
> int userID
> int questionID
> text answer
>
> i then have 2000*10000 = 20 mill. rows in ANSWERS. now let's assume i
> want to select all answers to all questions for 100 random users at
> once. The only solution i see here for one query is a large join which
> will take really, really long to get the data even with proper indexes
> or did i do sth. wrong?
>
>
How is this not a processor intensive operation no matter what? If
you're collecting this much data, you have to deal with that much data,
the key is to arrange is to that the database knows something about the
relationships between the data (questions * answers) in this case so
that you can generate efficient queries.
> is there a way to provide a view on these two tables (QUESTIONS and
> ANSWERS) that simulate the original table i intended, which is actually
> fast even with large amounts of data?
>
Yes, with either a VIEW or a JOIN.
> Are other dbs like oracle better on doing joins on such large tables?
>
>
I can't speak for MySQL 5.0 because it's been a while since I've done
performance tests, but the old 3.x versions of MySQL would perform much
better than oracle with simple joins.
> So i really like the idea of having one large table with many columns
> because it's really fast and i can define each column properly but i
> currently see no option to go this way but with that way there will
> alway be a limit to the max. amount of answers in the system.
>
>
If you're doing that, why not just make a flat-file database of the
answers? That would be much faster still than dealing with the added
overhead of a SQL server. You're ignoring the advantages of having a SQL
setup in this case anyhow.
-Micah