Yonghua.Wu@stripped wrote:
>
> Hi Andy,
>
> Two questions for you:
> 1) Why store so many repeatable data in the same table?
> I mean you needn't to store "purchases"/"previews" for all records.
> You can choose MySql SET datatype, or you can choose another table to
> store the action types and let the original table refers to it.
> So that you can reduce the diskspace and also the I/O.
Hmmm... not sure if the SET datatype is portable
across databases, is it?
Also, remember there will only be 2 action
types: 'purchase' and 'preview'.
Name (of visitor) will actually be an integer id,
and I could use a bool or an int for 'action' also,
since it is intended to be 2 valued. But a diff
of 1 or 2MB of storage is negligible (around extra
9 bytes per row for the 'action' field), and I want
the flexibility and readability of using the char
field.
The important concept here is the TIMESTAMP, each
access/visit needs to be recorded separately and
thus cannot be normalized away. So if there are
a hundred thousand visits, there NEEDS to be a
hundred thousand rows.
I guess the question probably boils down to:
"If a field (of whatever type) will only ever have
2 values, will indexing it bring benefits when said
field is being refered to in a WHERE clause for a
very large table?"
-------------------------------------------------
I just realized I could do 2 tables, one that
stores purchase 'visits' and another that stores
'preview' visits, but that complicates the design
and I don't think that's a good idea. Hundreds
of thousands of rows is not that hard for MySQL
to deal with, but what I want to understand is
if I make an index for 'action', will the performance
diff be on the order of, say, a query taking 2
seconds versus 0.5 seconds?