I'm not a wizard on the MySQL optimizer, but here's my go...
> > Two questions for you:
> > 1) Why store so many repeatable data in the same table?
> > I mean you needn't to store "purchases"/"previews" for all records.
> > You can choose MySql SET datatype, or you can choose another table to
> > store the action types and let the original table refers to it.
> > So that you can reduce the diskspace and also the I/O.
> Hmmm... not sure if the SET datatype is portable
> across databases, is it?
> Also, remember there will only be 2 action
> types: 'purchase' and 'preview'.
> Name (of visitor) will actually be an integer id,
> and I could use a bool or an int for 'action' also,
> since it is intended to be 2 valued. But a diff
> of 1 or 2MB of storage is negligible (around extra
> 9 bytes per row for the 'action' field), and I want
> the flexibility and readability of using the char
> The important concept here is the TIMESTAMP, each
> access/visit needs to be recorded separately and
> thus cannot be normalized away. So if there are
> a hundred thousand visits, there NEEDS to be a
> hundred thousand rows.
> I guess the question probably boils down to:
> "If a field (of whatever type) will only ever have
> 2 values, will indexing it bring benefits when said
> field is being refered to in a WHERE clause for a
> very large table?"
I guess it depends on what value you are querying for :)
out of 1,000,000 rows, if there's 1000 with a value
of "preview", it should be able to use the index and
fetch the 1000 rows. The other way around, meaning
999,000 rows, it should not use the index, as it wouldn't
But hey, that's up to MySQL to decide.
The easiest thing to do is to create a table with a lot
of rows, try your queries with and without indices and
read the query plan.
Database Workbench - development tool for MySQL, and more!
Database development questions? Check the forum!
> I just realized I could do 2 tables, one that
> stores purchase 'visits' and another that stores
> 'preview' visits, but that complicates the design
> and I don't think that's a good idea. Hundreds
> of thousands of rows is not that hard for MySQL
> to deal with, but what I want to understand is
> if I make an index for 'action', will the performance
> diff be on the order of, say, a query taking 2
> seconds versus 0.5 seconds?