On Mon, Nov 18, 2002 at 05:38:00PM +0200, Egor Egorov wrote:
> Friday, November 15, 2002, 7:25:27 PM, you wrote:
> NN> Assume I have a mysql table (myisam most likely) with a few hundred
> NN> thousand rows in it. One of the columns indicates success or failure.
> NN> 99.9% of the rows will have "0" in that column. But a small number will
> NN> have 1. I need to be able to fetch those rows quickly, without slowing
> NN> everything else down, but ideally without doing a full table scan.
> NN> I can create an index on that column, but I am under the impression that
> NN> this a really bad/slow type of index to create/maintain, since one of
> NN> the values will cover most of the table.
> NN> I'd like to be able to say something like:
> NN> create index failures on dumps(status) where status!=0;
> NN> If the sql query being run isn't compatible with the restriction on the
> NN> index, then it cannot be used. For example, if I query for status=2, it
> NN> would be ok, but status=0 would not be able to use the index. Simpler
> NN> may be to only allow the index to be used if the query contains exactly
> NN> the same restriction. i.e. the "where status !=0" index could only be
> NN> used if I had "status != 0" in my select query.
> NN> Or alternatively, if you can suggest some other means for accomplishing
> NN> this efficiently...
> NN> (Yes, I know I can make a temporary or results table updated
> NN> periodically, which I will likely do in the meantime, but would be nice
> NN> to have an efficient way of accomplishing this with live data.)
> If I've got you right status can have values 0 or 1. In this case
> you can just use " SELECT ... WHERE status=1 .." (index wil be used)
> or "SELECT .. WHERE status=0 .." (index will not be used, because
> scan the whole table will be faster to retrieve 99,9% of rows)
> depends on what you want to get.
I'd like to second Nathan's request.
Just because MySQL is smart enough to not use an index when 99% of the
rows would match doesn't mean that this is an unnecessary request.
It'd be a great optimization it MySQL could "know" not to bother
indexing those records. It'd save a lot of space and CPU time on
larger data sets.
Jeremy D. Zawodny | Perl, Web, MySQL, Linux Magazine, Yahoo!
<Jeremy@stripped> | http://jeremy.zawodny.com/
MySQL 3.23.51: up 104 days, processed 2,259,137,221 queries (250/sec. avg)