List:General Discussion« Previous MessageNext Message »
From:Peter Grigor Date:February 18 2003 3:28pm
Subject:Re: Slow retrieval of distinct on indexed fields
View as plain text  
I've always thought that adding a count of index entries to the index file
would be helpful.

Like (generic index entry): word:15:1,2,5,7,etc...

where word is the indexed value, 15 is the number of times it appears, and
1,2,5,7 is the record offsets (or byte offsets for dynamic stuff) of the
appearances of the indexed value.

This way, getting a count of indexed values would be quick, and scanning the
index would no longer be necessary, because the count (15) would tell you
how many index entries to skip to get the next indexed value in the index.

Peter
<^_^>
---------------------------------------------
Peter Grigor
Hoobly Free Classifieds
http://www.hoobly.com


----- Original Message -----
From: "Dan Nelson" <dnelson@stripped>
To: "Renars Jeromans" <renars.jeromans@stripped>
Cc: <mysql@stripped>
Sent: Tuesday, February 18, 2003 10:14 AM
Subject: Re: Slow retrieval of distinct on indexed fields


> In the last episode (Feb 18), Renars Jeromans said:
> > This is not an urgent problem, but it has always intrigued me... It's
> > simplified case, but it makes the point. Let us assume that we have a
> > table
> >
> > create table T (id unsigned int unique, name char(10));
> > create index i_name on T(name);
> >
> > Let's insert into it say 5 mil rows with name field being just say 5
> > distinct values AAA, BBB, CCC, DDD, EEE.
> >
> > Now the question. Why a query like
> >
> > select distinct name from T;
> >
> > takes about 3 sec to return just a bunch of rows? As I understand it,
> > index contains all these 5 values, so just simple lookup into index
> > should take fractions of a second. Can anyone comment on this? MySQL
> > team?
>
> You need to walk the entire index to make sure you have all the values.
> There might be a single "AAB" inbetween those million "AAA"'s and
> million "BBB"'s.
>
> It requires a full index scan, which is usually a lot faster than a
> full table scan, but it is still not instantaneous.  In your example it
> won't be any quicker since your table only has two rows, but in the
> usual case where a row may be a couple hundred bytes long there is a
> bigger difference.
>
> --
> Dan Nelson
> dnelson@stripped
>
> ---------------------------------------------------------------------
> Before posting, please check:
>    http://www.mysql.com/manual.php   (the manual)
>    http://lists.mysql.com/           (the list archive)
>
> To request this thread, e-mail <mysql-thread132696@stripped>
> To unsubscribe, e-mail
<mysql-unsubscribe-pgrigor=hoobly.com@stripped>
> Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php
>
>

Thread
Slow retrieval of distinct on indexed fieldsRenars Jeromans18 Feb
  • Re: Slow retrieval of distinct on indexed fieldsDan Nelson18 Feb
    • Re: Slow retrieval of distinct on indexed fields [OT?]Michael T. Babcock18 Feb
  • Re: Slow retrieval of distinct on indexed fieldsPeter Grigor18 Feb
    • Re: Slow retrieval of distinct on indexed fieldsDan Nelson18 Feb
  • Re: Slow retrieval of distinct on indexed fieldsPeter Grigor18 Feb