List:General Discussion« Previous MessageNext Message »
From:Peter Grigor Date:February 18 2003 5:03pm
Subject:Re: Slow retrieval of distinct on indexed fields
View as plain text  
I grant you that, however my point was about the total count of the indexed
word's appearance, which would still stay the same, regardless of the
encoding of the position information.

Peter
<^_^>
---------------------------------------------
Peter Grigor
Hoobly Free Classifieds
http://www.hoobly.com


----- Original Message -----
From: "Dan Nelson" <dnelson@stripped>
To: "Peter Grigor" <pgrigor@stripped>
Cc: "Renars Jeromans" <renars.jeromans@stripped>; <mysql@stripped>
Sent: Tuesday, February 18, 2003 11:12 AM
Subject: Re: Slow retrieval of distinct on indexed fields


> In the last episode (Feb 18), Peter Grigor said:
> > I've always thought that adding a count of index entries to the index
> > file would be helpful.
> >
> > Like (generic index entry): word:15:1,2,5,7,etc...
> >
> > where word is the indexed value, 15 is the number of times it
> > appears, and 1,2,5,7 is the record offsets (or byte offsets for
> > dynamic stuff) of the appearances of the indexed value.
> >
> > This way, getting a count of indexed values would be quick, and
> > scanning the index would no longer be necessary, because the count
> > (15) would tell you how many index entries to skip to get the next
> > indexed value in the index.
>
> What you describe above is similar to a bitmap index; they trade off a
> more complicated encoding scheme for extremely fast counts.  Storing
> the record offset is sort of inefficient; that's 4 (8?) bytes per
> record.  If you either store a bitmap of rows that that value occurs in
> (i.e. 1100101 in your example), or delta record/byte offsets
> (1,+1,+3,+2,0), and then compress those, you can shrink your indexes to
> an incredibly small size.  Compressed indexes take a lot longer to
> update, though.
>
> See
http://lists.mysql.com/cgi-ez/ezmlm-cgi?1:mss:125271:200211:bmokiphdlmibimfl
jjkf
> for a description of how I would implement bitmap indexes in MySQL .
>
> --
> Dan Nelson
> dnelson@stripped
>

Thread
Slow retrieval of distinct on indexed fieldsRenars Jeromans18 Feb
  • Re: Slow retrieval of distinct on indexed fieldsDan Nelson18 Feb
    • Re: Slow retrieval of distinct on indexed fields [OT?]Michael T. Babcock18 Feb
  • Re: Slow retrieval of distinct on indexed fieldsPeter Grigor18 Feb
    • Re: Slow retrieval of distinct on indexed fieldsDan Nelson18 Feb
  • Re: Slow retrieval of distinct on indexed fieldsPeter Grigor18 Feb