On Fri, 9 Jul 2004 14:45:41 +0200, Jigal van Hemert <jigal@stripped> wrote:
> From: "Martijn Tonies" <m.tonies@stripped>
>> Design for understanding, logic and maintenance, not performance.
>> If you need more performance, throw more hardware at it -
>> a larger cache (settings -> memory), faster disks and a faster CPU.
> Sorry, but I can't agree with you. Years ago I had to put the DMOZ
> (http://www.dmoz.org/) database (2 million records, 100,000 or so categories
> at the time) in a MySQL database. Next we had to calculate the number of
> sites in a certain category. The 'path' to the category was known, but a
> regexp was needed to select the path of one level up. The query took > 30
> After adding a column for "one level up", adding indexes, optimizing the
> query it took only a few hundreds of seconds.
Maybe I misunderstand the problem, but I get the impression you have
the category computers>internet>providers>adsl and you want to count
everything in computers>internet>providers. Isn't that just a BETWEEN
'computers>internet>providers' AND 'computers>internet>providers>z'
which a B+tree is supposed to handle just fine?
> I really don't know how much hardware you would like to use to get these
For DMOZ data I don't want hardware, I want GiST indexes: