List: General Discussion
From: Steve Rapaport
Date: February 8 2002 10:11am
Subject: Re: Distributed Fulltext?

I said:
> > Why is it that Altavista can index terabytes overnight and return
> > a fulltext boolean for the WHOLE WEB
> > within a second, and Mysql takes so long?

On Friday 08 February 2002 08:56, Vincent Stoessel wrote:

> Apples and oranges.

Yeah, I know.  But let's see if we can make some distinctions.
If, say, Google can search 2 trillion web pages, averaging say 70k bytes 
each, in 1 second, and MySQL can search 22 million records, with an index
on 40 bytes each, in 3 seconds (my experience) on a good day,
what's the order of magnitude difference?  Roughly 10^9.
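
To make the back-of-envelope estimate explicit, here is a minimal sketch that just multiplies out the figures quoted above. It assumes "70k" means 70,000 bytes and compares raw bytes covered per second, which is only a crude stand-in for real search cost; the variable and function names are mine, for illustration only.

```python
import math

# Figures as quoted in the message (rough guesses, not measurements).
GOOGLE_PAGES = 2e12            # "2 trillion web pages"
GOOGLE_BYTES_PER_PAGE = 70e3   # "70k bytes each" (assumed to mean 70,000)
GOOGLE_SECONDS = 1.0

MYSQL_ROWS = 22e6              # "22 million records"
MYSQL_BYTES_PER_ROW = 40       # "an index on 40 bytes each"
MYSQL_SECONDS = 3.0


def bytes_per_second(items, bytes_per_item, seconds):
    """Bytes of text covered by one search, divided by the search time."""
    return items * bytes_per_item / seconds


google_rate = bytes_per_second(GOOGLE_PAGES, GOOGLE_BYTES_PER_PAGE, GOOGLE_SECONDS)
mysql_rate = bytes_per_second(MYSQL_ROWS, MYSQL_BYTES_PER_ROW, MYSQL_SECONDS)
ratio = google_rate / mysql_rate

print(f"ratio = {ratio:.2e}, log10 = {math.log10(ratio):.1f}")
```

With these inputs the ratio comes out a little under 5 x 10^8, so "roughly 10^9" rounds up by about a factor of two, which does not change the argument.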

> Have you seen the /hardware/ they run that enterprise with?
Irrelevant: you're unlikely to get 9 orders of magnitude difference with
faster hardware or even with clustering.

> Also, their software is optimized for full text searches and that
> is /all/ they do. Mysql is an SQL database and is optimized as such.

Absolutely granted. You are completely right. 
And I don't expect the data format to change.

 BUT:  thought experiment:

When MySQL generates a FULLTEXT index, it writes it to
an index file, the .MYI file, which can have ANY FORMAT
it wants.  If the .MYI format is poorly optimized for fulltext
searches, the developers can improve the format.  They can even introduce
a new index file type, .MYF, solely for optimizing fulltext searches.

None of this need have any impact on the data file format or the 
SQL search optimizations, and yet it could still improve the search
speed for fulltext.  It might not help as much for the slow indexing,
but it could certainly improve the performance of the search.
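
As a rough illustration of the kind of structure a search-only index could hold, here is a minimal in-memory inverted index with boolean AND lookup. This is a sketch under my own assumptions, not MySQL's actual .MYI layout or any proposed .MYF format; the function names and sample rows are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(rows):
    """Map each word to the set of row ids whose text contains it."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for word in tokenize(text):
            index[word].add(row_id)
    return index


def search_all(index, query):
    """Boolean AND: return the ids of rows containing every query word."""
    postings = [index.get(word, set()) for word in tokenize(query)]
    if not postings:
        return set()
    # Intersect the shortest posting sets first so the result shrinks quickly.
    postings.sort(key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return result


# Hypothetical sample data standing in for rows of the data file.
rows = {
    1: "Distributed fulltext search in MySQL",
    2: "Fulltext index format",
    3: "SQL search optimizations",
}
index = build_index(rows)
print(sorted(search_all(index, "fulltext search")))  # prints [1]
```

The point of the sketch is that the lookup only touches the word-to-rows map, never the row data itself, so the index layout can change without affecting the data file.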

Thinking out loud...
Steve

