List:General Discussion« Previous MessageNext Message »
From:Alex Krohn Date:January 23 2001 9:12pm
Subject:fulltext search performance
View as plain text  
Hi,

Has anyone used the fulltext indexes in 3.23 and can share their
experiences? I've only done limited testing, but was quite disapointed
with performance.

We have an import of DMoz data into a table with Title, URL and
Description fields as a fulltext index. There are 2.5 million rows that
takes up about 600 megs of data.

A search ordered by relevancy takes about 10 seconds which isn't
acceptable:

mysql> select ID, Title, match(Title,URL,Description) AGAINST ('gossamer threads') as x
>         
mysql> from Links having x order by x desc limit 5;
+--------+-----------------------------+-----------------+
| ID     | Title                       | x               |
+--------+-----------------------------+-----------------+
| 534518 | Gossamer Threads            | 26.504848142061 |
| 529654 | Gossamer Threads            | 25.362285817498 |
| 347215 | Gossamer Threads            | 24.619930445728 |
| 303514 | Gossamer Threads Scripts    | 20.181361130953 |
| 223327 | The Gossamer Project- Fluky | 17.176138340563 |
+--------+-----------------------------+-----------------+
5 rows in set (10.05 sec)

Worse still, it doesn't cache well, and it always takes 10 seconds.

If I turn off the order by it works well, the first hit is slow, but
future searches are < 1 second. However, as you can see the results are
pretty poor.

mysql> select ID, Title, match(Title,URL,Description) AGAINST ('gossamer threads') as x
> 
mysql> from Links having x limit 5;     
+-------+-------------------------------------------+-----------------+
| ID    | Title                                     | x               |
+-------+-------------------------------------------+-----------------+
| 78206 | Mussel Man                                | 6.4472288734643 |
| 81368 | The Blacksmiths Compendium                | 5.7507258841122 |
| 81519 | Jill Cater Nixon's The World of Blackwork | 6.5658374638893 |
| 81544 | Caron Collection Threads                  | 13.784135416084 |
| 81646 | Salt &amp; Pepper Cross Stitch            | 6.1446464319279 |
+-------+-------------------------------------------+-----------------+
5 rows in set (0.44 sec)

Is there anything I can be doing to improve this? I'm running 3.23.28
on a dual piii-800 with 1gig of ram and a u160 scsi disk. We have built
our own custom search routines that do full text altavista style
searches using two internal tables:

mysql> describe Links_Word_List;
+-----------+------------------+------+-----+---------+----------------+---------------------------------+
| Field     | Type             | Null | Key | Default | Extra          | Privileges       
              |
+-----------+------------------+------+-----+---------+----------------+---------------------------------+
| Word_ID   | int(10) unsigned |      | PRI | NULL    | auto_increment |
select,insert,update,references |
| Word      | varchar(50)      |      | UNI |         |                |
select,insert,update,references |
| Frequency | int(11)          |      |     | 0       |                |
select,insert,update,references |
+-----------+------------------+------+-----+---------+----------------+---------------------------------+
3 rows in set (0.00 sec)

mysql> describe Links_Score_List;    
+----------+------------------+------+-----+---------+-------+---------------------------------+
| Field    | Type             | Null | Key | Default | Extra | Privileges                 
    |
+----------+------------------+------+-----+---------+-------+---------------------------------+
| Word_ID  | int(10) unsigned |      | MUL | 0       |       |
select,insert,update,references |
| Item_ID  | int(10) unsigned |      | MUL | 0       |       |
select,insert,update,references |
| Score    | int(11)          |      |     | 0       |       |
select,insert,update,references |
| Position | int(11)          |      |     | 0       |       |
select,insert,update,references |
+----------+------------------+------+-----+---------+-------+---------------------------------+
4 rows in set (0.00 sec)

and the same searches come back in less then 3 seconds the first hit and
under 1 second thereafter (ordered by relevancy).

I'd love to hear any comments about this..

Cheers,

Alex
Thread
fulltext search performanceAlex Krohn23 Jan
  • RE: fulltext search performancePaul Rydell23 Jan
  • Re: fulltext search performanceSergei Golubchik24 Jan