I am trying to speed up the full text searching on my MySQL database. I currently have one
table that has approximately 90,000 rows added to it each 24 hour period in batches of up
to 200 rows constantly. These rows are will be kept for approximately 60 days, however I
am currently working with test data consisting of about 7 days worth of data (630,000
The fulltext index is over two fields and I have 7 other indexes, each over a single
field. Queries are always done over at least the fulltext fields and one other indexed
field. I am using MySQL verison 4.0.17-standard-log (binary from mysql.com) on Red Hat
9.0 Linux 2.4.20-16.9 #1 on an Intel 2.6GHz with 2GB RAM. I have read a considerable
amount of the MySQL book and searched on the FAQ lists and on google for some ideas and
here's what I have tried to speed up the searches so far....
Initially test searches would return a matching result set of 7,000 rows in approximately
12 secs (longer if inserts are being made on the table). The test query looks like:
SELECT DISTINCT indexed_field_1 FROM table WHERE (MATCH
(fulltext_field_one,fulltext_field_two) AGAINST ('test words' IN BOOLEAN MODE)) AND
indexed_field_2 = 'aword' ORDER BY indexed_field_3 DESC
Firstly I tried allocating more memory to the MySQL server according to the advice in the
MySQL book, however this did not improve the response times. I then tried putting a LIMIT
10 clause in the statement as the user only wants to look at a small range of the results
(in my testing 10 rows). This improved the query time to about 6 secs.
I then talked to another person using a similar sized database with similar indexes and
queries and they were managing to get results back in 2 - 3 secs. I looked at their
queries and saw that they queries lack an ORDER BY clause, i.e.
SELECT DISTINCT field_1 FROM table WHERE (MATCH (fulltext_field_one,fulltext_field_two)
AGAINST ('test words' IN BOOLEAN MODE)) AND field_2 = 'aword' LIMIT 10
so I tried this on my database.... and my queries came back in under 1 second. This was
great, and I am guessing is down to the lack of sorting required by the ORDER BY clause.
I then tried to think of a way to store the data in such a way as an ORDER BY would not
be needed and so I tried storing the data with the PRIMARY KEY as the field I always
order by. (btw.. the field is a date field). I had read on the MySQL site that the
PRIMARY KEY is a clustered key and therefore the data rows are stored in the order of the
After recreating the table with the ORDER BY field now as the PRIMARY KEY I tried the
query again with out the ORDER BY clause but the rows seem to come out in the order that
they were inserted into the table. This means that I am always getting the earliest
inserted 10 rows that match the query rather than the latest inserted 10 rows which is
what my users want. Maybe I misread/misunderstood the text I found on the MySQL site
about the clustered index/primary key field.
If anyone has any ideas on how I can try to avoid the need of the ORDER BY clause, yet
always get the data results ordered in a certain manner ( I know that this seems like a
contradiction! ) by doing something with the way the data set is stored, or in fact any
other way possible that would be great. Other comments/ideas on where I may be going
wrong or other options I have not appeared to have considered would be great as well.
Thanks in advance for all of you who take the time to read this and those who are kind
enough to reply.
|• Speeding up queries by avoiding ORDER BY clauses||Tom Hibbert||12 Jan|