List:General Discussion« Previous MessageNext Message »
From:Michael Widenius Date:November 12 1999 11:57pm
Subject:Re: Fastest MySQL version?
View as plain text  
>>>>> "jpabuyer" == jpabuyer  <jpabuyer@stripped> writes:

>> very bad way. There are also people who have tables with millions of
>> records and have no performance problems at all.
>> 
>> If you have problem, then post DB structures and query here and we try
>> to help yo find better solution.   

jpabuyer> Well :).. it wasn't me who began this, but I might get benefited from this
jpabuyer> :).. ok... here is it:

jpabuyer> This is one of the most common queries here:

jpabuyer> select main.id_main, main.titulo, main.descrip, main.url, main.ts,
jpabuyer> main.hits, paises.nombre, idiomas.nombre, categorias.nombre
jpabuyer> from   main, paises, idiomas, categorias
jpabuyer> where  main.id_cat=categorias.id_cat and
jpabuyer> main.id_pais=paises.id_pais and 
jpabuyer> main.id_idioma=idiomas.id_idioma and
jpabuyer> (main.descrip like '%word1%' or main.titulo like '%word1%' ) and
jpabuyer> (main.descrip like '%word2%' or main.titulo like '%word2%' ) and
jpabuyer> (main.descrip like '%word3%' or main.titulo like '%word3%' ) and
jpabuyer> (main.descrip like '%word4%' or main.titulo like '%word4%' ) and
jpabuyer> paises.id_pais=4 and
jpabuyer> idiomas.id_idioma=3 and
jpabuyer> categorias.id_cat=87
jpabuyer> order by main.ts DESC limit 1000

jpabuyer> Explain says:

jpabuyer>
> +------------+-------+---------------+---------+---------+------+--------+------------+
jpabuyer> | table      | type  | possible_keys | key     | key_len | ref  | rows  
> |Extra      |
jpabuyer>
> +------------+-------+---------------+---------+---------+------+--------+------------+
jpabuyer> | paises     | const | PRIMARY       | PRIMARY |       2 | ???  |      1 |   
>        |
jpabuyer> | idiomas    | const | PRIMARY       | PRIMARY |       2 | ???  |      1 |   
>        |
jpabuyer> | categorias | const | PRIMARY       | PRIMARY |       2 | ???  |      1 |   
>        |
jpabuyer> | main       | ALL   | NULL          | NULL    |    NULL | NULL | 268628
> |where used |
jpabuyer>
> +------------+-------+---------------+---------+---------+------+--------+------------+
jpabuyer> 4 rows in set (0.00 sec)

This will be a bit faster if you add an index on

index (main.id_cat, main.id_pais, main.id_idioma)


jpabuyer> main table has about 270000 rows, and time response is typically 5.3 secs,
jpabuyer> and as main table grows response time also imcrements. So, as main table
jpabuyer> is supposed to grow to several millions of rows I'm worried about response
jpabuyer> time. IMHO, the problem are those "like '%word%'". So I'm working on this
jpabuyer> idea: I created a special table named 'palabra' which looks like this:

mysql> show columns from palabra;
jpabuyer> +---------+------------------+------+-----+---------+----------------+
jpabuyer> | Field   | Type             | Null | Key | Default | Extra          |
jpabuyer> +---------+------------------+------+-----+---------+----------------+
jpabuyer> | id_pal  | int(10) unsigned |      | PRI | 0       | auto_increment |
jpabuyer> | palabra | varchar(255)     |      | MUL |         |                |
jpabuyer> | id_main | int(11)          |      |     | 0       |                |
jpabuyer> +---------+------------------+------+-----+---------+----------------+
jpabuyer> 3 rows in set (0.00 sec)

jpabuyer> This table is supposed to be an index for main.descrip :-).. I mean, I
jpabuyer> splitted every word contained on main.descrip into this new table, so that
jpabuyer> I could have a kind of an index for words on a varchar field. What I gain
jpabuyer> doing this is _very_ much faster "like 'word%'" searches, but as manual
jpabuyer> says, I'm not getting any benefit for "like '%word%'" searches. And with
jpabuyer> this new table I can't still search for more than one "like 'word%"
jpabuyer> sentence, since what I really need to do is get the information from main
jpabuyer> table. For example, if I would use more than one "like" I would have
jpabuyer> something like this:

mysql> select * from palabra where (palabra like 'okidata%' or palabra
jpabuyer> like 'verbatim%');
jpabuyer> +---------+----------+---------+
jpabuyer> | id_pal  | palabra  | id_main |
jpabuyer> +---------+----------+---------+
jpabuyer> |  793742 | Okidata  |   33669 |
jpabuyer> |  793762 | Okidata  |   33669 |
jpabuyer> | 1428484 | Okidata  |   41519 |
jpabuyer> | 3515875 | OKIDATA  |  144848 |
jpabuyer> | 4241917 | Okidata  |  172366 |
jpabuyer> | 4244272 | OKIDATA  |  172452 |
jpabuyer> |  967602 | Verbatim |   41519 |
jpabuyer> | 3243793 | verbatim |  134972 |
jpabuyer> +---------+----------+---------+
jpabuyer> 8 rows in set (0.30 sec)

jpabuyer> What I really need to know from there is _which_ palabra.id_main has
jpabuyer> palabra.palabra like 'okidata%' _AND_ also palabra.palabra like
jpabuyer> 'verbatim%'. The answer would be id_main=41519, but is there a way to do
jpabuyer> that with a SQL query?

You can always use this hack:

select * from palabra where palabra like 'okidata%' and main

or palabra like 'verbatim%');

select id_main from palabra, main where palabra.id_main=main.id_main
and palabra.palabra like 'okidata%' and main.descrip like "%verbatim%";

(In other words, search after one word in palabra, and then check for
all words in the document)

Another option:

select id_main,count(*) as cnt from palabra where (palabra like
'okidata%' or palabra like 'verbatim%') group by id_main having cnt=2;

(For this to be fast, you should use MySQL 3.23)

Regards,
Monty
Thread
Fastest MySQL version?Ed Carp23 Oct
  • Re: Fastest MySQL version?Tonu Samuel23 Oct
    • Re: Fastest MySQL version?jpabuyer23 Oct
      • Re: Fastest MySQL version?Michael Widenius13 Nov