List:General Discussion« Previous MessageNext Message »
From:Gavin Towey Date:August 3 2009 6:05pm
Subject:RE: Table advice.
View as plain text  
Another trick is  to reverse the domain when you store it.  For example instead of storing
www.facebook.com, store it as com.facebook.www.  That way you write a query like:

WHERE domain LIKE 'com.facebook%'

This will use the index, since you're not using a wildcard at the beginning of the string,
and the results will include all subdomains for the given domain you're looking for.

Regards,
Gavin Towey

-----Original Message-----
From: Alexander Kolesen [mailto:kolesen_a@stripped]
Sent: Saturday, August 01, 2009 11:10 AM
To: mysql@stripped
Subject: Re: Table advice.

Hello.
Your query performs a full table scan, because if you match text with '%...' wildcard,
MySQL can't using index. Try to use external full-text
search engines like Sphinx (http://www.sphinxsearch.com/) or Lucene
(http://lucene.apache.org).
> I have a database that I am (will) be using to track URL's. The table
> structure looks like this:
>
> CREATE TABLE event
> (
>   eid           INT UNSIGNED NOT NULL AUTO_INCREMENT,
>   timestamp     INT(10) UNSIGNED NOT NULL DEFAULT 0,
>   ip            INT(10) UNSIGNED NOT NULL DEFAULT 0,
>   fqdn          VARCHAR(255),
>   domain        VARCHAR(63),
>   tld           VARCHAR(63),
>   action        VARCHAR(4),
>   request       TEXT,
>   referrer      TEXT,
>   client        VARCHAR(255),
>   INDEX eid (eid),
>   INDEX timestamp (timestamp),
>   INDEX ip (ip),
>   INDEX fqdn (fqdn),
>   INDEX domain (domain),
>   INDEX tld (tld)
> );
>
> The is no real logic behind the indexes, the table was hobbled
> together looking at examples. Currently I am trying queries on about
> 300 million records and the results are pretty crappy. for example, a
> query like this:
>
> select domain,count(domain) as count from event where domain like
> '%facebook%' group by domain order by count desc;
>
> takes about 5 minutes to complete.
>
> Most of the queries will be like that above but probably with
> additional filters like date constraints or IP constraints or a
> mixture of both. I can also see searches through the requests for
> filetypes etc.
>
> Any suggestions or comments would be appreciated.
>
> Thanks.
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:    http://lists.mysql.com/mysql?unsub=1

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:    http://lists.mysql.com/mysql?unsub=1


The information contained in this transmission may contain privileged and confidential
information. It is intended only for the use of the person(s) named above. If you are not
the intended recipient, you are hereby notified that any review, dissemination,
distribution or duplication of this communication is strictly prohibited. If you are not
the intended recipient, please contact the sender by reply email and destroy all copies
of the original message.
Thread
Table advice.Paul Halliday1 Aug
  • Re: Table advice.Alexander Kolesen1 Aug
    • RE: Table advice.Gavin Towey3 Aug