List:General Discussion« Previous MessageNext Message »
From:Chris Date:March 19 1999 3:27pm
Subject:Efficient multiword searching?
View as plain text  
Hello All! I'm somewhat new to mysql and am writing my first big app. I'm
writing a web frontend for a db that needs to allow searching through the
db records. I've written the following perl func to do the searching and
was wondering if I could have chosen a more efficient way to do this. A
record will match a search string if any of the words in the search string
appear in the record, or if the search string appears in it's entirety in
the record.

The basic algorithm is this:

1. Create an SQL statement for each search word.
2. Execute each SQL statement, storing the returned record id's in a
   perl hash. (Use a hash so that duplicate matches only get stored
   once)
3. Create a comma-delimited 'set' of matched id's.
4. Create an SQL query to return all records which have an id that
   is in the 'set' created in step 3.

I've edited this somewhat for brevity, field1, field2, and field3 have
been left in where the database actually uses about 10 fields in each row.

INFO:

t1 contains the records to be searched
t2 contains information on whether the record is to be searched
t3 also contains information on whether the record is to be searched

This all seems a little excessive to me... Can anyone suggest a better
method for searching?

Thanks!

--Chris

--------------------------

sub search {

    my $text     = $_form->param('search_text') || '00';
    my $q_text   = $_dbh->quote("%$text%");

    my $q_did    = $_dbh->quote($_form->param('did'));

    $_dbh->do("LOCK TABLES t1 READ, t2 READ, t3 READ");

    ###############################################
    # Search for entries which contain the given  #
    # search text. Select only 'entry_id' so that #
    # the next search can eliminate duplicates.   #
    ###############################################

    my (@SQL, %id_list);

    # This creates an array containing each of the words in
    # the search string, as well as the entire search string.

    my @words = &create_search_words($text);

    # Create an SQL statement for each search word

    foreach my $my_word (@words) {

        my $q_text = $_dbh->quote("%$my_word%");

        my $buf = "SELECT t1.entry_id
                   FROM   t1, t2, t3
                   WHERE  t1.dest_id != 0
                   AND    t1.rate_id != 0
                   AND    t2.id = t1.rate_id
                   AND    t3.id = t1.dest_id
                   AND    t2.use_stats like '%V%'
                   AND    t3.use_stats = 'Y'
                   AND    t1.dest_id = $q_did

                   AND(   t1.field1 like $q_text
                     OR   t1.field2 like $q_text
                     OR   t1.field3 like $q_text)

                   LIMIT 50";

        push(@SQL, $buf);

    }

    # Execute each search statement, storing the id in
    # a perl hash (Duplicates dissapear)

    while(@SQL) {

        my $entry_id;
        my $sth = $_dbh->prepare(pop(@SQL));

        $sth->execute;
        $sth->bind_col(1, \$entry_id);

        while($sth->fetch()) {
            $id_list{$entry_id} = 'true';
        }

        $sth->finish();

    }

    #################################################
    # Now select the required data from the entries #
    # corresponding to the sorted list of unique    #
    # entry ids.                                    #
    #################################################

    my ($field1, $field2, $field3);

    my $set = $_dbh->quote(join(',', keys(%id_list)));

    my $sth = $_dbh->prepare("SELECT    t1.field1,
                                        t1.field2,
                                        t1.field3
                               FROM     t1
                               WHERE    FIND_IN_SET(t1.entry_id, $set) !=
0");

    $sth->execute;
    $sth->bind_columns(undef, \$field1, \$field2, \$field3);


    while($sth->fetch()) {
        # Do something with the retrieved data
    }

    $sth->finish();

    $_dbh->do("UNLOCK TABLES");

}


Thread
Efficient multiword searching?Chris19 Mar