List: General Discussion
From: Jerry Schwartz
Date: May 3 2011 2:52pm
Subject: RE: Join based upon LIKE
>-----Original Message-----
>From: Johan De Meersman [mailto:vegivamp@stripped]
>Sent: Tuesday, May 03, 2011 5:31 AM
>To: Jerry Schwartz
>Cc: Jim McNeely; mysql mailing list; Johan De Meersman
>Subject: Re: Join based upon LIKE
>
>
>http://www.gedpage.com/soundex.html offers a simple explanation of what it
>does.
>
>One possibility would be building a referential table with only a recordID and
>soundex column, unique over both; and filling that with the soundex of
>individual nonjunk words.
>
>So, from the titles
>
>1 | Rain in Spain
>2 | Spain's Rain
>
>you'd get
>
>1 | R500
>1 | S150
>2 | S150
>2 | R500
>
>From there on, you can see that all the same words have been used, ignoring a
>lot of spelling errors like Spian. Obviously not a magic solution, but it's a
>start.
>
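A minimal sketch of the table described above, in Python rather than SQL so it can be run standalone. It assumes the standard American Soundex rules (MySQL's SOUNDEX() differs in some details, such as returning codes longer than four characters), a small hypothetical junk-word list, and that a possessive "'s" is stripped before coding, which is what the "Spain's" row in the example implies. The names `soundex`, `soundex_rows`, `codes_by_record` and `JUNK` are illustrative only.

```python
import re

# Standard American Soundex digit groups; vowels, H, W and Y carry no code.
_CODES = {}
for digit, letters in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for ch in letters:
        _CODES[ch] = str(digit)

# Hypothetical junk-word list (an assumption); a real one would be tuned to the data.
JUNK = {"IN", "THE", "OF", "AND", "A"}


def soundex(word):
    """Return the 4-character Soundex code of a word, or "" if it has no letters."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":  # H and W do not separate letters with equal codes
            prev = code
    return (result + "000")[:4]


def soundex_rows(titles):
    """Build (record_id, code) pairs; using a set models the unique key over both columns."""
    rows = set()
    for record_id, title in titles.items():
        for token in re.findall(r"[A-Za-z']+", title):
            # Assumption: drop a trailing possessive so "Spain's" codes like "Spain".
            word = re.sub(r"'s$", "", token, flags=re.IGNORECASE)
            if word.upper() in JUNK:
                continue
            code = soundex(word)
            if code:
                rows.add((record_id, code))
    return rows


def codes_by_record(rows):
    """Group the codes per record so records can be compared by their code sets."""
    grouped = {}
    for record_id, code in rows:
        grouped.setdefault(record_id, set()).add(code)
    return grouped


# Record 3 is a made-up misspelling added to illustrate the "Spian" case.
titles = {1: "Rain in Spain", 2: "Spain's Rain", 3: "Rain in Spian"}
rows = soundex_rows(titles)
grouped = codes_by_record(rows)
for record_id in sorted(grouped):
    print(record_id, sorted(grouped[record_id]))
```

Records whose code sets are equal, or overlap heavily, are candidates for the same title. In SQL the same comparison would be a self-join of the table on the soundex column.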
[JS] Thanks.

I'm not sure that I could easily build a dictionary of non-junk words, since 
some of these reports have titles like "Toluene Diisocyanate Market Outlook 
2008", "Toluene Market Outlook 2008", and "Toluene: 2009 World Market Outlook 
And Forecast (Special Crisis Edition)".

I shall ponder this when I am caught up, or (more likely) in the afterlife.

Regards,

Jerry Schwartz
Global Information Incorporated
195 Farmington Ave.
Farmington, CT 06032

860.674.8796 / FAX: 860.674.8341
E-mail: jerry@stripped
Web site: www.the-infoshop.com

>----- Original Message -----
>> From: "Jerry Schwartz" <jerry@stripped>
>> To: "Johan De Meersman" <vegivamp@stripped>
>> Cc: "Jim McNeely" <jim@stripped>, "mysql mailing list" <mysql@stripped>
>> Sent: Monday, 2 May, 2011 4:09:36 PM
>> Subject: RE: Join based upon LIKE
>>
>> [JS] I've thought about using soundex(), but I'm not quite sure how.
>>
>> I didn't pursue it much because there are so many odd terms such as chemical
>> names, but perhaps I should give it a try in my infinite free time.
>>
>>
>> [JS] Thanks for your condolences.
>>
>
>--
>Beer with grenadine
>Is like mustard with wine
>She who drinks it is a prude
>He who drinks it is soon an ass


