>From: Johan De Meersman [mailto:vegivamp@stripped]
>Sent: Tuesday, May 03, 2011 5:31 AM
>To: Jerry Schwartz
>Cc: Jim McNeely; mysql mailing list; Johan De Meersman
>Subject: Re: Join based upon LIKE
>http://www.gedpage.com/soundex.html offers a simple explanation of what it
>One possibility would be building a referential table with only a recordID and
>soundex column, unique over both; and filling that with the soundex of
>individual nonjunk words.
>So, from the titles
>
>1 | Rain in Spain
>2 | Spain's Rain
>
>you would get
>
>1 | R500
>1 | S150
>2 | S150
>2 | R500
>From thereon, you can see that all the same words have been used - ignoring a
>lot of spelling errors like Spian. Obviously not a magic solution, but it's a
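
A rough sketch of the table Johan describes, in Python rather than SQL so it can be run on its own. Everything named here is an assumption for illustration: the `JUNK_WORDS` stop list, the helper names, and the choice of the standard four-character American Soundex (MySQL's SOUNDEX() is a variant and can return longer strings). Possessive "'s" is dropped so that "Spain's" codes the same as "Spain", as in the example above.

```python
import re

# Standard American Soundex digit groups. Assumption: this is not
# byte-for-byte what MySQL's SOUNDEX() returns.
_CODES = {}
for digit, group in enumerate(("BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"), start=1):
    for ch in group:
        _CODES[ch] = str(digit)

# Hypothetical stop list; a real one would be built from the actual titles.
JUNK_WORDS = {"A", "AND", "IN", "OF", "THE"}


def soundex(word):
    """Return the four-character Soundex code of a word ("" if no letters)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        if ch not in "HW":  # H and W do not separate letters with equal codes
            prev = code
    return (result + "000")[:4]


def title_codes(title):
    """Return the set of Soundex codes for the non-junk words in a title."""
    codes = set()
    for word in re.findall(r"[A-Za-z']+", title):
        # Drop a trailing possessive, then any stray apostrophes.
        word = re.sub(r"'s$", "", word, flags=re.IGNORECASE).replace("'", "")
        if word and word.upper() not in JUNK_WORDS:
            codes.add(soundex(word))
    return codes


def build_rows(titles):
    """Return sorted (record_id, soundex) pairs, unique over both columns."""
    rows = set()
    for record_id, title in titles.items():
        for code in title_codes(title):
            rows.add((record_id, code))
    return sorted(rows)


if __name__ == "__main__":
    titles = {1: "Rain in Spain", 2: "Spain's Rain", 3: "Rain in Spian"}
    for record_id, code in build_rows(titles):
        print(record_id, "|", code)
```

Two titles whose code sets are equal (or overlap heavily) are candidates for a match; the misspelled "Spian" still lands on S150, so record 3 gets the same pair of codes as records 1 and 2.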
I'm not sure that I could easily build a dictionary of non-junk words, since
some of these reports have titles like "Toluene Diisocyanate Market Outlook
2008", "Toluene Market Outlook 2008", and "Toluene: 2009 World Market Outlook
And Forecast (Special Crisis Edition)".
I shall ponder this when I am caught up, or (more likely) in the afterlife.
Global Information Incorporated
195 Farmington Ave.
Farmington, CT 06032
860.674.8796 / FAX: 860.674.8341
Web site: www.the-infoshop.com
>----- Original Message -----
>> From: "Jerry Schwartz" <jerry@stripped>
>> To: "Johan De Meersman" <vegivamp@stripped>
>> Cc: "Jim McNeely" <jim@stripped>, "mysql mailing list"
>> Sent: Monday, 2 May, 2011 4:09:36 PM
>> Subject: RE: Join based upon LIKE
>> [JS] I've thought about using soundex(), but I'm not quite sure how.
>> I didn't pursue it much because there are so many odd terms such as
>> names, but perhaps I should give it a try in my infinite free time.
>> [JS] Thanks for your condolences.
>> Jerry Schwartz
>> E-mail: jerry@stripped
>Beer with grenadine
>Is like mustard with wine
>She who drinks it is a prude
>He who drinks it is soon an ass