| List: | General Discussion | | |
| From: | Johan De Meersman | Date: | May 3 2011 2:59pm |
| Subject: | Re: Join based upon LIKE | ||
----- Original Message -----
> From: "Jerry Schwartz" <jerry@stripped>
>
> I'm not sure that I could easily build a dictionary of non-junk
> words, since

The traditional way is to build a database of junk words. The list tends to be shorter :-) Think and/or/it/the/with/like/...

Percentages of mutual and non-mutual words between two titles should be a reasonable indicator of likeness. You could conceivably even assign value to individual words, so "polypropylbutanate" is more useful than "synergy" for comparison purposes.

All very theoretical, though, I haven't actually done much of it to this level. My experience in data mangling is limited to mostly should-be-fixed-format data like sports results.

--
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel
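The junk-word idea in the message above could be sketched roughly as follows. This is only an illustration in Python, not something from the thread: the `JUNK_WORDS` list, the function names `significant_words` and `likeness`, and the choice of "mutual words divided by all significant words" as the percentage are all assumptions made for the example.

```python
import re

# Hypothetical junk-word list; a real one would be longer and domain-specific.
JUNK_WORDS = {"and", "or", "it", "the", "with", "like", "of", "a", "in", "for"}


def significant_words(title):
    """Lower-case the title, split on non-alphanumerics, drop junk words."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return {w for w in words if w not in JUNK_WORDS}


def likeness(title_a, title_b, weights=None):
    """Weighted share of mutual words among all significant words (0.0 to 1.0).

    `weights` optionally maps a word to its value; unlisted words count as 1.0.
    """
    weights = weights or {}
    a = significant_words(title_a)
    b = significant_words(title_b)
    union = a | b
    if not union:
        return 0.0  # nothing but junk words in both titles

    def total(words):
        return sum(weights.get(w, 1.0) for w in words)

    return total(a & b) / total(union)
```

For example, `likeness("The Market for Polypropylbutanate", "Polypropylbutanate market in Europe")` shares two of three significant words, and passing `weights={"polypropylbutanate": 5.0}` pushes the score higher because the rare word counts for more than the common ones.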
| Thread | ||
|---|---|---|
| • FW: Join based upon LIKE | Jerry Schwartz | 28 Apr |
| • Re: Join based upon LIKE | Johan De Meersman | 28 Apr |
| • RE: Join based upon LIKE | Jerry Schwartz | 29 Apr |
| • Re: Join based upon LIKE | Johan De Meersman | 29 Apr |
| • RE: Join based upon LIKE | Jerry Schwartz | 29 Apr |
| • Re: FW: Join based upon LIKE | hsv | 30 Apr |
| • RE: Join based upon LIKE | Jerry Schwartz | 29 Apr |
| • Re: Join based upon LIKE | Johan De Meersman | 1 May |
| • RE: Join based upon LIKE | Jerry Schwartz | 2 May |
| • Re: Join based upon LIKE | Johan De Meersman | 3 May |
| • RE: Join based upon LIKE | Jerry Schwartz | 3 May |
| • Re: Join based upon LIKE | Johan De Meersman | 3 May |
| • Re: Join based upon LIKE | shawn wilson | 3 May |
| • RE: Join based upon LIKE | Jerry Schwartz | 3 May |
| • Re: Join based upon LIKE | Nuno Tavares | 4 May |
| • RE: Join based upon LIKE | Jerry Schwartz | 5 May |
