List: General Discussion « Previous MessageNext Message » From: Joerg Bruehe Date: May 15 2009 8:30am Subject: Re: selecting the 'best' match View as plain text
```Hi!

blackwater dev wrote:
> I have a hold car data such as color, model, make, year, etc.  I want to
> allow the user to answer some questions and I'll present them with the car
> that 'best' matches their criteria.  How do I do this?  I still want to
> return ones that don't match exactly but want the closer matches ordered at
> the top:
>
> Table:cars
>
> columns: car_id, make, model, year, color, condition
>
> So if the user enterrs:
>
> model: Toyota
> year: 1998
> condition:great
> color: blue
>
> I would show them a blue 1998 good conditioned camry first but farther down
> in the list might still have a blue good condition 98 Honda.

Returning records in some specified order requires to sort them, and for
this you need some criteria.
In most cases, these are values of these records (like a name or size),
but this doesn't match your application.

You want to order by a "distance": How much deviates this record from
the perfect match?

Mathematically, your records are points in some n-dimensional space
(dimensions being "model", "year", "condition", "color", probably
several others), and your "perfect" match is one specific such point.

If all your dimensions were numerical (like year) or at least ordered
(like condition: broken, poor, average, good, great), you could define a
distance within each dimension, and then combine these to n-dimensional.
Comparison to geometry: When you know the distances in x, y, and z
dimension, you can use Pythagoras' formula:
dist = sqrt ( (x1 - x)**2 + (y1 - y)**2 + (z1 - z)**2 )

But with non-numerical and even un-ordered data, your choices are pretty
arbitrary:
What is the "distance" between colors?
Which one is "closer" to a Toyota - a Ford or a Volkswagen?
So I doubt you can define a "closeness" (or "distance") function.

In your case, I propose to try UNION:
SELECT ... WHERE model = UserModel AND color = UserColor
ORDER BY condition, ABS (year - UserYear)
UNION
SELECT ... WHERE model = UserModel AND color != UserColor
ORDER BY condition, ABS (year - UserYear)
UNION
SELECT ... WHERE model = !UserModel AND color = UserColor
ORDER BY condition, ABS (year - UserYear)
UNION
SELECT ... WHERE model = !UserModel AND color = !UserColor
ORDER BY condition, ABS (year - UserYear)

This is based on the assumption that the exact match on the model is
more important than on the color.
And if you think a mismatch in both model and color makes the car
irrelevant, drop the last SELECT in the UNION.

But as soon as the potential buyer says "doesn't matter", you would need
a different SELECT (one that doesn't differ by that field), so the whole
thing calls for a program that generates the whole SQL statement based
on the user input.

HTH,
Jörg

--
Joerg Bruehe,  MySQL Build Team,  Joerg.Bruehe@stripped
(+49 30) 417 01 487
Sun Microsystems GmbH,   Komturstraße 18a,   D-12099 Berlin
Geschaeftsfuehrer: Thomas Schroeder, Wolfgang Engels, Wolf Frenkel
Vorsitzender des Aufsichtsrates: Martin Haering     Muenchen: HRB161028

```