Hi!
On Apr 16, JJ Tavernier wrote:
> Thanks a bunch.
>
> >
> > As I wrote, you cannot return many rows from a function.
> > What you can do - change the syntax to have a function that returns only
> > one value, as a regular function. E.g. a function that returns TRUE if the
> > row is kNN and FALSE otherwise. Then you can query it as
> >
> > SELECT * FROM geom2 WHERE k_nearest_neighbor(geom_column, @g, 3)
> >
> > Or make it a function that returns a rank of the neighbor, then it can
> > be queried as
> >
> > SELECT * FROM geom2 WHERE k_nearest_neighbor(geom_column, @g) <= 3
> >
> > In any case, the syntax will be the smallest of your problems :)
> > Adding necessary access methods to R-tree and the handler interface,
> > putting knowledge about kNN searches into the optimizer is going to be
> > more difficult.
>
> Yeah, I'm starting to understand that there is not going to be a
> simple solution to this:)
> I've started reading through some of the internals documentation, and
> looking through the rtree and select statement code.
>
> As I see it, here are the high level modifications necessary:
> 1) rt_index.c: add the necessary logic to traverse the rtree index
> from the root, perform the kNN search along the way
Yes
> 2a) udf_knn.cpp: this is the definition of the k_nearest_neighbor
> function. Somehow, when this is called, it needs to get access to the
> spatial index (is this the MI_INFO structure?) for the geom_column for
> the geom2 table. The index is needed so that I can pass it to the kNN
> methods in rt_index.c. I could perform the kNN search when the init
> function is called, then return 1 in the xxx() function if the
> geom_column key is nearest neighbor k or less. Here are some
> questions:
> - Is it possible to get ahold of the spatial index in a UDF?
No.
> - How do I go about limiting the number of times that
> k_nearest_neighbor function is called (so that a full table scan isn't
> performed)? I suppose this will involve the optimizer.
>
> OR
>
> 2b) If there's no way to have a UDF get access to the spatial index on
Right, no way. UDF interface is too limited.
> a column, I'll have to make the internals aware of the kNN
> functionality. Somehow, the fact that the user wants the kNN needs to
> be retained during processing, which suggests that the parser needs to
> understand k_nearest_neighbor as a new operator. Any ideas on where I
> should look to add this? If we simplify and say that k == 1 (i.e., we
> are only interested in the nearest neighbor), then I think that a
> function similar to mi_rkey (in mi_rkey.c) might be sufficient (maybe
> this could be called mi_rknnkey). How would I go about making the
> parser aware that this new method needs to be called rather than
> mi_rkey?
Let's no simplify to k==1 case - it's too simple. That is it suggests
solutions you cannot generalize for k>1 case.
So what you may need to do is:
1. Add the necessary logic to rt_index.c (that is in MyISAM)
2. Extend handler API to support kNN search, so that SQL layer could ask
MyISAM - via handler API - to do such a search.
* it's basically as simple as adding new value to enum ha_rkey_function
3. add support for kNN searches to optimizer - so that it could know
when to use r-tree index.
* see, for example, how other spatial functions do to.
That's all :)
Regards,
Sergei
--
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Sergei Golubchik <serg@stripped>
/ /|_/ / // /\ \/ /_/ / /__ MySQL AB, Senior Software Developer
/_/ /_/\_, /___/\___\_\___/ Kerpen, Germany
<___/ www.mysql.com