>>>>> "Stephen" == Stephen Tu <stephentu@stripped> writes:
Stephen> Please bear with me as this is going to be kind of a long post.
Stephen> I have a query that is like (call it query A):
Stephen> SELECT fieldA, agg(fieldB) FROM some_tbl WHERE fieldC <= [some
Stephen> constant] GROUP BY fieldA;
Stephen> Where both fieldA, fieldB, and fieldC are *not* part of any index on the
Stephen> table, and where agg() is some aggregate UDF.
Stephen> Therefore, I took the source code for mysql-5.5.21, and in
Stephen> sql/sql_select.cc, I commented out the following lines:
Stephen> 15278 //if (!sum_item->quick_group)
Stephen> 15279 // param->quick_group=0; // UDF SUM function
Stephen> Upon re-compilation, EXPLAIN now tells me for query A that it is doing
Stephen> the following: "Using where; Using temporary table; Using filesort".
Stephen> Great, I thought, that was easy! However, now when I execute query A,
Stephen> agg(fieldB) simply returns NULL. I verified that my UDF isn't even being
Stephen> called. Presumably, there is a reason why UDFs don't go down this code
Stephen> path :)
Stephen> As I am not very familiar with the mysql internals, I would like to know
Stephen> how I can actually get this to work. Is this even possible?
The reason for the above is the following:
MySQL has two ways to calculate aggreagates:
- Sort things according to the items in the GROUP BY part
- For each row
- call add() method for all aggregated items
- When the GROUP BY changes value, send the current calculated values
to the next stage (normally the client) and call the clear() method
on all aggregate items.
This method is is the one that is supported with the aggregate UDF
- Create a temporary table to hold one result row. (For each aggregate
item we use the method create_temp_field() to create a field)
- For each row
- Generate the GROUP BY KEY
- Search for a row in the temporary table matching that key
- If row didn't exists
- Call reset_field() for all aggregate items
- Write a row with the group-by key and aggregated item values to
the temporary table
- If row existed
- call update_field() for all aggregate fields
- Update row in temporary table
- Sort temporary table in group-by order and send result to client.
The reason UDF:s can't use b) is that the UDF interface doesn't
support the reset_field(), update_field(), create_temp_field() and
(This was done to keep the udf interface simple).
If you have a high desire for speed, then you should consider creating
a native aggregate function in MariaDB/MySQL.
Check the files item_sum.h and item_sum.cc. You should be able to use
solve your problem relatively easily by inherit from some of the
already existing items.
Creator of MySQL and MariaDB