From: Chaithra Gopalareddy
Date: February 23 2012 10:12am
Subject: bzr push into mysql-trunk branch (chaithra.gopalareddy:3945 to 3946) Bug#11829861
List-Archive: http://lists.mysql.com/commits/143040
X-Bug: 11829861
Message-Id: <201202231012.q1NACIjX003447@acsmt357.oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

 3946 Chaithra Gopalareddy 2012-02-23
      Bug#11829861 - SUBSTRING_INDEX() RESULTS "OMIT" CHARACTER WHEN USED
      INSIDE LOWER()

      PROBLEM
      Output of the function substring_index would have missing
      characters when used with string conversion functions like lower().

      Ex:
      SET @user_at_host = 'root@stripped';
      SELECT LOWER(SUBSTRING_INDEX(@user_at_host, '@', -1));
      mytinyhost-pc. ocal

      ANALYSIS:
      In the function Item_func_substr_index::val_str(), the final
      evaluated string (Item_func_substr_index::tmp_value) is marked as
      constant after the first evaluation. (The reason for this is given
      in Bug#14676.)
      Once evaluated, we try to convert this string to lower case. While
      doing so, we call the function "copy_if_not_alloced". This function
      does a copy or an allocation, based on the "alloced length"s of the
      strings passed.
      Since "tmp_value" is marked as constant, the "alloced length" for
      that string becomes zero, which forces an allocation followed by a
      copy, and that copy results in the missing character.
      Note that the source string (tmp_value) passed to
      "copy_if_not_alloced" points to an address inside the destination
      string, which is the original string. Hence the missing letters.
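The aliasing described in the analysis can be modelled with a small stand-alone sketch. It is not the server code: `Str`, `copy_if_not_alloced_sketch` and `buffer_after_call` are hypothetical names, the host name `Host.Local` is made up (the address in the message is stripped), the destination is assumed to have enough room already so no realloc takes place, and `memmove` stands in for `memcpy` so that the overlapping copy stays well defined. It only shows which path is taken and that the copy path writes over the buffer the source still points into.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical, simplified stand-in for the server's String class.
struct Str {
  char *ptr;            // first character
  std::size_t length;   // characters in use
  std::size_t alloced;  // writable capacity; 0 models "marked as const"
};

// Sketch of the decision made by copy_if_not_alloced(): reuse `from` if it
// already has enough capacity, otherwise copy its bytes into `to`.
// Assumptions: `to` already has room (no realloc here), and memmove replaces
// memcpy so the sketch stays well defined when `from` points inside `to`.
Str *copy_if_not_alloced_sketch(Str *to, Str *from, std::size_t from_length) {
  if (from->alloced >= from_length)
    return from;  // early return: the shared buffer is left untouched
  std::size_t n = from->length < from_length ? from->length : from_length;
  assert(to->alloced >= n);             // the sketch never reallocates
  std::memmove(to->ptr, from->ptr, n);  // overwrites the start of the buffer
  to->length = n;
  return to;
}

// Runs the sketch on "root@Host.Local" with `tail` starting after the '@'.
// Returns the bytes left in the shared buffer afterwards.
std::string buffer_after_call(bool marked_const) {
  char buf[32] = "root@Host.Local";
  Str whole = {buf, std::strlen(buf), sizeof(buf)};
  Str tail = {buf + 5, whole.length - 5, marked_const ? 0 : sizeof(buf) - 5};
  copy_if_not_alloced_sketch(&whole, &tail, tail.length);
  return std::string(buf);
}
```

With `marked_const` set to false the call returns early and the buffer is unchanged. With it set to true the copy path runs and the start of the shared buffer is overwritten. The exact garbling reported in the bug depends on how the real `memcpy` handles overlapping ranges, which this sketch does not try to reproduce.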
      Code Snippets:
      Item_str_conv::val_str(str) // conversion to lower case
      {
        res= Item_func_substr_index::val_str(str)
        (res is actually pointing to an address inside str)
        res= copy_if_not_alloced(str,res,res->length());
      }

      copy_if_not_alloced(to,from,from_length)
      {
        if (to->realloc(from_length))
          return from; // Actually an error
        if ((to->str_length=min(from->str_length,from_length)))
          memcpy(to->Ptr,from->Ptr,to->str_length);
      }

      If we do not mark "tmp_value" as const, we return from
      "copy_if_not_alloced" much earlier, which avoids the overwrite.
      So the fix is to not mark "tmp_value" as const, as there is no need
      for it.
      The fix for Bug#14676 is instead to allocate a temporary buffer to
      get the delimiter. The problem there arose because "tmp_value" was
      used both to get the delimiter and to return the evaluated string.

      Also, there is one more bug present in this function, associated
      with Bug#42404: substring_index returns inconsistent results when
      the delimiter is present at offset "0" while the count is negative
      and its magnitude is greater than the number of times the delimiter
      is present in the string.
      Currently, if the delimiter is present at offset "0", we skip
      setting "tmp_value" (which contains the final evaluated string) and
      instead return the previously set "tmp_value". This was the reason
      for the inconsistent results stated in the problem description.
      With this fix, we return the original string if the count is
      non-zero at the end of the loop.
     @ mysql-test/r/func_str.result
        Result file changes for the test cases added
     @ mysql-test/t/func_str.test
        Added test cases for Bug#11829861 and Bug#42404
     @ sql/item_strfunc.cc
        Changed substring_index::val_str() to use a different buffer to get
        the delimiter and to also take "offset 0" into consideration while
        searching for the delimiter.

    modified:
      mysql-test/r/func_str.result
      mysql-test/t/func_str.test
      sql/item_strfunc.cc

 3945 Hemant Kumar 2012-02-23
      Disabling events_restart on Windows, as the test started failing on
      Windows after the fix for bug#11748899 and the follow-up patch did
      not work.

    modified:
      mysql-test/t/disabled.def
=== modified file 'mysql-test/r/func_str.result'
--- a/mysql-test/r/func_str.result 2012-01-25 15:49:57 +0000
+++ b/mysql-test/r/func_str.result 2012-02-23 10:08:33 +0000
@@ -119,7 +119,7 @@ substring_index('aaaaaaaaa1','aaa',-3)
 aaaaaa1
 select substring_index('aaaaaaaaa1','aaa',-4);
 substring_index('aaaaaaaaa1','aaa',-4)
-
+aaaaaaaaa1
 select substring_index('the king of thethe hill','the',-2);
 substring_index('the king of thethe hill','the',-2)
 the hill
@@ -4464,5 +4464,28 @@ EXECUTE stmt;
 COLLATION(space(2))
 latin2_general_ci
 #
+# Bug#11829861: SUBSTRING_INDEX() RESULTS IN MISSING CHARACTERS WHEN USED
+# INSIDE LOWER()
+#
+SET @user_at_host = 'root@stripped';
+SELECT LOWER(SUBSTRING_INDEX(@user_at_host, '@', -1));
+LOWER(SUBSTRING_INDEX(@user_at_host, '@', -1))
+mytinyhost-pc.local
+# End of test BUG#11829861
+#
+# Bug#42404: SUBSTRING_INDEX() RESULTS ARE INCONSISTENT
+#
+CREATE TABLE t (i INT NOT NULL, c CHAR(255) NOT NULL);
+INSERT INTO t VALUES (0,'.www.mysql.com'),(1,'.wwwmysqlcom');
+SELECT i, SUBSTRING_INDEX(c, '.', -2) FROM t WHERE i = 1;
+i SUBSTRING_INDEX(c, '.', -2)
+1 .wwwmysqlcom
+SELECT i, SUBSTRING_INDEX(c, '.', -2) FROM t;
+i SUBSTRING_INDEX(c, '.', -2)
+0 mysql.com
+1 .wwwmysqlcom
+DROP TABLE t;
+# End of test BUG#42404
+#
 # End of 5.6 tests
 #
=== modified file 'mysql-test/t/func_str.test'
--- a/mysql-test/t/func_str.test 2012-01-25 15:49:57 +0000
+++ b/mysql-test/t/func_str.test 2012-02-23 10:08:33 +0000
@@ -1688,5 +1688,28 @@ SET NAMES latin2;
 EXECUTE stmt;
 
 --echo #
+--echo # Bug#11829861: SUBSTRING_INDEX() RESULTS IN MISSING CHARACTERS WHEN USED
+--echo # INSIDE LOWER()
+--echo #
+
+SET @user_at_host = 'root@stripped';
+SELECT LOWER(SUBSTRING_INDEX(@user_at_host,
'@', -1));
+
+--echo # End of test BUG#11829861
+
+--echo #
+--echo # Bug#42404: SUBSTRING_INDEX() RESULTS ARE INCONSISTENT
+--echo #
+
+CREATE TABLE t (i INT NOT NULL, c CHAR(255) NOT NULL);
+INSERT INTO t VALUES (0,'.www.mysql.com'),(1,'.wwwmysqlcom');
+SELECT i, SUBSTRING_INDEX(c, '.', -2) FROM t WHERE i = 1;
+SELECT i, SUBSTRING_INDEX(c, '.', -2) FROM t;
+
+DROP TABLE t;
+
+--echo # End of test BUG#42404
+
+--echo #
 --echo # End of 5.6 tests
 --echo #
=== modified file 'sql/item_strfunc.cc'
--- a/sql/item_strfunc.cc 2012-02-22 08:57:27 +0000
+++ b/sql/item_strfunc.cc 2012-02-23 10:08:33 +0000
@@ -1566,10 +1566,12 @@ void Item_func_substr_index::fix_length_
 String *Item_func_substr_index::val_str(String *str)
 {
   DBUG_ASSERT(fixed == 1);
+  char buff[MAX_FIELD_WIDTH];
+  String tmp(buff,sizeof(buff),system_charset_info);
   String *res= args[0]->val_str(str);
-  String *delimiter= args[1]->val_str(&tmp_value);
+  String *delimiter= args[1]->val_str(&tmp);
   int32 count= (int32) args[2]->val_int();
-  uint offset;
+  int offset;
 
   if (args[0]->null_value || args[1]->null_value || args[2]->null_value)
   { // string and/or delim are null
@@ -1640,7 +1642,7 @@ String *Item_func_substr_index::val_str(
     { // start counting from the beginning
       for (offset=0; ; offset+= delimiter_length)
       {
-        if ((int) (offset= res->strstr(*delimiter, offset)) < 0)
+        if ((offset= res->strstr(*delimiter, offset)) < 0)
           return res; // Didn't find, return org string
         if (!--count)
         {
@@ -1654,14 +1656,14 @@ String *Item_func_substr_index::val_str(
       /*
         Negative index, start counting at the end
       */
-      for (offset=res->length(); offset ;)
+      for (offset=res->length(); offset; )
       {
         /*
           this call will result in finding the position pointing to one
           address space less than where the found substring is located
           in res
         */
-        if ((int) (offset= res->strrstr(*delimiter, offset)) < 0)
+        if ((offset= res->strrstr(*delimiter, offset)) < 0)
           return res; // Didn't find, return org string
         /*
           At this point, we've searched for the substring
@@ -1674,14 +1676,10 @@ String *Item_func_substr_index::val_str(
         {
           break;
         }
       }
+      if (count)
+        return res; // Didn't find, return org string
     }
   }
-  /*
-    We always mark tmp_value as const so that if val_str() is called again
-    on this object, we don't disrupt the contents of tmp_value when it was
-    derived from another String.
-  */
-  tmp_value.mark_as_const();
   return (&tmp_value);
 }

No bundle (reason: useless for push emails).
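
The rule stated for Bug#42404 in the commit message (return the original string if the count is still non-zero when the search loop ends) can be illustrated outside the server. The sketch below is a hypothetical `std::string` model for single-byte strings, not the code from sql/item_strfunc.cc. The names `substring_index_sketch` and `check_examples` are made up, and returning an empty string for an empty input, an empty delimiter or a zero count is an assumption about the surrounding function rather than something shown in this diff. The checks reuse the queries and expected results from func_str.result above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical model of SUBSTRING_INDEX(str, delim, count) on single-byte
// strings. Assumption: empty input, empty delimiter or count == 0 gives "".
std::string substring_index_sketch(const std::string &str,
                                   const std::string &delim, int count) {
  if (str.empty() || delim.empty() || count == 0)
    return std::string();
  if (count > 0) {
    // Positive count: search from the beginning.
    std::size_t pos = 0;
    for (;;) {
      std::size_t found = str.find(delim, pos);
      if (found == std::string::npos)
        return str;  // didn't find, return original string
      if (--count == 0)
        return str.substr(0, found);
      pos = found + delim.size();
    }
  }
  // Negative count: search backwards for occurrences ending at or before `end`.
  std::size_t end = str.size();
  while (end >= delim.size()) {
    std::size_t found = str.rfind(delim, end - delim.size());
    if (found == std::string::npos)
      break;
    if (++count == 0)
      return str.substr(found + delim.size());
    end = found;  // also covers a delimiter found at offset 0
  }
  return str;  // count still non-zero: return original string
}

// Checks taken from the expected results in func_str.result.
void check_examples() {
  assert(substring_index_sketch("aaaaaaaaa1", "aaa", -3) == "aaaaaa1");
  assert(substring_index_sketch("aaaaaaaaa1", "aaa", -4) == "aaaaaaaaa1");
  assert(substring_index_sketch(".www.mysql.com", ".", -2) == "mysql.com");
  assert(substring_index_sketch(".wwwmysqlcom", ".", -2) == ".wwwmysqlcom");
}
```

In this model the row '.wwwmysqlcom' gives the same answer whether or not another row was evaluated first, because nothing is carried over between calls. That independence is what the new test for Bug#42404 checks by running the query with and without the WHERE clause.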