List:Commits« Previous MessageNext Message »
From:Guilhem Bichot Date:July 27 2010 9:02pm
Subject:Re: bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)
Bug#50489
View as plain text  
Hello Roy,

Roy Lyseng a écrit, Le 27.07.2010 10:24:
> #At file:///home/rl136806/mysql/repo/mysql-work5/ based on
> revid:epotemkin@stripped
> 
>  3219 Roy Lyseng	2010-07-27
>       Bug#50489: another segfault in fix_semijoin_strategies...
>       
>       commit mail test

I built a tree with the same revision as when I filed the bug report, 
was able to reproduce the Valgrind error.
==397== Invalid read of size 4
==397==    at 0x7B390B: 
_ZL16advance_sj_stateP4JOINyPK13st_join_tablejPdS4_P11st_position 
(sql_select.cc:13385)
==397==    by 0x7B5A42: 
_ZL32best_extension_by_limited_searchP4JOINyjddjj (sql_select.cc:7572)
==397==    by 0x7B608C: _ZL13greedy_searchP4JOINyjj (sql_select.cc:7299)
==397==    by 0x7B67DF: _ZL11choose_planP4JOINy (sql_select.cc:6910)
==397==    by 0x7BAF2C: 
_ZL20make_join_statisticsP4JOINP10TABLE_LISTP4ItemP16st_dynamic_array 
(sql_select.cc:4527)
==397==    by 0x7BE111: JOIN::optimize() (sql_select.cc:1621)
==397==    by 0x7C2B1E: mysql_select(THD*, Item***, TABLE_LIST*, 
unsigned, List<Item>&, Item*, unsigned, st_order*, st_order*, Item*, 
st_order*, unsigned long long, select_result*, st_select_lex_unit*, 
st_select_lex*) (sql_select.cc:3123)
==397==    by 0x7C85D2: handle_select(THD*, LEX*, select_result*, 
unsigned long) (sql_select.cc:304)
==397==    by 0x717C37: _ZL21execute_sqlcom_selectP3THDP10TABLE_LIST 
(sql_parse.cc:4940)
==397==    by 0x719FB2: mysql_execute_command(THD*) (sql_parse.cc:2165)
==397==    by 0x923D6C: sp_instr_stmt::exec_core(THD*, unsigned*) 
(sp_head.cc:2922)
==397==    by 0x923F63: sp_lex_keeper::reset_lex_and_exec_core(THD*, 
unsigned*, bool, sp_instr*) (sp_head.cc:2745)
==397==    by 0x92A2C7: sp_instr_stmt::execute(THD*, unsigned*) 
(sp_head.cc:2860)
==397==    by 0x926342: sp_head::execute(THD*) (sp_head.cc:1246)
==397==    by 0x9271B9: sp_head::execute_procedure(THD*, List<Item>*) 
(sp_head.cc:1988)
==397==    by 0x72054E: mysql_execute_command(THD*) (sql_parse.cc:4398)
==397==  Address 0xb706e70 is 40 bytes inside a block of size 464 free'd
==397==    at 0x4C252AF: free (vg_replace_malloc.c:323)
==397==    by 0xB8F51C: my_no_flags_free (my_malloc.c:66)
==397==    by 0xBBE83B: free_root (my_alloc.c:349)
==397==    by 0x9263BF: sp_head::execute(THD*) (sp_head.cc:1263)
==397==    by 0x9271B9: sp_head::execute_procedure(THD*, List<Item>*) 
(sp_head.cc:1988)
==397==    by 0x72054E: mysql_execute_command(THD*) (sql_parse.cc:4398)
==397==    by 0x72206C: mysql_parse(THD*, char const*, unsigned, char 
const**) (sql_parse.cc:5973)
==397==    by 0x722C5A: dispatch_command(enum_server_command, THD*, 
char*, unsigned) (sql_parse.cc:1090)
==397==    by 0x72413A: do_command(THD*) (sql_parse.cc:774)
==397==    by 0x711104: do_handle_one_connection(THD*) (sql_connect.cc:1173)
==397==    by 0x7111C9: handle_one_connection (sql_connect.cc:1113)
==397==    by 0xC4AC33: pfs_spawn_thread(void*) (pfs.cc:1011)
==397==    by 0x503B3E9: start_thread (in /lib/libpthread-2.8.90.so)

The line doing wrong access is:
       /* This is SJ-Materialization with lookups */
       COST_VECT prefix_cost;
       signed int first_tab= (int)idx - mat_info->tables; // here
because mat_info is bad data.
This is bad data because:
1) at first EXECUTE (when materialization is on): mat_info is created by
optimize_semijoin_nests(): the "new" operator allocates in the execution
MEM_ROOT, which is later freed in sp_head::execute() at sp_head.cc:1263:
free_root(&execute_mem_root, MYF(0));
2) at second EXECUTE (when materialization has just been turned off): 
mat_info is not created by optimize_semijoin_nests(); the freed pointer 
thus stays, and is used by advance_sj_state().

One could say that:
a) stage (2) is correct to not create mat_info, as it's not doing 
materialization
b) stage (1) is incorrect because it leaves a dangling pointer behind; 
it should cleanup after itself.
Such patch:
=== modified file 'sql/sql_select.cc'
--- sql/sql_select.cc	2010-01-18 19:21:20 +0000
+++ sql/sql_select.cc	2010-07-27 20:44:08 +0000
@@ -10634,6 +10634,8 @@
      table->reginfo.join_tab= 0;
    }
    end_read_record(&read_record);
+  if (emb_sj_nest)
+    emb_sj_nest->sj_mat_info= NULL;
  }
which adds a proper reset of sj_mat_info to JOIN_TAB::cleanup(), solves 
the bug's testcase. It has some logic: no reference to memory allocated 
in the execution MEM_ROOT is left.
I suggest we discuss this further on IRC.
Thread
bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)Bug#50489Roy Lyseng27 Jul
  • Re: bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)Bug#50489Guilhem Bichot27 Jul
  • Re: bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)Bug#50489Guilhem Bichot9 Aug
    • Re: bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)Bug#50489Roy Lyseng12 Aug
      • Re: bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)Bug#50489Guilhem Bichot12 Aug
        • Re: bzr commit into mysql-next-mr-opt-backporting branch (roy.lyseng:3219)Bug#50489Roy Lyseng12 Aug