List:Commits« Previous MessageNext Message »
From:Alexander Barkov Date:November 12 2010 7:26pm
Subject:Re: bzr commit into mysql-5.5-bugteam branch (bar:3121) Bug#57306
View as plain text  
Hi Guilhem,

Thanks for your review. Please find comments inline.

Guilhem Bichot wrote:
> Hello,
> 
> Alexander Barkov a écrit, Le 12.11.2010 15:45:
>> #At file:///home/bar/mysql-bzr/mysql-5.5-bugteam.b57306/ based on 
>> revid:vvaintroub@stripped
>>
>>  3121 Alexander Barkov    2010-11-12
>>       Bug#57306 SHOW PROCESSLIST does not display string literals well.
>>             Problem: Extended characters outside of ASCII range where 
>> not displayed
>>       properly in SHOW PROCESSLIST, because thd_info->query was always 
>> sent as       system_character_set (utf8). This was wrong, because 
>> query buffer
>>       is never converted to utf8 - it is always have client character 
>> set.
>>             Fix: sending query buffer using character_set_client of 
>> its thread.
>>
>>     modified:
>>       mysql-test/r/show_check.result
>>       mysql-test/t/show_check.test
>>       sql/sql_show.cc
> 
>> === modified file 'mysql-test/t/show_check.test'
>> --- a/mysql-test/t/show_check.test    2010-07-27 14:32:42 +0000
>> +++ b/mysql-test/t/show_check.test    2010-11-12 14:45:32 +0000
>> @@ -1328,3 +1328,24 @@ disconnect con1;
>>  # Wait till all disconnects are completed
>>  --source include/wait_until_count_sessions.inc
>>  
>> +--echo #
>> +--echo # Bug#57306 SHOW PROCESSLIST does not display string literals 
>> well.
>> +--echo #
>> +
>> +SET NAMES latin1;
>> +SELECT GET_LOCK('t', 1000);
>> +--connect (con1,localhost,root,,)
>> +--connection con1
>> +SET NAMES latin1;
>> +--send SELECT GET_LOCK('t',1000) AS 'óóóó';
>> +--connection default
>> +--replace_column 1 ### 6 ### 7 ###
>> +SHOW PROCESSLIST;
>> +SET NAMES utf8;
>> +--replace_column 1 ### 6 ### 7 ###
>> +SHOW PROCESSLIST;
>> +SELECT RELEASE_LOCK('t');
>> +--connection con1
>> +--reap
>> +--disconnect con1
>> +--connection default
> 
> The default connection did "SET NAMES latin1" and then "SET NAMES utf8". 
> I guess this last command restores the connection's state to what it was 
> before "SET NAMES latin1"?

I added this not really for purpose of character set restoring.
My idea was to check that character set conversion for the "Info"
column work correctly, so I am testing two different character sets,
pretending that I have two different clients reading "SHOW PROCESSLIST",
using latin1 and using utf8.


The first "SHOW PROCESSLIST" (after "SET NAMES latin1")
prints exactly what con1 sent, because con1 uses latin1
as well. So no conversion happens here. The default thread
gets result which is binary equal to what con1 sent.
You can see it as "SMALL LETTER O WITH ACUTE" four times.
This test checks what the report is actually about.

The second "SHOW PROCESSLIST" (after "SET NAMES utf8") applies
conversion from latin1 to utf8, whose result is visible
as "CAPITAL LETTER A WITH TILDE" followed by "SUPERSCRIPT THREE"
four times. This is an extra test for better coverage.


Details:
Encoding for the test code I added in show_check.test is utf8.
So when I put a letter "SMALL LETTER O WITH ACUTE ABOVE" into
show_check.test, I actually put two bytes 0xC3B3.

As I previously use "SET NAMES latin1" on the con1 thread,
I make server interpret 0xC3B3 as latin1 data.

Now if you have a look here:
ftp://ftp.unicode.org/Public/MAPPINGS/ISO8859/8859-1.TXT
0xC3	0x00C4	#	LATIN CAPITAL LETTER A WITH TILDE
0xB3	0x00B3	#	SUPERSCRIPT THREE

which is exactly what we can see in the results for the
second "SHOW PROCESSLIST", with utf8.


This proves that conversion from latin1 to utf8 happened correctly.


> 
>> === modified file 'sql/sql_show.cc'
>> --- a/sql/sql_show.cc    2010-11-02 14:02:16 +0000
>> +++ b/sql/sql_show.cc    2010-11-12 14:45:32 +0000
>> @@ -1728,6 +1728,7 @@ public:
>>    uint   command;
>>    const char *user,*host,*db,*proc_info,*state_info;
>>    char *query;
>> +  CHARSET_INFO *character_set_client;
>>  };
>>  
>>  #ifdef HAVE_EXPLICIT_TEMPLATE_INSTANTIATION
>> @@ -1833,6 +1834,7 @@ void mysqld_list_processes(THD *thd,cons
>>          }
>>          mysql_mutex_unlock(&tmp->LOCK_thd_data);
>>          thd_info->start_time= tmp->start_time;
>> +        thd_info->character_set_client= 
>> tmp->variables.character_set_client;
> 
> This is an "unprotected" read of another thread's data 
> (tmp->variables.character_set_client). Imagine that this other thread is 
> precisely in the process of changing its @@character_set_client: we may 
> be reading a pointer value while it's changing, which may in theory give 
> garbage (some bytes of the pointer have changed and some not yet); 
> imagine this garbage pointer is an invalid pointer (not pointing to any 
> charset); then when we print the query in SHOW PROCESSLIST, we'll 
> dereference this invalid pointer and crash?
> I read somewhere that on modern Intel CPUs, 64-bit reads are atomic, but 
> it's always unclear what that exactly means and what happens for other 
> CPUs...
> So I don't feel very confident about this. Or is there no problem?
> 
> I see the query is read for LOCK_thd_data.

"query" is copied using strmake:

thd_info->query= (char*) thd->strmake(tmp->query(),length);

which is of course not atomic. We do need a mutex here.


But I think there is no problem with copying CHARSET_INFO pointer.
My understanding is that pointers use processor data types,
which should be copied atomically, on both 32-bit and 64-bit hardware.


> 
> If there is a problem, maybe we should change the code which updates 
> THD::variables::character_set_client to use LOCK_thd_data, or an atomic 
> operation; and above we would use the mutex or the atomic operation...?

I don't think we really need it.

> 
>>          thread_infos.append(thd_info);
>>        }
>>      }
>> @@ -1857,7 +1859,7 @@ void mysqld_list_processes(THD *thd,cons
>>      else
>>        protocol->store_null();
>>      protocol->store(thd_info->state_info, system_charset_info);
>> -    protocol->store(thd_info->query, system_charset_info);
>> +    protocol->store(thd_info->query, thd_info->character_set_client);
>>      if (protocol->write())
>>        break; /* purecov: inspected */
>>    }

Thread
bzr commit into mysql-5.5-bugteam branch (bar:3121) Bug#57306Alexander Barkov12 Nov
  • Re: bzr commit into mysql-5.5-bugteam branch (bar:3121) Bug#57306Guilhem Bichot12 Nov
    • Re: bzr commit into mysql-5.5-bugteam branch (bar:3121) Bug#57306Alexander Barkov12 Nov
      • Re: bzr commit into mysql-5.5-bugteam branch (bar:3121) Bug#57306Tor Didriksen15 Nov
        • Re: bzr commit into mysql-5.5-bugteam branch (bar:3121) Bug#57306Alexander Barkov15 Nov
          • Re: bzr commit into mysql-5.5-bugteam branch (bar:3121) Bug#57306Alexander Barkov15 Nov
            • Re: bzr commit into mysql-5.5-bugteam branch (bar:3121) Bug#57306Guilhem Bichot15 Nov
              • Re: bzr commit into mysql-5.5-bugteam branch (bar:3121) Bug#57306Alexander Barkov16 Nov
      • Re: bzr commit into mysql-5.5-bugteam branch (bar:3121) Bug#57306Guilhem Bichot15 Nov