Hi!
First - this should be done in 5.1, not earlier.
On Sep 26, gshchepa@stripped wrote:
> ChangeSet@stripped, 2007-09-26 18:22:25+02:00, gshchepa@stripped +5 -0
> Fixed bug #30946.
> When mysqldump was used with both --default-character-set and
> --tab options the database content in tab-separated data files
> was always in its internal character set and the
> --default-character-set switch was ignored.
>
> The mysqldump client uses the "SELECT * INTO OUTFILE" statement
> to provide the --tab command line switch logic. It is supposed
> to use the symmetrical "LOAD DATA INFILE" statement to load a
> resulting data file back to a database.
> By the definition the "LOAD DATA INFILE" statement interprets
> all fields in the file as having the same character set, regardless
> of the data types of the columns into which field values are loaded.
> The character_set_database system variable or the "CHARACTER
> SET" closure of the "LOAD DATA INFILE" statement is used to define
> that character set. However, the "SELECT ... INTO OUTFILE" statement
> dumps table fields in its internal character sets, so "SELECT INTO
> OUTFILE" and "LOAD DATA INFILE" statements are not symmetrical.
> To not to modify current server semantic (only to extend it) the
> "SELECT ... INTO OUTFILE" statement syntax was extended with the
> "CHARACTER SET" statement like in the "LOAD DATA INFILE".
>
> Documentation should be updated:
> [ http://dev.mysql.com/doc/refman/5.1/en/select.html ]
> SELECT ... INTO OUTFILE is the complement of LOAD DATA INFILE;
> the syntax for the export_options part of the statement consists of
> the same CHARACTER SET, FIELDS and LINES clauses that are used with
> +^^^^^^^^^^^^^^
> the LOAD DATA INFILE statement.
Changeset comment is too long. Make it shorter, for example:
============
Bug#30946 mysqldump silently ignores --default-character-set when used with --tab
Added CHARACTER SET clause to the SELECT ... INTO OUTFILE (to complement
the same clause in LOAD DATA INFILE). mysqldump is updated to use in in
--tab mode.
============
> diff -Nrup a/client/mysqldump.c b/client/mysqldump.c
> --- a/client/mysqldump.c 2007-08-31 13:53:54 +02:00
> +++ b/client/mysqldump.c 2007-09-26 18:21:24 +02:00
> @@ -2381,6 +2381,12 @@ static void dump_table(char *table, char
> dynstr_append_checked(&query_string, filename);
> dynstr_append_checked(&query_string, "'");
>
> + if (default_charset != mysql_universal_client_charset)
> + {
> + dynstr_append_checked(&query_string, " CHARACTER SET ");
> + dynstr_append_checked(&query_string, default_charset);
> + }
No. First, it doesn't matter whether default_charset is equal to
mysql_universal_client_charset or not. You always need to specify the
charset otherwise you won't be able to dump some tables (e.g. CREATE
TABLE t1 (a CHAR(10) CHARSET koi8r, b CHAR(10) CHARSET latin1)).
Second, you need to wrap the new clause in /*!version ... */ comment.
> if (fields_terminated || enclosed || opt_enclosed || escaped)
> dynstr_append_checked(&query_string, " FIELDS");
> diff -Nrup a/mysql-test/t/mysqldump.test b/mysql-test/t/mysqldump.test
> --- a/mysql-test/t/mysqldump.test 2007-07-21 01:49:17 +02:00
> +++ b/mysql-test/t/mysqldump.test 2007-09-26 18:21:32 +02:00
> @@ -1577,5 +1577,27 @@ SELECT * FROM t2;
> DROP TABLE t1,t2;
>
> --echo #
> +--echo # Bug #30946: mysqldump silently ignores --default-character-set
> +--echo # when used with --tab
> +--echo #
> +
> +CREATE TABLE t1 (a CHAR(10));
test on a table with columns in different charsets.
> +INSERT INTO t1 VALUES (_latin1 'ÀÁÂÃÄÅ ');
> +
> +--eval SELECT * INTO OUTFILE '$MYSQLTEST_VARDIR/tmp/t1_default.txt' FROM t1;
> +--exec cat $MYSQLTEST_VARDIR/tmp/t1_default.txt
> +--exec rm -f $MYSQLTEST_VARDIR/tmp/t1_default.txt
use built-in mysqltest commands
--cat_file
--remove_file
> +--eval SELECT * INTO OUTFILE '$MYSQLTEST_VARDIR/tmp/t1_utf8.txt' CHARACTER SET utf8
> FROM t1;
> +--exec cat $MYSQLTEST_VARDIR/tmp/t1_utf8.txt
> +--exec rm -f $MYSQLTEST_VARDIR/tmp/t1_utf8.txt
> +
> diff -Nrup a/sql/sql_class.cc b/sql/sql_class.cc
> --- a/sql/sql_class.cc 2007-08-02 02:39:10 +02:00
> +++ b/sql/sql_class.cc 2007-09-26 18:21:30 +02:00
> @@ -1243,8 +1243,15 @@ bool select_export::send_data(List<Item>
>
> DBUG_ENTER("select_export::send_data");
> char buff[MAX_FIELD_WIDTH],null_buff[2],space[MAX_FIELD_WIDTH];
> + char buff2[MAX_FIELD_WIDTH];
> bool space_inited=0;
> String tmp(buff,sizeof(buff),&my_charset_bin),*res;
> + /*
> + String to convert an item string value into a value with
> + the exchange->cs chanrset
> + */
> + String item2exch(buff2, sizeof(buff2),
> + exchange->cs ? exchange->cs : &my_charset_bin);
> tmp.length(0);
>
> if (unit->offset_limit_cnt)
> @@ -1264,6 +1271,19 @@ bool select_export::send_data(List<Item>
> {
> Item_result result_type=item->result_type();
> res=item->str_result(&tmp);
> + if (exchange->cs && exchange->cs != res->charset())
> + {
> + res= item->str_result(&tmp);
> + item2exch.length(0);
> + /*
> + Simple item2exch.append(*res) doesn't convert anything,
> + String::append(const char, uing32, CHARSET_INFO*) has been used instead
> + of that.
> + TODO: fix String::append(const String&).
> + */
> + item2exch.append(res->c_ptr(), res->length(), res->charset());
> + res= &item2exch;
No. First, you also need to convert enclosed, field_term, line_term,
line_start, escaped. And "\N", and "NULL". And whatever other strings
are written in the output by SELECT ... OUTFILE.
Also you need to replace character_set_client checks with exchange->cs
checks.
> + }
> if (res && (!exchange->opt_enclosed || result_type ==
> STRING_RESULT))
> {
> if (my_b_write(&cache,(byte*) exchange->enclosed->ptr(),
Regards / Mit vielen Grüssen,
Sergei
--
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Sergei Golubchik <serg@stripped>
/ /|_/ / // /\ \/ /_/ / /__ Principal Software Developer
/_/ /_/\_, /___/\___\_\___/ MySQL GmbH, Dachauer Str. 37, D-80335 München
<___/ Geschäftsführer: Kaj Arnö - HRB
München 162140