List: Internals
From: Paul DuBois
Date: October 18 2002 4:20pm
Subject: Question about CONVERT(str,charset_to,charset_from)
I've been puzzling over this patch, which implements a form of the
CONVERT() function.  I can see that this can be useful for specifying
the destination character set as a string expression rather than as
an unquoted character set name.  But I'm wondering why the second argument
is necessary at all.  Strings already have a charset, so why do you have
to specify what it is?


At 19:11 +0400 3/29/02, bar@stripped wrote:
>Below is the list of changes that have just been committed into a
>4.1 repository of bar. When bar does a push, they will be propagated to
>the main repository and within 24 hours after the push to the public 
>repository.
>For information on how to access the public repository
>see http://www.mysql.com/doc/I/n/Installing_source_tree.html
>
>ChangeSet
>   1.1178 02/03/29 19:11:06 bar@stripped +3 -0
>   Now this syntax works too:  CONVERT(string,charset_to,charset_from)
>   where charset_to and charset_from are expressions. For example:
>
>   CONVERT('test','latin2','cp1250')
>
>   sql/sql_yacc.yy
>     1.155 02/03/29 19:11:05 bar@stripped +4 -0
>     Now this syntax works too:  CONVERT(string,charset_to,charset_from)
>
>   sql/item_strfunc.h
>     1.18 02/03/29 19:11:04 bar@stripped +10 -0
>     Now this syntax works too:  CONVERT(string,charset_to,charset_from)
>
>   sql/item_strfunc.cc
>     1.42 02/03/29 19:11:04 bar@stripped +73 -0
>     Now this syntax works too:  CONVERT(string,charset_to,charset_from)

Also, it appears to me that the names of the second and third arguments
in the preceding descriptions are backward, because the function result has
the charset of the third argument, not the second:

mysql> select charset(convert('abc','latin1','utf8'));
+-----------------------------------------+
| charset(convert('abc','latin1','utf8')) |
+-----------------------------------------+
| utf8                                    |
+-----------------------------------------+
mysql> select charset(convert('abc','utf8','latin1'));
+-----------------------------------------+
| charset(convert('abc','utf8','latin1')) |
+-----------------------------------------+
| latin1                                  |
+-----------------------------------------+

>
># This is a BitKeeper patch.  What follows are the unified diffs for the
># set of deltas contained in the patch.  The rest of the patch, the part
># that BitKeeper cares about, is below these diffs.
># User:	bar
># Host:	gw.udmsearch.izhnet.ru
># Root:	/usr/home/bar/mysql-4.1
>
>--- 1.41/sql/item_strfunc.cc	Fri Mar 29 18:22:18 2002
>+++ 1.42/sql/item_strfunc.cc	Fri Mar 29 19:11:04 2002
>@@ -1843,6 +1843,79 @@
>    /* BAR TODO: What to do here??? */
>  }
>
>+
>+String *Item_func_conv_charset3::val_str(String *str)
>+{
>+  my_wc_t wc;
>+  int cnvres;
>+  const uchar *s, *se;
>+  uchar *d, *d0, *de;
>+  uint dmaxlen;
>+  String *arg= args[0]->val_str(str);
>+  String *to_cs= args[1]->val_str(str);
>+  String *from_cs= args[2]->val_str(str);
>+  CHARSET_INFO *from_charset;
>+  CHARSET_INFO *to_charset;
>+ 
>+  if (!arg     || args[0]->null_value ||
>+      !to_cs   || args[1]->null_value ||
>+      !from_cs || args[2]->null_value ||
>+      !(from_charset=find_compiled_charset_by_name(from_cs->ptr())) ||
>+      !(to_charset=find_compiled_charset_by_name(to_cs->ptr())))
>+  {
>+    null_value=1;
>+    return 0;
>+  }
>+
>+  s=(const uchar*)arg->ptr();
>+  se=s+arg->length();
>+ 
>+  dmaxlen=arg->length()*(to_charset->mbmaxlen?to_charset->mbmaxlen:1)+1;
>+  str->alloc(dmaxlen);
>+  d0=d=(unsigned char*)str->ptr();
>+  de=d+dmaxlen;
>+ 
>+  while( s < se && d < de){
>+
>+    cnvres=from_charset->mb_wc(from_charset,&wc,s,se);
>+    if (cnvres>0)
>+    {
>+      s+=cnvres;
>+    }
>+    else if (cnvres==MY_CS_ILSEQ)
>+    {
>+      s++;
>+      wc='?';
>+    }
>+    else
>+      break;
>+
>+outp:
>+    cnvres=to_charset->wc_mb(to_charset,wc,d,de);
>+    if (cnvres>0)
>+    {
>+      d+=cnvres;
>+    }
>+    else if (cnvres==MY_CS_ILUNI && wc!='?')
>+    {
>+        wc='?';
>+        goto outp;
>+    }
>+    else
>+      break;
>+  };
>+ 
>+  str->length((uint) (d-d0));
>+  str->set_charset(to_charset);
>+  return str;
>+}
>+
>+void Item_func_conv_charset3::fix_length_and_dec()
>+{
>+  /* BAR TODO: What to do here??? */
>+}
>+
>+
>  String *Item_func_hex::val_str(String *str)
>  {
>    if (args[0]->result_type() != STRING_RESULT)
>
>--- 1.17/sql/item_strfunc.h	Fri Mar 29 18:22:19 2002
>+++ 1.18/sql/item_strfunc.h	Fri Mar 29 19:11:04 2002
>@@ -489,6 +489,16 @@
>    const char *func_name() const { return "conv_charset"; }
>  };
>
>+class Item_func_conv_charset3 :public Item_str_func
>+{
>+public:
>+  Item_func_conv_charset3(Item *arg1,Item *arg2,Item *arg3)
>+    :Item_str_func(arg1,arg2,arg3) {}
>+  String *val_str(String *);
>+  void fix_length_and_dec();
>+  const char *func_name() const { return "conv_charset3"; }
>+};
>+
>
>  /*******************************************************
>  Spatial functions
>
>--- 1.154/sql/sql_yacc.yy	Fri Mar 29 18:22:20 2002
>+++ 1.155/sql/sql_yacc.yy	Fri Mar 29 19:11:05 2002
>@@ -1664,6 +1664,10 @@
>  	    }
>  	    $$= new Item_func_conv_charset($3,cs);
>  	  }
>+	| CONVERT_SYM '(' expr ',' expr ',' expr ')'
>+	  {
>+	    $$= new Item_func_conv_charset3($3,$5,$7);
>+	  }
>  	| FUNC_ARG0 '(' ')'
>  	  { $$= ((Item*(*)(void))($1.symbol->create_func))();}
>  	| FUNC_ARG1 '(' expr ')'
>
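As a reading aid, the conversion loop in the quoted val_str() can be
sketched outside the server roughly as follows.  This is a minimal sketch,
not the MySQL implementation: WideChar, Decoder, Encoder, convert_sketch,
and the four toy codec functions are hypothetical stand-ins for the mb_wc
and wc_mb callbacks of CHARSET_INFO.  Unlike the patch, the sketch treats
every decode failure as an illegal sequence, and it grows a std::string
instead of filling a preallocated buffer.

```cpp
#include <cstdint>
#include <string>

using WideChar = std::uint32_t;

// Decoders return the number of bytes consumed, or 0 on an illegal sequence.
using Decoder = int (*)(const unsigned char *s, const unsigned char *end, WideChar *wc);
// Encoders append to *out and return true, or return false if wc is unrepresentable.
using Encoder = bool (*)(WideChar wc, std::string *out);

// Toy Latin-1 decoder: each byte is its own code point.
int latin1_decode(const unsigned char *s, const unsigned char *end, WideChar *wc) {
  if (s >= end) return 0;
  *wc = *s;
  return 1;
}

// Toy ASCII decoder: bytes >= 0x80 are illegal.
int ascii_decode(const unsigned char *s, const unsigned char *end, WideChar *wc) {
  if (s >= end || *s >= 0x80) return 0;
  *wc = *s;
  return 1;
}

// Toy ASCII encoder: only code points below 0x80 are representable.
bool ascii_encode(WideChar wc, std::string *out) {
  if (wc >= 0x80) return false;
  out->push_back(static_cast<char>(wc));
  return true;
}

// Toy UTF-8 encoder (surrogates are not rejected, for brevity).
bool utf8_encode(WideChar wc, std::string *out) {
  if (wc < 0x80) {
    out->push_back(static_cast<char>(wc));
  } else if (wc < 0x800) {
    out->push_back(static_cast<char>(0xC0 | (wc >> 6)));
    out->push_back(static_cast<char>(0x80 | (wc & 0x3F)));
  } else if (wc < 0x10000) {
    out->push_back(static_cast<char>(0xE0 | (wc >> 12)));
    out->push_back(static_cast<char>(0x80 | ((wc >> 6) & 0x3F)));
    out->push_back(static_cast<char>(0x80 | (wc & 0x3F)));
  } else if (wc < 0x110000) {
    out->push_back(static_cast<char>(0xF0 | (wc >> 18)));
    out->push_back(static_cast<char>(0x80 | ((wc >> 12) & 0x3F)));
    out->push_back(static_cast<char>(0x80 | ((wc >> 6) & 0x3F)));
    out->push_back(static_cast<char>(0x80 | (wc & 0x3F)));
  } else {
    return false;
  }
  return true;
}

// Two-step conversion: source bytes -> wide character -> destination bytes.
std::string convert_sketch(const std::string &in, Decoder from, Encoder to) {
  std::string out;
  const unsigned char *s = reinterpret_cast<const unsigned char *>(in.data());
  const unsigned char *end = s + in.size();
  while (s < end) {
    WideChar wc;
    int consumed = from(s, end, &wc);
    if (consumed > 0) {
      s += consumed;
    } else {
      ++s;       // skip the offending byte, as the patch does for MY_CS_ILSEQ
      wc = '?';
    }
    // If the target cannot represent wc, retry once with '?'; give up if that fails too.
    if (!to(wc, &out) && !to('?', &out)) break;
  }
  return out;
}
```

For instance, convert_sketch("\xE9", latin1_decode, utf8_encode) yields the
two bytes C3 A9, while passing ascii_encode as the target yields "?".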
