List:General Discussion« Previous MessageNext Message »
From:Bruce Dembecki Date:July 18 2005 5:35pm
Subject:Re: migrate from 3.x to 4.1 character set problem
View as plain text  
> hi,
> we've got an old mysql-3.23.58 and a new mysql-4.1.10a and we'de  
> like to
> migrate our data, but it doesn't seems to be so easy:-(
> out old server has a latin2 database. after we dump it and try tp  
> import
> into the new ones we always got errors or the spical accented  
> hungarian
> characters are getting wrong.
> - what is the prefered (and working) way to migrate from the old to  
> the
> new?
> - how can define the new char sets?
> we try these variations (and manualy create the database with defult
> char set and latin2):
> 1. mysqldump --opt -p xxx > xxx.sql
>     mysql xxx < xxx.sql
>
> 2. mysqldump --opt --default-character-set=latin2 -p xxx > xxx.sql
>     mysql --default-character-set=latin2 xxx < xxx.sql
>
> 3. mysqldump --opt -p xxx > xxx.sql
>     iconv -f ISO_8859-2 -t UTF-8 -o xxx2.sql xxx.sql
>     mysql xxx < xxx2.sql
>
> and many more combination, try to read all docs, but can't find any
> solutions.
> another question what is the collations latin2_hungarian_ci contains?
> how can i interpret that xml file? eg. a is equal to á or not? is  
> there
> any way to find out how is the buildin contains defined? or any  
> description?
> thank you for your help in advance.
> yours.

Generally speaking you need to define the character set for each  
column or table in your 4.1 database, or set a default character set  
for the database or for the server, this is independent of the  
default character set used by the clients... Then you need your  
clients to connect to the database using the appropriate character  
set... while the examples above seem correct, there are some  
opportunities for errors to occur.

Firstly export the data using the mysql tools provided with  
3.23.58... eg make sure you use the mysqldump that comes with the  
3.23.58 mysql binary - chances are that is will be mysqldump 3.23.58.  
I expect that version off mysqldump will not support the --default- 
character-set flag and should have thrown an error if you try to give  
it that flag... It's important that you export the 3.23.58 data the  
way it is, and let the 4.1 tools deal with putting it into the new  
format appropriately. using mysqldump from 4.1 may not give you  
exactly the same results, so you should avoid that. Also for what it  
is worth you may want to try doing a dump slightly differently... we  
always use --tab=/var/tmp/database or some such thing and that  
creates a series of files in the folder you specify, one .sql file  
for each table containing just the create table statement, and  
one .txt file for each table containing just the data for each table  
in tab delimited format. It means your import process will be  
slightly different, but it's faster, and because we have done it  
regularly it's more likely to handle the data conversion.

Next when doing the import make sure you use mysql tools that match  
the database you are installing. Here you will need to specify the  
default character set for the clients, they will understand and use  
that when speaking to the database. Here is the process we use to do  
the export from 4.0 and import into 4.1, there should be no great  
difference in how 3.23.58 and 4.0 handle the character sets so the  
results should be much the same. We use UTF8, and our 4.0 databases  
had no special character settings, so it was stored in the database  
as latin1. On the original server using 4.0.n server and tools to  
match we run this:

mysqldump --tab=/var/tmp/database database

You should be able to do the same thing provided you use mysqldump  
3.23.58, again make no allowances for character set in the dump  
process, you just want the data dumped to disk the same way it is  
stored now.

Then we move the directory /var/tmp/database to /var/tmp on the new  
server with 4.1 running... note this has the 4.1.n server AND the  
4.1.n tools (such as mysql, mysqldump, mysqlimport and so on).

Finally we go ahead and import our data into the server using this  
sequence of commands (we use a shell script, so that's what you get  
here). Call the shell script by giving it the database name as a flag  
(eg ./import database) - watch for differences in line breaks caused  
by email clients here, there are three lines of commands after  
setting DB=$1.

#!/bin/sh
#
# LiveWorld's MySQL Import Script
# Use for converting 4.0 databases to 4.1 UTF8 databases
# Suitable for LiveWorld Servers only, use at your own risk
#

DB=$1

mysql -e "CREATE DATABASE $DB default character set utf8;"
cat /var/tmp/$DB/*sql | mysql --socket=/tmp/mysql.sock $DB
mysqlimport --default-character-set=utf8 $DB /var/tmp/$DB/*txt

Obviously you are going from latin2 to latin2 so it should be a  
little easier for you than it was for us... and you'll want to make  
some changes in the script compared to our utf8 stuff (of course you  
may want to just go with utf8 anyway, should handle most anything you  
want to throw at it that way, our databases run in 30 languages).

So be careful to match your tools with your server version and try to  
dump to tab delimited, it'll probably work for you. I know it's  
worked for us in moving to utf8... and our setup wasn't as clean as  
yours sounds to be.

Best Regards, Bruce
Thread
migrate from 3.x to 4.1 character set problemFarkas Levente18 Jul
Re: migrate from 3.x to 4.1 character set problemBruce Dembecki18 Jul