Can you try passing the dump file through hexdump or some binary editor to
see if the data is there? Most text editors will treat 0x00 as end of string,
and this is most likely what is causing the problem.
Additionally, you can try running the import with --default-character-set=utf8
in case the default charset is something else:
mysql --default-character-set=utf8 --user=me test_database < dump_file
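
If hexdump is not to hand, the same check can be sketched in a few lines of
Python. This is only an illustration: the names find_suspect_bytes,
is_valid_utf8 and report are made up for this example. It lists NUL bytes and
bytes with the top bit set, and says whether the file as a whole decodes as
UTF-8.

```python
def find_suspect_bytes(data):
    """Return (offset, byte) pairs for NUL bytes and bytes with the top bit set."""
    return [(i, b) for i, b in enumerate(data) if b == 0x00 or b >= 0x80]


def is_valid_utf8(data):
    """Return True if the whole byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def report(path, limit=20):
    """Print a short summary for the dump file at `path` (hypothetical helper)."""
    # Read in binary mode so nothing is re-encoded on the way in.
    with open(path, "rb") as f:
        data = f.read()
    print("valid UTF-8:", is_valid_utf8(data))
    for offset, byte in find_suspect_bytes(data)[:limit]:
        print(f"offset {offset}: 0x{byte:02x}")
```

Calling report("dump_file") would then show the first few suspicious offsets,
which can be compared against where the field gets truncated.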
Thanks for the reply, and I apologize because I expect I've broken
threading. The list isn't mailing the posts to me, so I've nothing to
reply to. I've had to cut and paste from the web archive...
>>>> 2014/01/06 12:18 +0000, Dave Howorth >>>>
>> Everything appears to work except that text fields containing a
>> Unicode non-breaking space (0x00A0) are truncated just before that
>> character. I can see the field in the dump file and it looks OK, but
>> it doesn't all make it into the new database.
> Well, there are too many aspects to this, but the first is the
> character set that "mysql" expects for input. If, say, it is USASCII
> (note that between the character set that "mysql" takes for input and
> the character set in the table no association is needful), the "nbsp"
> is out of range.
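
To make the "out of range" point concrete, here is a small Python sketch
showing how U+00A0 comes out under three encodings. Python's codecs stand in
for MySQL's character sets here, which is only an approximation, and
try_encode is a helper invented for this example.

```python
NBSP = "\u00a0"  # Unicode non-breaking space


def try_encode(text, charset):
    """Return the encoded bytes as hex, or None if the charset cannot represent the text."""
    try:
        return text.encode(charset).hex()
    except UnicodeEncodeError:
        return None


for charset in ("ascii", "latin-1", "utf-8"):
    # ascii has no code for U+00A0; latin-1 uses one byte; utf-8 uses two.
    print(charset, try_encode(NBSP, charset))
```

So the same character is one byte (a0) in a single-byte Latin charset, two
bytes (c2a0) in UTF-8, and has no representation at all in US-ASCII.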
Hmm, is there any way to tell what character set mysql expects, or
better yet to tell it what to read? Or can I tell mysqldump to encode
its output differently?
(I promise to RTFM, but want to get this question out there whilst I'm
> (It is, of course, not nice if "mysqldump" yields an output that
> "mysql" cannot read.)
Indeed; I'd go so far as to call that a bug. But that does seem to be
> Try entering it with some escape-sequence (this one is based on the
> original SQL with features from PL1, not from C, which MySQL supports
> if 'ANSI' is in "sql_mode"):
I don't understand the 'sql_mode', though I expect I can look that up
too. But I did try these:
> 'some text ... ' || X'A0' || ' ... more text ...'
causes the contents of the field to be '1'.
> or (slightly less PL1)
> CONCAT('some text ... ', X'A0', ' ... more text ...')
Produces the same effect as embedding the character directly, i.e. the
value of the field is truncated just before the problem character.
However, substituting for the character with the string ' ' does
allow mysql to read past it. I've now discovered that it also blows up
on some other characters with the top bit set such as 0x91. What's
strange about that is that they used to work. So my first thought now is
that something has changed recently. Perhaps an update to one of the
servers or clients involved? I don't remember changing anything in my
code, but I can't be absolutely sure.
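
One possible reading of this, and it is only a guess, is that the data holds
single-byte (cp1252-style) characters while something in the chain now expects
UTF-8. A lone 0xA0 or 0x91 byte is not valid UTF-8, whereas in cp1252 they are
the non-breaking space and a left single quotation mark. The Python sketch
below illustrates that under those assumptions; decode_or_none and
cp1252_to_utf8 are hypothetical helper names, and Python's cp1252 codec is
used as a stand-in for whatever the data was really written in.

```python
def decode_or_none(raw, charset):
    """Decode raw bytes with the given charset, or return None if they are invalid."""
    try:
        return raw.decode(charset)
    except UnicodeDecodeError:
        return None


def cp1252_to_utf8(raw):
    """Re-encode cp1252 bytes as UTF-8 (raises on the few bytes cp1252 leaves undefined)."""
    return raw.decode("cp1252").encode("utf-8")


for raw in (b"\xa0", b"\x91"):
    print(
        raw.hex(),
        "| as utf-8:", decode_or_none(raw, "utf-8"),
        "| as cp1252:", repr(decode_or_none(raw, "cp1252")),
        "| re-encoded as utf-8:", cp1252_to_utf8(raw).hex(),
    )
```

If that guess is right, the UTF-8 form of the hex literal for the non-breaking
space would be X'C2A0' rather than X'A0'.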