I made the changes you suggested and put the two lines before every 'return 1' in the
function 'exec_event()'.
When I run my sh-script to reload all tables from the master, only one table now causes
any trouble.
Table stats (Master):
Type: MyISAM Fixed raidtype=striped raid_chunks=4 raid_chunksize=4
Size: 485524542
Rows: 5220694
While loading this table, the sh-script prints to the console "Error 1051 at line 1:
Unknown table 'intra_stock'" (see below), and
the error.log contains:
Duplicate key for record at 111388425 against record 66183357
Quick-recover aborted; Run recovery without switch 'q' or with switch -qq
Warning: Retrying repair of: './stocks/intra_stock' without quick
This is followed by millions of duplicate-key messages.
Table stats (Slave) after loading:
Type: MyISAM Fixed
Size: 113369325
Rows: 192877
This looks like a problem with raided tables (or with any table that has extended create
options, such as raid or temporary).
How can I undo the raiding without dumping and recreating the table? (Something like
raid_type=off? Your site is down at the moment, so I can't look in the manual.)
Now, when I run 'drop table intra_stock' on the slave, it says "ERROR 1051: Unknown table
'intra_stock'", but the table is
dropped anyway. (The sh-script also does a drop before the load, which causes the same error.)
PS:
Could you put the two lines before every 'return 1' in the function 'exec_event()' in the
next releases?
It is only a few more lines, and they do not slow things down (normally the slave dies
afterwards anyway), but then I would not have to patch every new
release ;)
Regards,
Christian "Maverick" Rabe
IT-Development
wallstreet:online AG
------------------------------------------
http://www.wallstreet-online.de
Tel.: 03 37 01 5 29 16
Fax.: 03 37 01 5 29 19
----- Original Message -----
From: sasha@stripped
To: Christian Rabe
Cc: internals@stripped
Sent: Friday, October 27, 2000 9:09 PM
Subject: Re: Problems wit LOAD TABLE xyz FROM MASTER
Replication-wise, 3.23.27 is identical to 3.23.26, so LOAD TABLE FROM MASTER
should work the same. Do you have a way of repeating the problem?
Another possibility (I would have to check the code to be sure) is that your
change to ignore the errors could leave some state variables in an inconsistent
state. OK, I've taken a look at the code and found a couple of problems:
* each time you hit the slave error there was a small memory leak - about the
size of the query + a bit of overhead. With the slave thread aborting on error,
this was not as bad, although still a bug. If it did not abort, then you leak
about 100-200 bytes for every bad query. I have now fixed this.
* more serious problem - every time you hit the bad query, the glob_mi structure,
which keeps track of the position of the slave thread in the master (you can
see most of it with SHOW SLAVE STATUS), is not being updated. This is correct
behaviour - if the query could not be run, we exit the slave, notify the sysadmin,
and when he fixes the conflict and restarts the slave thread, we will try this
query again. However, if you really want to ignore the error, you need to do
    mi->inc_pos(event_len);
    flush_master_info(mi);
in exec_event() before you return in the case when an error is detected as well -
otherwise, when you restart the slave, it will start reading from the wrong
position. This is probably why you had that strange error with the slave
deciding to re-do one replication log - the offset was wrong, and it missed the
log rotation event.
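
To make the position bookkeeping concrete, here is a minimal sketch of the pattern, not
the actual server code. MasterInfo, Event, the ignore_errors flag and the body of
flush_master_info() are hypothetical stand-ins; only the names inc_pos(),
flush_master_info() and exec_event() come from the discussion above.

```cpp
#include <cstdint>

// Hypothetical stand-in for the slave's master-info structure (glob_mi above).
// Only the log-position bookkeeping is modelled.
struct MasterInfo {
    std::uint64_t pos = 0;      // current offset in the master's log
    std::uint64_t flushed = 0;  // last offset saved by flush_master_info()
    void inc_pos(std::uint32_t event_len) { pos += event_len; }
};

// Hypothetical stub: stands in for persisting the position so that it
// survives a slave restart.
void flush_master_info(MasterInfo* mi) { mi->flushed = mi->pos; }

// Hypothetical event: its length in the log and whether applying it fails.
struct Event {
    std::uint32_t event_len;
    bool fails;
};

// Returns 0 on success and 1 on error, like the 'return 1' paths discussed above.
int exec_event(MasterInfo* mi, const Event& ev, bool ignore_errors) {
    if (ev.fails) {
        if (ignore_errors) {
            // Skip over the bad event so a restart does not resume at a stale offset.
            mi->inc_pos(ev.event_len);
            flush_master_info(mi);
        }
        // Without ignore_errors the position stays put, so the same event is
        // retried after the conflict is fixed and the slave is restarted.
        return 1;
    }
    mi->inc_pos(ev.event_len);
    flush_master_info(mi);
    return 0;
}
```

With ignore_errors set, a failed 40-byte event still moves the position forward by 40;
without it, the position is left unchanged and the event is read again on restart.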
--
MySQL Development Team
__ ___ ___ ____ __
/ |/ /_ __/ __/ __ \/ / Sasha Pachev <sasha@stripped>
/ /|_/ / // /\ \/ /_/ / /__ MySQL AB, http://www.mysql.com/
/_/ /_/\_, /___/\___\_\___/ Provo, Utah, USA
<___/