Actually I have an somewhat reproducible test case.
Problem is, it only works on my Server.
When I start MySQL it crashes almost exactly after one minute. This
happens every single time:
Oct 7 21:41:04 h1314631 mysqld_safe[18591]: started
Oct 7 21:41:04 h1314631 mysqld[18594]: 071007 21:41:04 [Warning] The
syntax for replication startup options is deprecated and will be removed
in MySQL 5.2. Please use 'CHANGE MASTER' instead.
Oct 7 21:41:06 h1314631 mysqld[18594]: 071007 21:41:06 InnoDB:
Started; log sequence number 0 1353332480
Oct 7 21:41:07 h1314631 mysqld[18594]: 071007 21:41:07 [Note]
/usr/sbin/mysqld: ready for connections.
Oct 7 21:41:07 h1314631 mysqld[18594]: Version:
'5.0.45-Debian_1~bpo.1-debug' socket: '/var/run/mysqld/mysqld.sock'
port: 3306 Debian etch distribution
Oct 7 21:41:36 h1314631 mysqld[18594]: 071007 21:41:36 [Warning]
Neither --relay-log nor --relay-log-index were used; so replication may
break when this MySQL server acts as a slave and has his hostname
changed!! Please use '--relay-log=/var/run/mysqld/mysqld-relay-bin' to
avoid this problem.
Oct 7 21:41:36 h1314631 mysqld[18594]: 071007 21:41:36 [ERROR] Failed
to open the relay log '/var/run/mysqld/mysqld-relay-bin.000045'
(relay_log_pos 2575)
Oct 7 21:41:36 h1314631 mysqld[18594]: 071007 21:41:36 [ERROR] Could
not find target log during relay log initialization
Oct 7 21:42:38 h1314631 mysqld_safe[18826]: Number of processes running
now: 0
Oct 7 21:42:38 h1314631 mysqld_safe[18832]: restarted
Oct 7 21:42:38 h1314631 mysqld[18836]: 071007 21:42:38 [Warning] The
syntax for replication startup options is deprecated and will be removed
in MySQL 5.2. Please use 'CHANGE MASTER' instead.
Oct 7 21:42:38 h1314631 mysqld[18836]: 071007 21:42:38 InnoDB:
Database was not shut down normally!
Oct 7 21:42:38 h1314631 mysqld[18836]: InnoDB: Starting crash recovery.
Oct 7 21:42:38 h1314631 mysqld[18836]: InnoDB: Reading tablespace
information from the .ibd files...
Oct 7 21:51:22 h1314631 mysqld[18836]: InnoDB: Restoring possible
half-written data pages from the doublewrite
Oct 7 21:51:22 h1314631 mysqld[18836]: InnoDB: buffer...
Oct 7 21:51:22 h1314631 mysqld[18836]: 071007 21:51:22 InnoDB:
Starting log scan based on checkpoint at
Oct 7 21:51:22 h1314631 mysqld[18836]: InnoDB: log sequence number 0
1353332894.
Oct 7 21:51:22 h1314631 mysqld[18836]: InnoDB: Doing recovery: scanned
up to log sequence number 0 1353333356
Oct 7 21:51:22 h1314631 mysqld[18836]: 071007 21:51:22 InnoDB:
Starting an apply batch of log records to the database...
Oct 7 21:51:23 h1314631 mysqld[18836]: InnoDB: Progress in percents: 40
41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64
65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88
89 90 91 92 93 94 95 96 97 98 99
Oct 7 21:51:23 h1314631 mysqld[18836]: InnoDB: Apply batch completed
Oct 7 21:51:23 h1314631 mysqld[18836]: 071007 21:51:23 InnoDB:
Started; log sequence number 0 1353333356
Oct 7 21:51:24 h1314631 mysqld[18836]: 071007 21:51:24 [ERROR] Error
reading master configuration
Oct 7 21:51:24 h1314631 mysqld[18836]: 071007 21:51:24 [ERROR] Failed
to initialize the master info structure
Oct 7 21:51:24 h1314631 mysqld[18836]: 071007 21:51:24 [Note]
/usr/sbin/mysqld: ready for connections.
Oct 7 21:51:24 h1314631 mysqld[18836]: Version:
'5.0.45-Debian_1~bpo.1-debug' socket: '/var/run/mysqld/mysqld.sock'
port: 3306 Debian etch distribution
Maybe this helps somebody?
Regards,
Samy
Samuel Vogel schrieb:
> I did some more research:
>
> I ran "xfs_check" on the partition and it didn't report any errors at
> all, so I'm guessing, that the hard drives are ok.
> I also recompiled MySQL with "--with-debug".
>
> How would I get some debug output about the crashes now?
>
> Thanks,
> Samy
>
> Samuel Vogel schrieb:
>> Hey guys,
>>
>> First of all: I have tried to post this Issue on forums.mysql.com,
>> but the Topic doesn't appear in the Forum. When I try to post again,
>> it tells me that it's a duplicate?!?!
>>
>> Now the real problem:
>> I have MySQL set up on two Servers with 7000 Users each and about the
>> same amount of databases.
>> Since two days ago, there is corruption going on in many databases.
>> And MySQL crashes every other hour or so.
>> Last night I updated from 5.0.32 to 5.0.45, but the problem is still
>> there.
>>
>> Here is what I see in syslog:
>> Oct 7 11:02:53 h1314631 mysqld[32490]: 071007 11:02:53 [ERROR]
>> /usr/sbin/mysqld: Table './10temulti@1-12/dzcp_counter_ips' is marked
>> as crashed and should be repaired
>> Oct 7 11:02:53 h1314631 mysqld[32490]: 071007 11:02:53 [ERROR]
>> /usr/sbin/mysqld: Table './10temulti@1-12/dzcp_counter_ips' is marked
>> as crashed and should be repaired
>> Oct 7 11:03:23 h1314631 mysqld_safe[32724]: Number of processes
>> running now: 0
>> Oct 7 11:03:23 h1314631 mysqld_safe[32729]: restarted
>> Oct 7 11:03:23 h1314631 mysqld[32734]: 071007 11:03:23 InnoDB:
>> Database was not shut down normally!
>> Oct 7 11:03:23 h1314631 mysqld[32734]: InnoDB: Starting crash recovery.
>> Oct 7 11:03:23 h1314631 mysqld[32734]: InnoDB: Reading tablespace
>> information from the .ibd files...
>> Oct 7 11:08:36 h1314631 mysqld[32734]: InnoDB: Restoring possible
>> half-written data pages from the doublewrite
>> Oct 7 11:08:36 h1314631 mysqld[32734]: InnoDB: buffer...
>> Oct 7 11:08:36 h1314631 mysqld[32734]: 071007 11:08:36 InnoDB:
>> Starting log scan based on checkpoint at
>> Oct 7 11:08:36 h1314631 mysqld[32734]: InnoDB: log sequence number 0
>> 1346871925.
>> Oct 7 11:08:36 h1314631 mysqld[32734]: InnoDB: Doing recovery:
>> scanned up to log sequence number 0 1346871925
>> Oct 7 11:08:36 h1314631 mysqld[32734]: 071007 11:08:36 InnoDB:
>> Started; log sequence number 0 1346871925
>> Oct 7 11:08:38 h1314631 mysqld[32734]: 071007 11:08:38 [Note]
>> /usr/sbin/mysqld: ready for connections.
>> Oct 7 11:08:38 h1314631 mysqld[32734]: Version:
>> '5.0.45-Debian_1~bpo.1' socket: '/var/run/mysqld/mysqld.sock' port:
>> 3306 Debian etch distribution
>>
>> As far as I understand, this means, that the MySQL Server crashed and
>> mysqld_safe noticed that and restarted it.
>> I also see much database corruption, but I somewhat run into a
>> chicken & egg problem here. I don't know if the databases corruption
>> appeared first and led into the crashes or if the crashes led to the
>> corruption.
>>
>> How can I investigate further into the problem? I don't think that a
>> perticular query is crashing the system, since all of our users just
>> run well known apps like phpBB etc.
>>
>> To clarify my situtaion I have just started a "myisamchk --silent
>> --force --update-state --recover" for all tables that on my system.
>> It gives me, among errors it could repair, to error messages, which I
>> could find much about with Google or the MySQL docs:
>>
>> myisamchk: error: 138 when opening MyISAM-table
>> '/data/mysql/.../transcache.MYI'
>>
>> and
>>
>> myisamchk: Unknown error 126
>> myisamchk: error: '/data/mysql/.../smf_membergroups.MYI' doesn't have
>> a correct index definition. You need to recreate it before you can do
>> a repair
>>
>>
>> What makes me wonder too, is that the database corruption is
>> happening on both Servers, but the MySQL crashes only appear on one
>> of them.
>>
>> I'm running Debian Etch and the MySQL data dir is on an XFS
>> partition. I have mounted the partition with "noatime".
>> How would I investigate a potential hard drive error?
>>
>> Can anybody shade some light on my situation?
>>
>> Regards,
>> Samy
>>
>