List:General Discussion« Previous MessageNext Message »
From:Samuel Vogel Date:October 7 2007 8:16pm
Subject:Re: MySQL Server crashing & database corruption
View as plain text  
Actually I have an somewhat reproducible test case.
Problem is, it only works on my Server.

When I start MySQL it crashes almost exactly after one minute. This 
happens every single time:

Oct  7 21:41:04 h1314631 mysqld_safe[18591]: started
Oct  7 21:41:04 h1314631 mysqld[18594]: 071007 21:41:04 [Warning] The 
syntax for replication startup options is deprecated and will be removed 
in MySQL 5.2. Please use 'CHANGE MASTER' instead.
Oct  7 21:41:06 h1314631 mysqld[18594]: 071007 21:41:06  InnoDB: 
Started; log sequence number 0 1353332480
Oct  7 21:41:07 h1314631 mysqld[18594]: 071007 21:41:07 [Note] 
/usr/sbin/mysqld: ready for connections.
Oct  7 21:41:07 h1314631 mysqld[18594]: Version: 
'5.0.45-Debian_1~bpo.1-debug'  socket: '/var/run/mysqld/mysqld.sock'  
port: 3306  Debian etch distribution
Oct  7 21:41:36 h1314631 mysqld[18594]: 071007 21:41:36 [Warning] 
Neither --relay-log nor --relay-log-index were used; so replication may 
break when this MySQL server acts as a slave and has his hostname 
changed!! Please use '--relay-log=/var/run/mysqld/mysqld-relay-bin' to 
avoid this problem.
Oct  7 21:41:36 h1314631 mysqld[18594]: 071007 21:41:36 [ERROR] Failed 
to open the relay log '/var/run/mysqld/mysqld-relay-bin.000045' 
(relay_log_pos 2575)
Oct  7 21:41:36 h1314631 mysqld[18594]: 071007 21:41:36 [ERROR] Could 
not find target log during relay log initialization
Oct  7 21:42:38 h1314631 mysqld_safe[18826]: Number of processes running 
now: 0
Oct  7 21:42:38 h1314631 mysqld_safe[18832]: restarted
Oct  7 21:42:38 h1314631 mysqld[18836]: 071007 21:42:38 [Warning] The 
syntax for replication startup options is deprecated and will be removed 
in MySQL 5.2. Please use 'CHANGE MASTER' instead.
Oct  7 21:42:38 h1314631 mysqld[18836]: 071007 21:42:38  InnoDB: 
Database was not shut down normally!
Oct  7 21:42:38 h1314631 mysqld[18836]: InnoDB: Starting crash recovery.
Oct  7 21:42:38 h1314631 mysqld[18836]: InnoDB: Reading tablespace 
information from the .ibd files...
Oct  7 21:51:22 h1314631 mysqld[18836]: InnoDB: Restoring possible 
half-written data pages from the doublewrite
Oct  7 21:51:22 h1314631 mysqld[18836]: InnoDB: buffer...
Oct  7 21:51:22 h1314631 mysqld[18836]: 071007 21:51:22  InnoDB: 
Starting log scan based on checkpoint at
Oct  7 21:51:22 h1314631 mysqld[18836]: InnoDB: log sequence number 0 
1353332894.
Oct  7 21:51:22 h1314631 mysqld[18836]: InnoDB: Doing recovery: scanned 
up to log sequence number 0 1353333356
Oct  7 21:51:22 h1314631 mysqld[18836]: 071007 21:51:22  InnoDB: 
Starting an apply batch of log records to the database...
Oct  7 21:51:23 h1314631 mysqld[18836]: InnoDB: Progress in percents: 40 
41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 
65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 
89 90 91 92 93 94 95 96 97 98 99
Oct  7 21:51:23 h1314631 mysqld[18836]: InnoDB: Apply batch completed
Oct  7 21:51:23 h1314631 mysqld[18836]: 071007 21:51:23  InnoDB: 
Started; log sequence number 0 1353333356
Oct  7 21:51:24 h1314631 mysqld[18836]: 071007 21:51:24 [ERROR] Error 
reading master configuration
Oct  7 21:51:24 h1314631 mysqld[18836]: 071007 21:51:24 [ERROR] Failed 
to initialize the master info structure
Oct  7 21:51:24 h1314631 mysqld[18836]: 071007 21:51:24 [Note] 
/usr/sbin/mysqld: ready for connections.
Oct  7 21:51:24 h1314631 mysqld[18836]: Version: 
'5.0.45-Debian_1~bpo.1-debug'  socket: '/var/run/mysqld/mysqld.sock'  
port: 3306  Debian etch distribution

Maybe this helps somebody?

Regards,
Samy


Samuel Vogel schrieb:
> I did some more research:
>
> I ran "xfs_check" on the partition and it didn't report any errors at 
> all, so I'm guessing, that the hard drives are ok.
> I also recompiled MySQL with "--with-debug".
>
> How would I get some debug output about the crashes now?
>
> Thanks,
> Samy
>
> Samuel Vogel schrieb:
>> Hey guys,
>>
>> First of all: I have tried to post this Issue on forums.mysql.com, 
>> but the Topic doesn't appear in the Forum. When I try to post again, 
>> it tells me that it's a duplicate?!?!
>>
>> Now the real problem:
>> I have MySQL set up on two Servers with 7000 Users each and about the 
>> same amount of databases.
>> Since two days ago, there is corruption going on in many databases. 
>> And MySQL crashes every other hour or so.
>> Last night I updated from 5.0.32 to 5.0.45, but the problem is still 
>> there.
>>
>> Here is what I see in syslog:
>> Oct  7 11:02:53 h1314631 mysqld[32490]: 071007 11:02:53 [ERROR] 
>> /usr/sbin/mysqld: Table './10temulti@1-12/dzcp_counter_ips' is marked 
>> as crashed and should be repaired
>> Oct  7 11:02:53 h1314631 mysqld[32490]: 071007 11:02:53 [ERROR] 
>> /usr/sbin/mysqld: Table './10temulti@1-12/dzcp_counter_ips' is marked 
>> as crashed and should be repaired
>> Oct  7 11:03:23 h1314631 mysqld_safe[32724]: Number of processes 
>> running now: 0
>> Oct  7 11:03:23 h1314631 mysqld_safe[32729]: restarted
>> Oct  7 11:03:23 h1314631 mysqld[32734]: 071007 11:03:23  InnoDB: 
>> Database was not shut down normally!
>> Oct  7 11:03:23 h1314631 mysqld[32734]: InnoDB: Starting crash recovery.
>> Oct  7 11:03:23 h1314631 mysqld[32734]: InnoDB: Reading tablespace 
>> information from the .ibd files...
>> Oct  7 11:08:36 h1314631 mysqld[32734]: InnoDB: Restoring possible 
>> half-written data pages from the doublewrite
>> Oct  7 11:08:36 h1314631 mysqld[32734]: InnoDB: buffer...
>> Oct  7 11:08:36 h1314631 mysqld[32734]: 071007 11:08:36  InnoDB: 
>> Starting log scan based on checkpoint at
>> Oct  7 11:08:36 h1314631 mysqld[32734]: InnoDB: log sequence number 0 
>> 1346871925.
>> Oct  7 11:08:36 h1314631 mysqld[32734]: InnoDB: Doing recovery: 
>> scanned up to log sequence number 0 1346871925
>> Oct  7 11:08:36 h1314631 mysqld[32734]: 071007 11:08:36  InnoDB: 
>> Started; log sequence number 0 1346871925
>> Oct  7 11:08:38 h1314631 mysqld[32734]: 071007 11:08:38 [Note] 
>> /usr/sbin/mysqld: ready for connections.
>> Oct  7 11:08:38 h1314631 mysqld[32734]: Version: 
>> '5.0.45-Debian_1~bpo.1'  socket: '/var/run/mysqld/mysqld.sock'  port: 
>> 3306  Debian etch distribution
>>
>> As far as I understand, this means, that the MySQL Server crashed and 
>> mysqld_safe noticed that and restarted it.
>> I also see much database corruption, but I somewhat run into a 
>> chicken & egg problem here. I don't know if the databases corruption 
>> appeared first and led into the crashes or if the crashes led to the 
>> corruption.
>>
>> How can I investigate further into the problem? I don't think that a 
>> perticular query is crashing the system, since all of our users just 
>> run well known apps like phpBB etc.
>>
>> To clarify my situtaion I have just started a "myisamchk --silent 
>> --force --update-state --recover" for all tables that on my system.
>> It gives me, among errors it could repair, to error messages, which I 
>> could find much about with Google or the MySQL docs:
>>
>> myisamchk: error: 138 when opening MyISAM-table 
>> '/data/mysql/.../transcache.MYI'
>>
>> and
>>
>> myisamchk: Unknown error 126
>> myisamchk: error: '/data/mysql/.../smf_membergroups.MYI' doesn't have 
>> a correct index definition. You need to recreate it before you can do 
>> a repair
>>
>>
>> What makes me wonder too, is that the database corruption is 
>> happening on both Servers, but the MySQL crashes only appear on one 
>> of them.
>>
>> I'm running Debian Etch and the MySQL data dir is on an XFS 
>> partition. I have mounted the partition with "noatime".
>> How would I investigate a potential hard drive error?
>>
>> Can anybody shade some light on my situation?
>>
>> Regards,
>> Samy
>>
>
Thread
how to select second records in a groupSteve Kiehl6 Oct
  • Re: how to select second records in a groupBaron Schwartz6 Oct
  • MySQL Server crashing & database corruptionSamuel Vogel7 Oct
    • Re: MySQL Server crashing & database corruptionSamuel Vogel7 Oct
      • Re: MySQL Server crashing & database corruptionSamuel Vogel7 Oct
        • Re: MySQL Server crashing & database corruptionSamuel Vogel10 Oct