Frank,
We compiled 4.0.20 from source as a 64-bit build, and we also tried the
32-bit binaries.
The problem was tracked down at about 1am - it was the kernel (or the
SCSI drivers). We put a 3Ware SATA RAID-5 card in, and all the crashes
went away.
There are 64-bit binaries, but we had some problems with them (the guy
who initially tried them can't remember the exact issue). You need to
add the -fPIC flag to get the source to compile for the Opteron.
The PIC flag is for position-independent code. Google it with Opteron
and you'll see a bunch of posts on it.
David
Dr. Frank Ullrich wrote:
> David,
>
>
> David Griffiths wrote:
>
>> We are in the process of setting up a new MySQL server. It's a
>> dual-Opteron (Tyan Thunder K8S motherboard) with 6 gig of DDR333 RAM
>> (registered) and an LSI SCSI card with 6 SCSI drives (5 in a RAID-5
>> array, with one hot-spare) running SuSE Enterprise 8.1 (64-bit).
>>
>> I loaded all our data (about 2 gig) into the database back on
>> Tuesday, and created the indexes without issue, as a test to see how
>> long it would take.
>>
>> Tonight, we were going to cut over to this new machine. I was setting
>> up data as a test run, and started coming across "Database page
>> corruption on disk or a failed file read of page" errors.
>>
>> At first, we were using MySQL 4.0.20 64-bit, compiled from source by
>> us (the -fPIC option needs to be included in the Makefile, and for
>> some reason isn't in the binaries - also, no release notes for the AMD64
>
>
> So you can't use the binaries that MySQL provides and therefore you
> didn't test them? Or did you?
> Why is this -fPIC option important?
> I'm curious because we have a dual-Opteron system too and I wanted to
> install the 64-bit binary (4.0.20-standard) from the MySQL web site.
>
> Regards,
> Frank.
>
>
>> platform at http://dev.mysql.com/doc/mysql/en/Linux.html).
>>
>> I could consistently crash the database by creating an index on a
>> column (a varchar(50)). I could also crash it doing a "SELECT
>> COUNT(*)..." from a table with 3 million rows. Unfortunately, I did
>> not save the crash log.
>>
>> We rolled back to 4.0.18, also 64-bit. Exactly the same issue. Here's
>> the output.
>>
>>
> -------------------------------------------------------------------------------------------------------------
>
>>
>> InnoDB: Database page corruption on disk or a failed
>> InnoDB: file read of page 12244.
>> InnoDB: You may have to recover from a backup.
>> 040624 17:21:59 InnoDB: Page dump in ascii and hex (16384 bytes):
>> ...
>> 040624 17:21:59 InnoDB: Page checksum 1484130208,
>> prior-to-4.0.14-form checksum 1108511089
>> InnoDB: stored checksum 2958040096, prior-to-4.0.14-form stored
>> checksum 1108511089
>> InnoDB: Page lsn 0 204702464, low 4 bytes of lsn at page end 204702464
>> InnoDB: Page may be an index page where index id is 0 24
>> InnoDB: and table yw/boats2 index PRIMARY
>> InnoDB: Database page corruption on disk or a failed
>> InnoDB: file read of page 12244.
>> InnoDB: You may have to recover from a backup.
>> InnoDB: It is also possible that your operating
>> InnoDB: system has corrupted its own file cache
>> InnoDB: and rebooting your computer removes the
>> InnoDB: error.
>> InnoDB: If the corrupt page is an index page
>> InnoDB: you can also try to fix the corruption
>> InnoDB: by dumping, dropping, and reimporting
>> InnoDB: the corrupt table. You can use CHECK
>> InnoDB: TABLE to scan your table for corruption.
>> InnoDB: Look also at section 6.1 of
>> InnoDB: http://www.innodb.com/ibman.html about
>> InnoDB: forcing recovery.
>> InnoDB: Ending processing because of a corrupt database page.
>>
>>
> -------------------------------------------------------------------------------------------------------------
>
>>
>>
>> InnoDB is robust enough to recover, fortunately.
>>
>> Then we thought it might be an issue with the 64-bit version, so we
>> installed the 32-bit binary version (we didn't compile it) of 4.0.20.
>>
>> I managed to make it crash in exactly the same way - adding an index
>> to a table, dropping an index, or selecting a count from the same
>> large table.
>>
> -------------------------------------------------------------------------------------------------------------
>
>>
>> 040624 20:29:07 mysqld restarted
>> 040624 20:29:08 InnoDB: Database was not shut down normally.
>> InnoDB: Starting recovery from log files...
>> InnoDB: Starting log scan based on checkpoint at
>> InnoDB: log sequence number 0 3576655719
>> InnoDB: Doing recovery: scanned up to log sequence number 0 3576655719
>> 040624 20:29:08 InnoDB: Flushing modified pages from the buffer pool...
>> 040624 20:29:09 InnoDB: Started
>> /usr/local/mysql/bin/mysqld: ready for connections.
>> Version: '4.0.18-standard-log' socket: '/tmp/mysql.sock' port: 3306
>> InnoDB: Database page corruption on disk or a failed
>> InnoDB: file read of page 23235.
>> InnoDB: You may have to recover from a backup.
>> 040624 20:29:38 InnoDB: Page dump in ascii and hex (16384 bytes):
>>
>> 040624 20:29:38 InnoDB: Page checksum 1229875638,
>> prior-to-4.0.14-form checksum 4263044155
>> InnoDB: stored checksum 2727822450, prior-to-4.0.14-form stored
>> checksum 4263044155
>> InnoDB: Page lsn 0 748566710, low 4 bytes of lsn at page end 748566710
>> InnoDB: Page may be an index page where index id is 0 15
>> InnoDB: and table yw/boats_clobs2 index PRIMARY
>> InnoDB: Database page corruption on disk or a failed
>> InnoDB: file read of page 23235.
>> InnoDB: You may have to recover from a backup.
>> InnoDB: It is also possible that your operating
>> InnoDB: system has corrupted its own file cache
>> InnoDB: and rebooting your computer removes the
>> InnoDB: error.
>> InnoDB: If the corrupt page is an index page
>> InnoDB: you can also try to fix the corruption
>> InnoDB: by dumping, dropping, and reimporting
>> InnoDB: the corrupt table. You can use CHECK
>> InnoDB: TABLE to scan your table for corruption.
>> InnoDB: Look also at section 6.1 of
>> InnoDB: http://www.innodb.com/ibman.html about
>> InnoDB: forcing recovery.
>> InnoDB: Ending processing because of a corrupt database page.
>>
> -------------------------------------------------------------------------------------------------------------
>
>>
>>
>> I am pretty confident that it is not MySQL or InnoDB. But I am at a
>> loss.
>>
>> We ran memtest86 on the machine to see if it was memory, and played
>> musical chairs with the memory to see if we had a bad DIMM. We tried
>> disabling the cache on the SCSI card in case it had bad RAM.
>>
>> Does anyone have any suggestions?
>>
>> David
>>
>>
>>
>>
>
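
For anyone chasing similar errors: a rough Python sketch for pulling the
corrupt page numbers out of an InnoDB error log like the ones quoted above,
to see whether the same page keeps failing or the failures move around
(moving failures would point at hardware or drivers rather than one bad
page). The function names are made up for illustration, and the byte offset
assumes the tablespace is a single contiguous data file, which may not match
a given setup. The 16384-byte page size is taken from the "Page dump" line
in the log.

```python
import re
from collections import Counter

# Page size as reported by "Page dump in ascii and hex (16384 bytes)".
PAGE_SIZE = 16384

# Matches the second line of the two-line corruption message.
PAGE_RE = re.compile(r"InnoDB: file read of page (\d+)\.")


def corrupt_pages(log_text):
    """Count how many times each page number is reported as corrupt."""
    return Counter(int(m.group(1)) for m in PAGE_RE.finditer(log_text))


def byte_offset(page_no, page_size=PAGE_SIZE):
    """Offset of a page, ASSUMING one contiguous tablespace file."""
    return page_no * page_size


# A few lines copied from the log output quoted in this thread.
sample = """\
InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 12244.
InnoDB: You may have to recover from a backup.
InnoDB: Database page corruption on disk or a failed
InnoDB: file read of page 12244.
"""

if __name__ == "__main__":
    for page, hits in sorted(corrupt_pages(sample).items()):
        print(f"page {page}: reported {hits}x, offset {byte_offset(page)}")
```

Running it over the whole error log (rather than the short sample) gives a
quick list of every page InnoDB complained about.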