List: General Discussion
From: Antoine
Date: November 26 2002 4:27pm
Subject: possible problems with FLUSH TABLES WITH READ LOCK

I am using FLUSH TABLES WITH READ LOCK to get consistent
snapshots of my database without shutting it down.

The setup is :

- dual P4 Xeon with Redhat 7.3
- 2.4.19 kernel with properly patched LVM (compiled from source)
- MySQL server 4.0.4 (compiled from source)
- ext3 filesystem on a 44 GB LVM logical volume named /dev/vgdata/data
- the whole database is 20 GB in size
- all tables are MyISAM ; some with dynamic records, some fixed,
some compressed
- some tables - not all - are created with DELAY_KEY_WRITE=1 to get
more speed (30% faster thanks to this)

The backup sequence is :

- lvcreate -L 5G -c 256k -s -n backup /dev/vgdata/data
  (this creates a 5GB snapshot volume named "backup" from the logical volume
  containing the database)
- mount -o ro,noatime /dev/vgdata/backup /backup
- cd /backup/ ; tar cvf /dev/st0 *
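The sequence above does not show where the read lock is taken. A global
read lock is released as soon as the connection that took it closes, so
the snapshot has to be created while that connection is still open. Below
is a minimal sketch of one way to arrange this, assuming the mysql client
build supports the "\!" (system) command; the function name snapshot_sql
and the credentials in the usage lines are placeholders, not part of the
original procedure.

```shell
#!/bin/sh
# Sketch only: take the lock, create the snapshot, and release the lock
# inside ONE mysql client session.
# Assumption: the mysql client supports "\!" to run a shell command.

snapshot_sql() {
    # Emit the statements for a single client session. lvcreate runs
    # between FLUSH and UNLOCK, i.e. while the read lock is still held.
    cat <<'EOF'
FLUSH TABLES WITH READ LOCK;
\! lvcreate -L 5G -c 256k -s -n backup /dev/vgdata/data
UNLOCK TABLES;
EOF
}

# Usage (as root; user and password options are placeholders):
#   snapshot_sql | mysql -u root -p
#   mount -o ro,noatime /dev/vgdata/backup /backup
#   cd /backup/ && tar cvf /dev/st0 *
```

Only the lvcreate step needs the lock; mounting the snapshot and writing
the tape can happen after UNLOCK TABLES.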

Today I tried restoring a backup to a test partition, just to be sure
(you can never be too careful). The restore itself went fine (of course),
but when I ran "myisamchk -c *.MYI" I got various kinds of errors, on
some tables but not all. Common errors include :

"1 clients is using or hasn't closed the table properly"
"error: Size of indexfile is: 17404928        Should be: 17507328"
"warning: Size of datafile is: 24922872        Should be: 24896378"
"error: Found 185731 keys of 186676"
"error: Found key at page 1024 that points to record outside datafile"

In fact, all the kinds of errors that you'd expect to find if you copied
your files without doing a "FLUSH TABLES WITH READ LOCK" first. So I
am wondering whether that command works properly. Is it likely
to be due to :

- SMP problems ? (it has hyperthreading enabled, BTW, but this shouldn't
make any further difference : it just sees 4 logical CPUs instead of 2)
- DELAY_KEY_WRITE ? (but some tables that aren't created as such have
problems too, so this shouldn't be the _only_ problem)
- specific Linux locking behaviour wrt flushing & locking tables ?
- Linux LVM bug ? (unlikely in my opinion, it seems heavily used)
- other... ?

Please note : tables are written to continuously, so it's no
surprise that many tables get corrupted if the lock is not absolutely
consistent and fail-proof ;))

Also, I know the backup volume is large enough (I print the occupied
size at the end of the backup procedure).

Well, of course, it may just be the backup tape itself that was screwed
up, but it doesn't seem very likely, at least in my opinion, otherwise
un-tar-ing it should have failed somewhere.
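One way to rule the tape in or out would be to compare checksums of the
files on the snapshot with checksums of the restored files. This is only
a sketch: the helper names checksum_tree and same_tree and the /restore
path are hypothetical, and it assumes the snapshot is still mounted (or
that its listing was saved to a file before the snapshot was removed).

```shell
#!/bin/sh
# Sketch: compare the snapshot contents with the restored copy.
# All paths in the usage lines are placeholders.

checksum_tree() {
    # Print "checksum size ./relative/path" for every regular file
    # under $1, sorted by path so two listings can be compared.
    (cd "$1" && find . -type f -exec cksum {} \; | sort -k 3)
}

same_tree() {
    # Succeeds only if both directories hold the same files with the
    # same contents; returns 2 if either directory cannot be read.
    a=$(checksum_tree "$1") || return 2
    b=$(checksum_tree "$2") || return 2
    [ "$a" = "$b" ]
}

# Usage, with the snapshot on /backup and the restore in /restore:
#   same_tree /backup /restore && echo "restored files match the snapshot"
```

If the two trees match, the tape is not the culprit and the files were
already inconsistent when the snapshot was taken.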

What do you think ?

Thank you


