4xQuad Core Xeon
Local Disks 6x73GB SAS
The server behaves normally, with load between 1 and 2, but it has CPU
load spikes during the day, up to 3 or 4, and if the load goes over 20
it goes into a death spiral and collapses.
I have enough memory, buffers and disk.
From my investigations, it seems related to queries that go to disk
(temporary tables); the server then turns into a PC XT, with connections
backlogging until death.
It really looks like a bug, though: I can't imagine a server collapsing like this.
The load on the server is not extremely high: an average of 1000 statements per second, 30 connections, 150k row accesses/sec.
What I saw from vmstat is that at the moment of the deadlock the number
of context switches becomes huge, and the number of processes waiting
for CPU grows from almost 0 to 100.
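
To pin down exactly when this starts in the captured vmstat output, a small script can flag the samples where the run queue or the context switch rate jumps. This is only a rough sketch: the function name and both thresholds are made up for illustration, and it assumes the log contains the usual header row with an `r` column (processes waiting for CPU) and a `cs` column (context switches).

```python
def find_spikes(vmstat_text, max_runnable=20, max_cs=50000):
    """Return (line_number, runnable, context_switches) for suspicious samples.

    Thresholds are illustrative assumptions, not tuned values.
    """
    columns = None  # positions of the 'r' and 'cs' columns, once a header is seen
    spikes = []
    for lineno, line in enumerate(vmstat_text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue
        if "r" in fields and "cs" in fields:
            # Column header row: remember where the two columns are.
            columns = (fields.index("r"), fields.index("cs"))
            continue
        if columns is None:
            continue  # nothing can be parsed before the first header row
        r_idx, cs_idx = columns
        try:
            runnable = int(fields[r_idx])
            switches = int(fields[cs_idx])
        except (IndexError, ValueError):
            continue  # group header ("procs memory ...") or malformed row
        if runnable > max_runnable or switches > max_cs:
            spikes.append((lineno, runnable, switches))
    return spikes
```

Running it over the whole log and comparing the flagged line numbers with the timestamps of the two incidents should show whether the run queue grows before or after the context switches explode.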
Also, the 'avm' value suggests that the server is actively using swap, but top reports 4 GB of swap, and it is free!
The strange thing is that the sort buffer is very big and it still goes to disk to sort.
Are the per-connection buffers too big?
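
A rough way to check this is to add up the worst case: the global buffers plus every per-connection buffer multiplied by the number of connections. The sketch below assumes each connection can allocate each of its buffers once, in full, at the same time. The function name, the buffer names and all the sizes are hypothetical placeholders, not my real settings; only the 30 connections come from the numbers above.

```python
def worst_case_memory_mb(global_buffers_mb, per_connection_buffers_mb, connections):
    """Estimate peak memory if every connection fills all of its buffers at once."""
    per_connection_total = sum(per_connection_buffers_mb.values())
    return global_buffers_mb + connections * per_connection_total


# Hypothetical sizes in MB, for illustration only.
example_buffers = {"sort": 64, "join": 16, "read": 8, "tmp_table": 128}
estimate = worst_case_memory_mb(8192, example_buffers, connections=30)
print(estimate, "MB")
```

If the estimate comes out above the physical RAM, oversized per-connection buffers could by themselves push the machine into swap during a burst.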
I attach some statistics; dinner paid for any clue!
There are IOSTATS, TOP, and VMSTATS for two deadlock incidents that happened on two consecutive nights, one at 22:06 and one at 22:24.