On Mon, 1 Aug 2005, Joerg Bruehe wrote:
> As a result, the allocation succeeds, but some process gets killed when
> the paging space cannot take such an additional page. To the affected
> process, this looks like a crash.
Linux 2.4 and 2.6 kernels have a setting for their overcommitment
behaviour under /proc/sys/vm/overcommit_memory. The different settings are:
0 - Heuristic overcommit handling. Obvious overcommits of
    address space are refused. Used for a typical system. It
    ensures a seriously wild allocation fails while allowing
    overcommit to reduce swap usage. root is allowed to
    allocate slightly more memory in this mode. This is the
    default.

1 - Always overcommit. Appropriate for some scientific
    applications.

2 - Don't overcommit. The total address space commit
    for the system is not permitted to exceed swap + a
    configurable percentage (default is 50) of physical RAM.
    Depending on the percentage you use, in most situations
    this means a process will not be killed while accessing
    pages but will receive errors on memory allocation as
    appropriate.
Heuristic overcommit handling seems to be the default, and my problem lies
in the 'Obvious overcommits of address space are refused' part. For some
(to me unknown) reason the kernel treats a single 7GB malloc as an 'obvious
overcommit', while 100 mallocs of 2GB each (200GB total) are no problem. :P
For now I've set this to '2', which means the kernel won't overcommit
anymore, just like any other proper OS... ;-) This makes things much
simpler, as I can now only allocate as much memory as is physically
available. However, it does force me to be a bit more conservative. I have
configured InnoDB with a 4GB buffer pool now, which leaves about 3GB for
connections (about 300 with my current MySQL settings). Now this seems to
work fine.
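Roughly, the settings in question look like this in my.cnf (values illustrative, matching the figures above; under mode 2 the buffer pool plus the per-connection buffers must stay within physical RAM):

```ini
# my.cnf excerpt -- illustrative values, not a tuning recommendation.
[mysqld]
innodb_buffer_pool_size = 4096M
max_connections         = 300
```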
One final question though: my experience with InnoDB is that it really,
really likes to be able to fit all of its data and keys into the buffer
pool. This would limit the maximum size of my database to roughly 4GB in
this case, correct? This is in a website hosting environment where the
database is hit with about 1000 queries/s (mixed read/write).
> I am a bit surprised that the Linux kernel management will only allocate
> memory if a single chunk of sufficient size is available. My
> understanding was that in a paging system this is not necessary.
> If this is (becoming) standard Linux policy, it might be necessary to
> demand memory piecewise. One drawback of this approach is increased
> bookkeeping, if it ever needs to be released.
> I have no idea how the developers view this issue - you might open a
> change request if you consider this Linux kernel policy definite.
> You wrote that if a mysql server start fails, you can run "fillmem", and
> after its exit the memory will be available. I am not sure whether
> Rick's explanation addresses this issue as well - it might be the
> "memory defragger" he refers to. If not, the once used chunks might
> still be considered "active".
I think it all comes down to the IMHO buggy (hey, even the manpages state
it!) VM memory allocation scheme. As stated above, I have disabled the
overcommitment behaviour for now, which seems a better fit for a dedicated
database server anyway.
Matthijs van der Klip
Spill E-Projects BV
1223 RE Hilversum