Actually no, malloc doesn't allocate any memory; it just updates the process's VMAs to say that the allocated virtual range is valid. The pages are then faulted in on first write. This is where things like the OOM killer become very confusing for people.
In Linux (in sane configurations), allocations are just preorders.
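You can watch this happen yourself with something like the sketch below (rough and not battle-tested; the exact /proc fields and large-allocation behavior depend on your libc and kernel):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Print VmSize (virtual) and VmRSS (resident) from /proc/self/status. */
    static void show(const char *label)
    {
        char line[256];
        FILE *f = fopen("/proc/self/status", "r");
        if (!f) return;
        printf("%s\n", label);
        while (fgets(line, sizeof line, f))
            if (!strncmp(line, "VmSize", 6) || !strncmp(line, "VmRSS", 5))
                fputs(line, stdout);
        fclose(f);
    }

    int main(void)
    {
        size_t n = 1UL << 30;               /* "allocate" 1 GiB */
        char *p = malloc(n);
        if (!p) return 1;

        show("after malloc:");              /* VmSize jumps, VmRSS barely moves */
        memset(p, 1, n);                    /* touching the pages faults them in */
        show("after touching the pages:");  /* now VmRSS catches up */

        free(p);
        return 0;
    }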
EDIT: I can't reply below due to rate limiting:
I'd argue that overcommit just makes the difference between allocation and backing very stark.
Your memory IS in fact allocated in the process's VMAs; it's just that the anonymous pages cannot necessarily be backed.
This differs, obviously, in other OSes, as pointed out. It also differs if you turn overcommit off, but since so much in Linux assumes it, your system will soon break if you try.
This depends on the OS. Solaris and Windows both do strict accounting by default, and overcommit is opt-in at a fine-grained API level. Linux is relatively extreme in its embrace of overcommit. So extreme that strict accounting isn't even really possible: even if you disable overcommit in Linux, there are too many corner cases in the kernel where a process (including innocent processes) will be shot down under memory pressure. Too many Linux kernel programmers designed their subsystems with the overcommit mentality. That said, I still always disable overcommit, as it makes it less likely for innocent processes to be killed under heavy load.
An example of a split-the-difference approach is macOS, which AFAIU implements overcommit but also dynamically instantiates swap so that overcommit-induced OOM killing won't occur until your disk is full.
Also, it's worth mentioning that on all these systems process limits (see, e.g., setrlimit(2)) can still result in malloc returning NULL.
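For example, something like this should make malloc itself return NULL regardless of the overcommit setting (a quick sketch; RLIMIT_AS caps the whole address space, so the exact numbers depend on what the process already has mapped):

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/resource.h>

    int main(void)
    {
        /* Cap this process's address space at 64 MiB. */
        struct rlimit rl = { 64UL << 20, 64UL << 20 };
        if (setrlimit(RLIMIT_AS, &rl) != 0)
            perror("setrlimit");

        /* Even with overcommit on, this exceeds the rlimit, so the
           underlying mmap/brk fails and malloc returns NULL. */
        void *p = malloc(256UL << 20);
        printf("malloc returned %p\n", p);
        return 0;
    }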
> Solaris and Windows both do strict accounting by default, and overcommit is opt-in at a fine-grain API level.
Not sure what you mean by this - I don't think Windows has overcommit in any form, whether opt-in or opt-out. What it does have is virtual address space reservation, but that's separate from commitment; reserved virtual memory is not backed by any page, no matter how much free RAM you have, until you explicitly tell the system to commit physical memory to it.
In fact I'm not even sure 'opt-in' overcommit is possible in principle, because if you opt in to overcommit, you jeopardize the integrity of other applications, which likely did not opt in.
I thought there was a flag or commonly used library function that would do VirtualAlloc(MEM_RESERVE) and then, from an in-process page fault handler, attempt VirtualAlloc(MEM_COMMIT). But I guess I was wrong? I assume it's possible, just not as common as I thought.
I don't know of a common (or uncommon) function like this, though I think you could indeed implement it if you really wanted to (likely via AddVectoredExceptionHandler). It still requires explicitly telling the OS to commit just-in-time, so it's not "overcommitting". The closest built-in thing to this that I know of is PAGE_GUARD, which is internally used for stack extension, but that's all I can think of. The use cases for such a thing would be incredibly niche, though: some kind of high-performance sparse page management where every single memory access instruction counts. Like maybe if you're writing a VM or emulator or something. Something that's only appropriate for << 1% of programs.
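For the curious, a commit-on-fault scheme could look roughly like this (an untested sketch with error handling, thread safety and cleanup omitted; these are real Win32 calls, but don't treat it as production code):

    #include <windows.h>
    #include <stdio.h>

    static char *g_base;                     /* start of the reserved region */
    static const SIZE_T g_size = 1 << 30;    /* reserve 1 GiB of address space */

    /* Vectored handler: on an access violation inside our region,
       commit the faulting page and retry the instruction. */
    static LONG CALLBACK CommitOnFault(PEXCEPTION_POINTERS info)
    {
        if (info->ExceptionRecord->ExceptionCode != EXCEPTION_ACCESS_VIOLATION)
            return EXCEPTION_CONTINUE_SEARCH;

        char *addr = (char *)info->ExceptionRecord->ExceptionInformation[1];
        if (addr < g_base || addr >= g_base + g_size)
            return EXCEPTION_CONTINUE_SEARCH;

        /* Commit just the page containing the faulting address. */
        SYSTEM_INFO si;
        GetSystemInfo(&si);
        char *page = g_base + ((addr - g_base) / si.dwPageSize) * si.dwPageSize;
        if (!VirtualAlloc(page, si.dwPageSize, MEM_COMMIT, PAGE_READWRITE))
            return EXCEPTION_CONTINUE_SEARCH;   /* commit failed: let it crash */

        return EXCEPTION_CONTINUE_EXECUTION;    /* re-run the faulting access */
    }

    int main(void)
    {
        AddVectoredExceptionHandler(1, CommitOnFault);
        g_base = VirtualAlloc(NULL, g_size, MEM_RESERVE, PAGE_NOACCESS);

        g_base[12345] = 42;   /* faults; handler commits the page; write succeeds */
        printf("%d\n", g_base[12345]);
        return 0;
    }

Note the key difference from Linux-style overcommit: the commitment is still explicit (the handler asks for it), so the OS never promises memory it hasn't accounted for.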
I don't think the C standard specifies this behavior. malloc must return either a pointer where you can store an object, or NULL. I think platform details about when accesses through that pointer might fail are outside the scope of the language / stdlib standard.
Are failures when accessing the allocated pointer due to overcommit substantially different from failures due to ECC errors or other hardware faults, with regard to what the C standard specifies?
(FWIW I don't particularly like overcommit-by-default either)
So if I malloc 2 MB or 2 GB or whatever in a C program running on Linux, but I have not yet either read from or written to that memory, then what's the state? Has the C library forced Linux to actually allocate it, or has it not? Or does it depend, and if so, on what?
It depends on the overcommit setting. By default it's on, which means Linux doesn't promise to back the allocation with physical pages. Only the virtual address range is allocated (i.e. the only guarantee is that future allocations within your process won't return addresses from that range). This implies that if you try to write to it, the write can fail at fault time when the system is out of memory (in practice the OOM killer steps in). If overcommit is turned off, Linux promises the range will be backed with physical pages when you write to it, meaning your write won't fail due to OOM. Aside from that, I think everything else is an implementation detail, but generally OSes map unwritten pages to the same zero page as an optimization, and only back them with a real physical page when a write occurs.
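If you want to check which mode a box is in, the knob lives in /proc/sys/vm/overcommit_memory (0 = heuristic overcommit, the default; 1 = always overcommit; 2 = strict accounting). A trivial way to read it from C, just as a sketch:

    #include <stdio.h>

    int main(void)
    {
        /* 0 = heuristic overcommit (default), 1 = always overcommit,
           2 = strict accounting against CommitLimit */
        FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
        int mode = -1;
        if (f) { fscanf(f, "%d", &mode); fclose(f); }
        printf("vm.overcommit_memory = %d\n", mode);
        return 0;
    }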
> Also differs if you turn overcommit off but since so much in linux assumes it your system will soon break if you try it.
I agree, reliance on overcommit has resulted in stability problems in Linux. But IME stability problems aren't induced by disabling overcommit; they're induced by disabling swap. The stability problems occur precisely because, by relying on magical heuristics to save the day, we end up with an overall MM architecture that reacts extremely poorly under memory pressure. Whether or not overcommit is enabled, physical memory is a limited resource, and when Linux can't relieve physical memory pressure by dumping pages to disk, bad things happen, especially under heavy I/O load (e.g. the buffer cache can grab pages faster than the OOM killer can free them).
And that's why strict allocation tracking (no overcommit) should be the default. But those of us in favor of guaranteed forward progress and sensible resource accounting lost this fight a long time ago.
In the C standard, malloc should return NULL if it can't fulfill the request. Linux violates this, but it usually works out in the end since virtual memory makes true OOM very rare.
I don’t know what true OOM means, but my desktop has crashed at least three times in the last four months with “OOM killer” on the console. About 15 GB of usable RAM, 2 GB of swap. All it takes is having the usual applications open plus a second browser (Chrome) alongside Firefox. (Naturally I don’t try to actively reproduce it, since I usually have better things to do than wait 10 minutes, from everything becoming unresponsive -- even switching from the graphical session to a console -- to the OOM killer finally deciding to kill Chrome.) And I don’t run any virtual machines, just a big, fat IDE and stuff like that.
Your problem is the 2 GB of swap. Get rid of it and it will just crash without the 10 minutes of slowdown while the swap disk gets thrashed. </sarcasm>
Linux overcommitting memory, and especially Chrome/Firefox being big fat memory hogs, are the problem. In fact, every application which doesn't cope with malloc running out of memory, or which assumes everybody has multiple gigs of memory to spare, should "reevaluate".