
> I'm happy to talk about this all day!

Oh really :)

I'd like to ask how applications should change their memory allocation or usage patterns to maximise the benefit of THP. Do memory allocators (glibc mainly) need config tweaking to coalesce tiny mallocs into 2MB+ mmaps, or will they do that automatically? Do you need a custom pool allocator so you're doing large allocations, or are you never going to get the full benefit of huge pages without madvise/libhugetlbfs? And does this apply to Mac/Windows/*BSD at all?

[Edit: ouch, I see /sys/kernel/mm/transparent_hugepage/enabled is default set to 'madvise' on my system (Slackware) and as a result doing nearly nothing. But I saw it enabled in the past. Well that answers a lot of my questions: got to use madvise/libhugetlbfs.]
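
If I understand the docs right, the opt-in looks roughly like this (a sketch, assuming the usual 2 MiB huge page size on x86-64; check /sys/kernel/mm/transparent_hugepage/hpage_pmd_size for the real value on a given system):

    #define _GNU_SOURCE
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>

    #define HPAGE_SIZE (2UL * 1024 * 1024)   /* assumed PMD huge page size */

    int main(void) {
        void *buf = NULL;
        /* 2 MiB-aligned, 2 MiB-sized region so the kernel can back it with one huge page */
        if (posix_memalign(&buf, HPAGE_SIZE, HPAGE_SIZE) != 0)
            return 1;
        /* opt this range into THP; needed when the sysfs knob is set to "madvise" */
        madvise(buf, HPAGE_SIZE, MADV_HUGEPAGE);
        /* the huge page is only faulted in on first touch */
        memset(buf, 0, HPAGE_SIZE);
        free(buf);
        return 0;
    }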

I read you also need to ensure ELF segments are properly aligned to get transparent huge pages for code/data.

Another question. From your link [2]:

> An application may mmap a large region but only touch 1 byte of it, in that case a 2M page might be allocated instead of a 4k page for no good.

Do the heuristics used by Linux THP (khugepaged) really allow completely ignoring whether pages have actually been page-faulted in or even initialised? Or is that a possibility that's unlikely to happen in practice?



In current Linux systems, there are two main ways to benefit from huge pages: 1) the explicit, user-managed approach via hugetlbfs, which is not very common, and 2) pages transparently managed by the kernel via THP (userspace is completely unaware, and any application using mmap() or malloc() can benefit from it).
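
To make (1) concrete, here is a minimal sketch of the explicit route (assumptions: 2 MiB pages, and a pool reserved beforehand via /proc/sys/vm/nr_hugepages; MAP_HUGETLB is the anonymous-mapping counterpart of mounting hugetlbfs):

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    #define HPAGE_SIZE (2UL * 1024 * 1024)

    int main(void) {
        /* explicit huge page mapping; fails if the reserved pool is empty */
        void *p = mmap(NULL, HPAGE_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap(MAP_HUGETLB)");
            return 1;
        }
        memset(p, 0, HPAGE_SIZE);    /* backed by a page from the hugetlb pool */
        munmap(p, HPAGE_SIZE);
        return 0;
    }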

As I mentioned before, most major Linux distributions ship with THP enabled by default. THP automatically allocates huge pages for mmap'd memory whenever possible (that is, when the region is 2 MiB aligned and at least 2 MiB in size). There is also a separate kernel thread, khugepaged, that opportunistically tries to coalesce/promote base 4 KiB pages into 2 MiB huge pages whenever possible.
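
As a sketch of what "2 MiB aligned and at least 2 MiB in size" means for route (2): over-allocate an anonymous mapping and use an aligned 2 MiB window inside it (the padding trick here is my own illustration, not something THP requires you to do this particular way):

    #include <stdint.h>
    #include <string.h>
    #include <sys/mman.h>

    #define HPAGE_SIZE (2UL * 1024 * 1024)

    /* Returns a pointer that is 2 MiB aligned inside a slightly larger mapping,
       so a fault in that window is eligible for a transparent huge page
       (with enabled=always; add madvise(MADV_HUGEPAGE) under the madvise mode). */
    static void *map_thp_friendly(size_t size) {
        char *raw = mmap(NULL, size + HPAGE_SIZE, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (raw == MAP_FAILED)
            return NULL;
        /* round up to the next 2 MiB boundary inside the over-sized mapping */
        return (void *)(((uintptr_t)raw + HPAGE_SIZE - 1) & ~(uintptr_t)(HPAGE_SIZE - 1));
    }

    int main(void) {
        void *p = map_thp_friendly(HPAGE_SIZE);
        if (p)
            memset(p, 0, HPAGE_SIZE);   /* first touch, eligible for a 2 MiB fault */
        return 0;
    }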

Library support is not strictly required for THP, but an unaware library can hurt huge page performance and availability in the long run. A library that is not aware of kernel huge pages may employ suboptimal memory management strategies, resulting in inefficient utilization, for example by unintentionally breaking those huge pages apart (e.g. via unaligned unmapping), or by failing to release them back to the OS as one full unit, undermining their availability over time. AFAIK, TCMalloc from Google is the only allocator with extensive huge page awareness [1].

> Do the heuristics used by Linux THP (khugepaged) really allow completely ignoring whether pages have actually been page-faulted in or even initialised? Or is that a possibility that's unlikely to happen in practice?

Linux allocates huge pages on first touch. As for khugepaged, it only coalesces the pages if all the base pages covering the 2 MiB virtual region exist in some form (not necessarily faulted in; for example, some of those base pages could be in swap space, and Linux will first fault them in and then migrate them).
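
One rough way to see the first-touch behaviour for yourself (a sketch; AnonHugePages in /proc/self/smaps_rollup is the kernel's accounting of THP-backed anonymous memory on reasonably recent kernels):

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>

    #define HPAGE_SIZE (2UL * 1024 * 1024)

    int main(void) {
        void *buf = NULL;
        if (posix_memalign(&buf, HPAGE_SIZE, HPAGE_SIZE) != 0)
            return 1;
        madvise(buf, HPAGE_SIZE, MADV_HUGEPAGE);

        memset(buf, 1, HPAGE_SIZE);   /* first touch: this is where the huge page is faulted in */

        /* print the kernel's accounting; expect AnonHugePages >= 2048 kB here */
        FILE *f = fopen("/proc/self/smaps_rollup", "r");
        char line[256];
        while (f && fgets(line, sizeof line, f))
            if (strncmp(line, "AnonHugePages:", 14) == 0)
                fputs(line, stdout);
        if (f)
            fclose(f);
        return 0;
    }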

[1] https://www.usenix.org/system/files/osdi21-hunter.pdf


Mimalloc has support for huge pages. It also has an option to reserve 1 GiB pages at program start, and I've had very good performance results using that setting and replacing Factorio's allocator, on both Windows and Linux.
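
In case it saves someone a docs dive, the knob I believe is being referred to (option and function names are as I remember them from the mimalloc docs, so double-check before relying on this):

    /* sketch; roughly equivalent to setting MIMALLOC_RESERVE_HUGE_OS_PAGES=1 in the
       environment when mimalloc is injected via LD_PRELOAD / an allocator override */
    #include <mimalloc.h>
    #include <string.h>

    int main(void) {
        /* reserve one 1 GiB huge page up front, before any allocation happens */
        mi_option_set(mi_option_reserve_huge_os_pages, 1);

        void *p = mi_malloc(64UL * 1024 * 1024);   /* served from the reserved pages */
        memset(p, 0, 64UL * 1024 * 1024);
        mi_free(p);
        return 0;
    }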



