The Haiku devs raised several limitations of the original rpmalloc, and it would be interesting to fix them in the fork.
I started a discussion on the Haiku forum about c-raii:
https://discuss.haiku-os.org/t/c-raii-as-new-system-allocator-for-haiku/15459
I am creating this ticket here in c-raii, and it will also be used for the rpmalloc fork, to avoid duplicate tickets.
The limitations of the original rpmalloc noted by the Haiku devs are listed in the excerpt below, taken from a comment on a Haiku ticket:
"With rpmalloc we had found at least the following problems:
It did not support very large alignments (such as allocations aligned to B_PAGE_SIZE). The author added them on our request.
It had memory fragmentation problems. Where hoard2 creates a single large area upfront and uses it, rpmalloc starts with a smaller area and allocates more small areas when there are new allocations. This results in a fragmented memory space, and, especially on 32bit systems, after a while, your app has a lot of small areas all over its address space, and not enough contiguous memory space to fit a large mmap or create_area
rpmalloc also wasted a lot of memory, that was reserved by the apps but not actually allocated. It seems it relied on the OS not actually mapping the RAM immediately, and filling it later as needed. Which is doable, but rpmalloc did not do anything special to request that from the OS, since on Linux, that seems to be the default way to allocate memory. In our case, explicit management would be needed, as well as error handling if the OS runs out of physical memory to allocate.
Creating many areas also wasn't helping performance, since managing a lot of areas was slow (maybe that improved a bit since then). It turns out reserving a large initial memory space at the start, and mapping physical pages to it as needed, ends up being simpler. As a sidenote, having many areas for malloc for each app in listareas and other tools also was not very convenient for debugging
Maybe some of these problems are specific to rpmalloc.
It also raises some questions on underlying features: how we manage the virtual address space of applications (do we have a specific allocation strategy here? There is room for something to be done to have an efficient allocator that also does not fragment the memory space too much, for example a look at TLSF malloc would be interesting here, it is designed to handle such things on systems without an MMU where fragmentation is more critical). How does our rather aggressive ASLR come into play in fragmenting the address space, especially on 32bit systems?
In any case, an experiment like this will need a lot of testing on various configurations, to make sure it handles normal situations, low memory situations, and all special cases that can show up (apps needing a large non-malloc address space, for example). That's, of course, in addition to performance tests in terms of memory use and speed, which we need to run ourselves with our own usecases (experience shows that the benchmarks done by the allocator developers are always biased towards their own specific usecases)."
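To make the "reserve a large initial memory space at the start, and map physical pages to it as needed" idea from the quote concrete, here is a minimal sketch in C. It is not code from rpmalloc, hoard2, or Haiku: the `arena_t` type and the `arena_*` functions are hypothetical names made up for illustration, and it assumes a POSIX-like system with `mmap` and `mprotect` (a Haiku version would use the system's own area calls instead). It reserves one contiguous range with no access rights, then makes pages usable only when the bump pointer grows into them, and reports failure to the caller instead of assuming the OS will always provide memory.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical arena: one contiguous reservation, committed on demand. */
typedef struct {
    unsigned char *base; /* start of the reserved address range */
    size_t reserved;     /* bytes of address space reserved */
    size_t committed;    /* bytes made readable/writable so far */
    size_t used;         /* bytes handed out (bump pointer) */
} arena_t;

static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Reserve address space only; nothing is accessible yet. Returns 0 on success. */
int arena_init(arena_t *a, size_t reserve_bytes)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = round_up(reserve_bytes, page);
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
#ifdef MAP_NORESERVE
    flags |= MAP_NORESERVE; /* not in POSIX; used only where available */
#endif
    void *p = mmap(NULL, len, PROT_NONE, flags, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    a->base = p;
    a->reserved = len;
    a->committed = 0;
    a->used = 0;
    return 0;
}

/* Bump-allocate size bytes (16-byte granularity), committing pages as needed. */
void *arena_alloc(arena_t *a, size_t size)
{
    assert(a->used <= a->committed && a->committed <= a->reserved);
    size_t need = round_up(size, 16);
    if (need < size || need > a->reserved - a->used)
        return NULL; /* size overflow or reservation exhausted */

    size_t new_used = a->used + need;
    if (new_used > a->committed) {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        size_t new_committed = round_up(new_used, page);
        /* Explicit commit step: failure is reported to the caller. */
        if (mprotect(a->base + a->committed, new_committed - a->committed,
                     PROT_READ | PROT_WRITE) != 0)
            return NULL;
        a->committed = new_committed;
    }
    void *p = a->base + a->used;
    a->used = new_used;
    return p;
}

/* Release the whole reservation in one call. */
void arena_destroy(arena_t *a)
{
    munmap(a->base, a->reserved);
    a->base = NULL;
    a->reserved = a->committed = a->used = 0;
}
```

A caller would run `arena_init(&a, 1 << 20)` once, then call `arena_alloc(&a, n)` repeatedly and check for `NULL`. Because everything lives in one range, the process ends up with a single mapping rather than many small ones scattered over the address space. Note that on systems that overcommit, a successful `mprotect` still does not guarantee physical memory, so this only sketches the address-space side of the problem.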
The same ticket has further comments about choosing the system allocator for Haiku.
Here are links to the original rpmalloc discovery and implementation in Haiku:
https://git.haiku-os.org/haiku/commit/?h=hrev53136&id=c8836afc0a2bebbfb34b7d71448cc372bdeea972
https://git.haiku-os.org/haiku/log/?qt=range&q=hrev53135..hrev53136