Alternative titles: Does LLVM do optimizations that are incompatible with mmap? Can we document that particular uses of mmap are fine with LLVM?
One reasonable use of mmap is to implement something that is equivalent to an allocation that grows and shrinks – first reserve a chunk of address space (but don’t map any pages there), and then use more mmap calls to allocate and free pages inside that range as memory demand comes and goes. I am assuming here that pages only get (un)mapped at the end – i.e., there’s some contiguous set of pages starting at the base address that was returned by the initial reservation, and the only thing that changes is the end of this region of mapped pages. (It is certainly interesting to discuss generalizations of this where pages can also be unmapped at the beginning or in the middle, but let’s not get ahead of ourselves.) To my knowledge, C programmers will generally assume that this is something one can do, and we’d like to officially support it in Rust as well.
The trouble is, it is unclear whether LLVM is actually compatible with this kind of pattern. LLVM obviously doesn’t know anything about mmap, but LLVM does know about the concept of an “allocation” or “allocated object” – and if we want to freely access the currently mapped range of our growing-and-shrinking allocation, we need to be careful not to run afoul of assumptions LLVM is making here.
(Can we avoid this region being one large allocated object that changes its size? I don’t think so. Allocated objects have to be backed by memory that’s actually mapped – LLVM assumes we can load from an allocated object any time and that will never trap. We could consider each mmap that actually maps pages to create its own allocated object, but that would mean we cannot do pointer arithmetic across the boundaries of these blocks, which is not an acceptable restriction.)
I was told before by @nlopes that in fact LLVM does not support this pattern. If that is indeed the current consensus then I would ask what it would take to make LLVM support this pattern – as I said above, this is something we’d really like to support in Rust. Currently we have to tell people that they can only use mmap in very restrictive ways, which is unfortunate, but we’re apparently limited by LLVM.
There are two places where I see potential issues.
## `getelementptr inbounds`
This operation requires the resulting pointer to remain in-bounds of some allocated object. If allocated objects can change their size, does that mean the pointer has to be in-bounds of the *current* size of the object? In that case, moving a `getelementptr inbounds` down across a shrinking operation would be incorrect, as the offset may no longer be in-bounds after the shrink. Or maybe it has to be in-bounds of the largest size the object ever had? Then moving a `getelementptr inbounds` down is always possible, but moving it up across a growing operation is not.
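To make the sinking question concrete, an LLVM IR sketch (the `@shrink` callee is hypothetical and stands for whatever code unmaps pages at the end of the object; the offset assumes 4 KiB pages):

```llvm
%p = getelementptr inbounds i8, ptr %base, i64 12288  ; in-bounds of the 4-page object
call void @shrink(ptr %base)                          ; object shrinks to 2 pages
; If "inbounds" refers to the object's *current* size, then sinking the
; getelementptr below the call would make %p out of bounds -- that
; reordering would be invalid.
```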
Or maybe the correct way to view this is something entirely different, like – allocated objects correspond to reservations in the address space that may or may not actually be backed by pages right now, and the fact that an address is inbounds of a live allocated object does not imply that the address is actually mapped?
This would be elegantly solved by using a “nowrap” operation for pointer arithmetic instead of an “inbounds” operation, but that is a larger change which has so far not gotten much traction. Still, it would very easily answer almost all the subtle questions about `getelementptr inbounds`, giving it clear semantics while (if I recall correctly) requiring only minimal changes to existing optimizations.
## dereferenceability
Consider the following pseudocode:
```llvm
%val1 = load i32, ptr %ptr  ; %ptr is dereferenceable for 4 bytes here
%_ = call ...
%val2 = load i8, ptr %ptr
if (...) {
  %val3 = load i32, ptr %ptr
  ...
}
```
Can `%val3` be hoisted out of the `if`?
- If allocations can never shrink, then the answer is yes – the fact that after the call, `%ptr` remains dereferenceable for one byte implies that the allocation has not been freed, so it is still dereferenceable for 4 bytes.
- However, if the call shrinks the allocation, then maybe after the call, `%ptr` points to 1 byte of still-existing memory, followed by a page boundary and 3 bytes on a page that no longer exists, so hoisting `%val3` would introduce a page fault.
Does LLVM currently do any sort of reasoning like this, where dereferenceability is carried across function calls based on “a big access happened before the call and a small access after the call, therefore the big access must also still be legal after the call”? If yes, then that seems fundamentally incompatible with the idea of shrinking an allocation, and therefore with unmapping no-longer-needed pages at the end of such a growing-and-shrinking allocation.
If LLVM does not currently do such reasoning, can we get this documented in the LangRef so that frontends can rely on it?