Optimizing memory allocation for custom allocators and non-C code

I have had this on my TODO list for a while, but the recent introduction of inaccessiblememonly makes it suddenly more urgent, as there is a risk of wasting effort on duplicated work and/or ending up with suboptimal solutions. I have collected two use cases for inaccessiblememonly:

  • Allocation-like functions.
  • Runtime functions for managed languages that touch state the program itself can never touch directly.

My initial reflection was that MemoryBuiltins uses a set of hardcoded functions to do its magic. In doing so, it supports the C API fairly well, plus some variations of operator new, and that’s it. It seems unlikely, and counterproductive, that every language would dump its runtime in there, and the approach won’t work once features like C++ templates come in.

It seems to me that adding an attribute for allocation-like functions would be useful. I think this road is superior to inaccessiblememonly when it comes to memory allocation, for the following reasons:

  • If the allocator can be exposed (the custom allocator use case), inaccessiblememonly is not usable, while this attribute is.

  • Other allocation-related optimizations can kick in (for instance InstCombiner::visitAllocSite).
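
To make the idea of an attribute for allocation-like functions concrete, here is a minimal sketch using source-level attributes that GCC and Clang already accept, `malloc` and `alloc_size`. These are not the IR attribute being discussed in this thread, only the closest existing analogue. `Arena`, `arena_alloc`, and the `ALLOC_LIKE` macro are hypothetical names made up for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bump allocator; all names here are illustrative only.
struct Arena {
    alignas(std::max_align_t) unsigned char storage[4096];
    std::size_t used = 0;
};

// GCC/Clang source-level attributes: `malloc` says the returned pointer does
// not alias any other live pointer, `alloc_size(N)` says parameter N (1-based)
// holds the size of the returned block.
#if defined(__GNUC__) || defined(__clang__)
#define ALLOC_LIKE(size_idx) __attribute__((malloc, alloc_size(size_idx)))
#else
#define ALLOC_LIKE(size_idx)
#endif

ALLOC_LIKE(2) void *arena_alloc(Arena *arena, std::size_t size);

// The body is visible to the optimizer: this is the "exposed allocator" case.
void *arena_alloc(Arena *arena, std::size_t size) {
    constexpr std::size_t align = alignof(std::max_align_t);
    // Round the request up to the alignment; a wrapped result is rejected below.
    std::size_t rounded = (size + align - 1) / align * align;
    if (rounded < size || rounded > sizeof(arena->storage) - arena->used)
        return nullptr;
    void *p = arena->storage + arena->used;
    arena->used += rounded;
    return p;
}
```

Because the allocator body is visible and updates `arena->used`, this is the kind of exposed allocator for which inaccessiblememonly would not apply, while an annotation on the returned pointer still can.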

I think it is fair to keep inaccessiblememonly for the managed language use case, for instance to improve GlobalsAA, but we should definitely restrict it to functions that are declared but NOT defined. When merging modules, if a function with the attribute becomes defined, then the attribute needs to be thrown out. I don’t think it would be that hard to do in practice, and it would greatly improve the usability of inaccessiblememonly by making it safe to merge modules.
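
The merge rule proposed above can be sketched with a toy model. This does not use the LLVM API: `ToyFunction` and `merge_functions` are hypothetical, and requiring both sides to carry the attribute before keeping it is an extra assumption of this sketch, not something stated in the proposal.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a function as seen by one module; not the LLVM API.
struct ToyFunction {
    std::string name;
    bool is_defined;            // true if this module provides a body
    bool inaccessible_mem_only; // true if the attribute is present
};

// Merge the two views of the same function when linking two modules.
ToyFunction merge_functions(const ToyFunction &a, const ToyFunction &b) {
    ToyFunction out;
    out.name = a.name;
    out.is_defined = a.is_defined || b.is_defined;
    // Keep the attribute only while the merged function is still a bare
    // declaration. Assumption of this sketch: both sides must agree on it.
    out.inaccessible_mem_only =
        !out.is_defined && a.inaccessible_mem_only && b.inaccessible_mem_only;
    return out;
}
```

Merging an annotated declaration with a module that defines the same function yields a defined function without the attribute, which is the "thrown out" behavior described above.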

Thoughts?

I've been looking at this some over the last week and have been trying to separate the component properties MemoryBuiltins provides. So far, I have the following distinct properties:
1) Aliasing - malloc and calloc are assumed not to modify any module visible value in MemoryDependenceAnalysis. I believe this should be replaceable with inaccessiblememonly.
2) Observability - We assume in InstCombine that allocation itself is not directly observable. That is, we will remove an allocation site which is not otherwise used. We assume the same for free.
3) Infinite abstract heap - We assume that malloc only fails if out of memory and that out of memory is not an observable condition. In particular, we will fold null checks of an unused malloc to false, meaning that OOM may fail to be observed.
4) Nullability - Operator new was hardcoded to return non-null. This is split out in http://reviews.llvm.org/D15820 which I'll be submitting shortly.
5) noalias - We assume that allocation returns a noalias pointer and that stores to that location are not observable unless the pointer is captured.
6) Malloc/Free pairing - We assume that using the incorrect form of free is UB and thus we can pretend the proper form was used for all of the preceding.
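
As a rough illustration of properties 2, 3, and 5, here is a minimal sketch of source code where those assumptions matter. The function names are hypothetical, and whether a particular compiler version actually performs these folds depends on the version and flags used.

```cpp
#include <cassert>
#include <cstdlib>

// The block is never read, written, or captured. Under properties 2 and 3 an
// optimizer may drop the malloc/free pair and fold the null check, so the
// function can reduce to `return true` and OOM is never observed.
bool probe_allocation() {
    void *p = std::malloc(16);
    if (p == nullptr)
        return false;
    std::free(p);
    return true;
}

// The pointer is never captured. Under property 5 the stores are not
// observable, so the temporary buffer may be eliminated entirely.
int scratch_sum(int a, int b) {
    int *tmp = static_cast<int *>(std::malloc(2 * sizeof(int)));
    if (tmp == nullptr)
        return a + b; // fall back without the scratch buffer
    tmp[0] = a;
    tmp[1] = b;
    int result = tmp[0] + tmp[1];
    std::free(tmp);
    return result;
}
```

Either way the visible results are the same; the properties only describe what the optimizer is allowed to treat as unobservable.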

I think it's a bit too early to settle on a new attribute at this time. I want to make each of the properties above explicit in the code and once that's done, we can see which of them are worth promoting to full blown attributes.

This seems like a fairly complete list of the properties that are currently used by the optimizers.
I can think of two more:
- Allocated size: MemoryBuiltins knows how to compute the size of an object allocated by malloc/new/strdup/etc. This information is used, e.g., for aliasing, lowering __builtin_object_size, bounds-check instrumentation, etc. It would be interesting to generalize this support for custom allocators.
- Initialization: LLVM knows that calloc/strdup/etc. return initialized memory, while malloc doesn't. A load from a malloc'ed pointer is folded to undef.
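
A minimal sketch of how these two properties surface in source code, assuming a GCC- or Clang-compatible compiler for `__builtin_object_size`. The function names are hypothetical. The size query returns either the known size or `(size_t)-1` when the compiler cannot determine it, which typically depends on the optimization level.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Allocated size: ask the compiler what it knows about a fresh malloc block.
// In mode 0 the builtin yields (size_t)-1 when the size is unknown.
std::size_t known_size_of_fresh_block() {
    void *p = std::malloc(32);
#if defined(__GNUC__) || defined(__clang__)
    std::size_t n = __builtin_object_size(p, 0);
#else
    std::size_t n = static_cast<std::size_t>(-1); // builtin not available
#endif
    std::free(p);
    return n;
}

// Initialization: calloc returns zeroed memory, so these loads are well
// defined and an optimizer that knows this may fold them to zero.
bool calloc_reads_zero() {
    unsigned char *p = static_cast<unsigned char *>(std::calloc(8, 1));
    if (p == nullptr)
        return true; // nothing to check if the allocation failed
    bool all_zero = true;
    for (int i = 0; i < 8; ++i)
        all_zero = all_zero && (p[i] == 0);
    std::free(p);
    return all_zero;
}
```

The same loop over a malloc'ed buffer would read uninitialized memory, which is why the two allocators have to be distinguished.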

Nuno

Do you think all of them deserve their own check? Some of them already have a way to be expressed in the IR (noalias return, for instance), but others seem like a pack of behaviors that go together for the most part.

Honestly, I’m not sure yet. Some of them might make sense to promote to attributes, since that can enable some interesting inter-procedural analysis, but others may not. I do want to make sure we update the documentation and clarify what the existing predicates mean as a minimum, though. My current feeling is that (1, 4, 5, and possibly Nuno’s initialized memory) should be more generic, with the rest bundled under better-described predicates. However, that’s evolving as I go. :) I started on this because I discovered that we’d accidentally duplicated a fair amount of logic for our custom allocation functions, not knowing that some of the existing hooks existed. When I went looking to see if we could share code, I was left with great uncertainty about what the existing code actually did. :)