malloc vs malloc

I discovered that LLVM's malloc only allows a 32-bit size argument, so you
cannot use it to allocate huge blocks on 64-bit machines. So I considered
replacing all of my uses of LLVM's malloc instruction with calls to the libc
malloc function instead. That got me wondering why LLVM even has its own
malloc intrinsic anyway...

Am I correct in assuming that LLVM's malloc intrinsic exists so that some
optimization passes can rewrite it, e.g. replacing heap allocation with stack
allocation when no part of the allocated value escapes scope? So replacing
all of my uses of LLVM's malloc with libc's malloc might hamper LLVM's
optimizations and degrade performance?

Hi Jon,

There is no good reason for malloc to be an instruction anymore. I'd be very happy if it got removed. Even if we keep it, malloc/alloca should be extended to optionally take 64-bit sizes.

-Chris
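
To make the size limit concrete, here is a minimal C++ sketch of what happens when a 64-bit byte count is pushed through a 32-bit size operand. It is plain C++, not LLVM code, and the helper names (fitsIn32BitSize, truncateTo32Bits, narrowCheckedSize) are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: true if a byte count can be carried by a 32-bit
// size operand without losing information.
bool fitsIn32BitSize(std::uint64_t bytes) {
  return bytes <= std::numeric_limits<std::uint32_t>::max();
}

// Hypothetical helper: the value a 32-bit size operand would actually hold.
// Only the low 32 bits of the request survive.
std::uint32_t truncateTo32Bits(std::uint64_t bytes) {
  return static_cast<std::uint32_t>(bytes);
}

// Hypothetical helper: narrow a size that is already known to fit.
std::uint32_t narrowCheckedSize(std::uint64_t bytes) {
  assert(fitsIn32BitSize(bytes) && "size does not fit in 32 bits");
  return static_cast<std::uint32_t>(bytes);
}
```

A 5 GiB request (5ULL << 30) does not fit, and silently truncating it would leave a 1 GiB request (1u << 30), which is why a front end targeting 64-bit machines would have to reject or reroute such allocations.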

I'm curious. Do we want to keep the free instruction?

Nick

No, there's no reason to.

-Chris

There are still reasons to have it; just grep around for FreeInst. Function
attributes are not yet sufficient to replace all of those uses.

And if the alignment attribute on MallocInst were implemented, perhaps
via posix_memalign or other target-specific mechanisms, then MallocInst
would also have a reason to be kept.

Dan
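
For reference, this is roughly what an aligned heap allocation looks like when done through posix_memalign directly. It is a minimal sketch assuming a POSIX system; allocateAligned and isAligned are hypothetical wrapper names, and nothing here reflects how MallocInst is actually lowered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign, free (POSIX)

// Hypothetical wrapper: allocate `size` bytes aligned to `alignment`.
// posix_memalign requires `alignment` to be a power of two and a multiple
// of sizeof(void *); it returns nonzero on failure, which is mapped to
// nullptr here. The result is released with free().
void *allocateAligned(std::size_t alignment, std::size_t size) {
  void *ptr = nullptr;
  if (posix_memalign(&ptr, alignment, size) != 0)
    return nullptr;
  return ptr;
}

// Hypothetical helper: check that a pointer honours the requested alignment.
bool isAligned(const void *ptr, std::size_t alignment) {
  assert(alignment != 0 && "alignment must be nonzero");
  return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}
```

Calling allocateAligned(64, 100) should return a pointer for which isAligned(p, 64) holds, and the block is freed with an ordinary free(p).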

isa<FreeInst>(X) can be replaced with:

bool isFree(Instruction *X) {
   if (CallInst *CI = dyn_cast<CallInst>(X))
     if (Function *F = CI->getCalledFunction())
       if (F->isName("free") && F->hasExternalLinkage())
         return true;
   return false;
}

There is no need to have an actual IR object for it. posix_memalign, calloc, valloc and others are all great reasons why we shouldn't have a MallocInst either.

-Chris
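
The same name-based check extends to the other allocators mentioned above. The following is a self-contained sketch that does not use LLVM headers: CalleeInfo is a toy stand-in for the two facts isFree reads off a Function, and AllocKind and classifyCallee are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for the facts the isFree() check reads off the called
// function. This is not an LLVM type.
struct CalleeInfo {
  std::string name;
  bool hasExternalLinkage;
};

enum class AllocKind { None, Allocate, Free };

// Hypothetical classifier: recognise the allocator family purely by name
// and linkage, so no dedicated IR instruction is needed for any of them.
AllocKind classifyCallee(const CalleeInfo &callee) {
  assert(!callee.name.empty() && "expected a named, direct callee");
  if (!callee.hasExternalLinkage)
    return AllocKind::None;
  if (callee.name == "free")
    return AllocKind::Free;
  static const char *const kAllocators[] = {"malloc", "calloc", "valloc",
                                            "posix_memalign"};
  for (const char *allocator : kAllocators)
    if (callee.name == allocator)
      return AllocKind::Allocate;
  return AllocKind::None;
}
```

With this, an external "calloc" classifies as AllocKind::Allocate, an external "free" as AllocKind::Free, and a function named "free" without external linkage as AllocKind::None.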

Surely you mean "llvm.free" or something, right? I don't think we want to start assigning meaning to otherwise arbitrary function names.

Nick

No, I mean "free". Dan Gohman and I discussed this today. A short version in no particular order is:

1) in practice, IMO, everyone building for user space has libc in their address space, even if they are supporting some non-C language like Haskell or OCaml or something.
2) in practice, IMO, if an app is in kernel space or embedded, if they use a symbol (e.g. printf) it almost certainly does what ANSI says it does.
3) I really think the functionality of simplifylibcalls should be moved into instcombine someday, so that it is run more than once and is iterative.
4) we want other passes to be able to optimize libcalls. For example, the stuff I added in r61918 requires GVN or memcpyopt to optimize things like strlen.
5) regardless of #1/2, we want to support things like -fno-builtin-free someday. -ffreestanding would also imply this sort of thing, as would -fno-builtin etc.
6) there are other routines that we want to make assumptions about, that are almost certainly true, but are not standard-conformant. I'm thinking things like "operator new is sane and thus the result is noalias". This specific example should be controlled by something like -foperator-new-is-sane, and probably default to on.

To me, there are two solutions. One solution, bad for various reasons, is to create an llvm.foo for every foo function we want to optimize. IMO, a better solution is to introduce *one* new function attribute "is normal" or "may not be normal" (name suggestions welcome, pick your parity) which can be set on a function. All existing libcall optimization stuff would be predicated on the function being named the right thing, being external, and being "normal". This would allow language-independent optimizations like instcombine to do this stuff.

-Chris
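
The three conditions in that proposal can be written down as a single predicate. This is a toy model only: FunctionInfo is not an LLVM type, the isNormal flag stands in for the attribute whose name is still open, and isKnownLibcall is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

// Toy model of the facts an optimization would consult. Not an LLVM type.
struct FunctionInfo {
  std::string name;
  bool hasExternalLinkage;
  // Stand-in for the proposed "is normal" attribute. The assumption here is
  // that a front end would clear it for something like -fno-builtin-free.
  bool isNormal;
};

// Hypothetical predicate: a libcall optimization may fire only if the
// function is named the right thing, is external, and is "normal".
bool isKnownLibcall(const FunctionInfo &fn, const std::string &libcName) {
  assert(!libcName.empty() && "expected a libc function name");
  return fn.name == libcName && fn.hasExternalLinkage && fn.isNormal;
}
```

Under this model, an external, normal "free" is recognised, while the same function with the flag cleared, or a non-external function that merely happens to be called "free", is left alone.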

1) in practice, IMO, everyone building for user space has libc in
their address space, even if they are supporting some non-C language
like Haskell or OCaml or something.

This is not true at least for the Free Pascal Compiler on non-Mac OS X and non-Solaris platforms. Normally, we directly use syscalls and link to nothing system-installed. The reason for this approach is that on e.g. Linux and FreeBSD, it almost guarantees spotless binary compatibility across different versions and/or distributions (while depending on any system library whatsoever almost guarantees absence of binary compatibility over time in most situations).

On the other hand, since we uppercase and mangle all of our assembler-level identifiers by default, it is unlikely that an identifier named "free" will actually exist. But it's not impossible (anyone can define aliases for functions/variables using any name).

5) regardless of #1/2, we want to support things like -fno-builtin-free someday. -ffreestanding would also imply this sort of thing, as would -fno-builtin etc.

Yes, please!

Jonas