Why do we assume malloc() always returns a non-null pointer in instruction combining?

Hi,

While looking into the bug at https://llvm.org/bugs/show_bug.cgi?id=21421, I found that a regression test in Transforms/InstCombine/malloc-free-delete.ll stands in the way of fixing it directly. The test is:

define i1 @foo() {
; CHECK-LABEL: @foo(
; CHECK-NEXT: ret i1 false
  %m = call i8* @malloc(i32 1)
  %z = icmp eq i8* %m, null
  call void @free(i8* %m)
  ret i1 %z
}

According to http://www.cplusplus.com/reference/cstdlib/malloc/, malloc may return null if the allocation fails. So why do we assume malloc() always returns a non-null pointer here?

I think we can do this optimization for operator new, because new never returns null. But for all malloc-like allocations (malloc, calloc, and new with std::nothrow), we shouldn’t do this.

That regression test has existed for a long time, so I’m not sure whether there is a special reason for it. Does anybody know?

The optimisation here is that "nothing uses `m`, so we can assume
allocation works and remove the malloc + free pair".

What is the purpose of allocating 1 (or 100, or 1000000000) bytes,
never use it, and then free it immediately?

The test-code in the bug report does rely on the constructor being
called, and the bug here is probably [as I'm not familiar with the
workings of the compiler in enough detail] that it doesn't recognize
that the constructor has side-effects.

"I think we can do such optimization with operator new, because new never returns null."

This is incorrect in the case of `new (std::nothrow) ...` - the whole
point of `(std::nothrow)` is to tell new that it should return NULL in
case of failure, rather than throw an exception (bad_alloc).

But the point here is not the actual return value, but the fact that
the compiler misses that the constructor has side-effects.

Yes, I classified new (std::nothrow) as a malloc-like allocation. See the next sentence.

Hi Mats,

I think Kevin’s point is that malloc can return 0, so if the malloc/free pair is optimized away, the semantics of the original program change.

On the other hand, malloc/free are special functions, but programmers can still define their own versions by not linking the standard library, so we must assume malloc/free always have side effects like any other function, unless we know at link time that only the standard library will be linked.

Thanks,
-Jiangning

If programmers want to do this, they need to compile their program with
-ffreestanding.

Hi David and Mats,

Thanks for your explanation. If my understanding is correct, we don’t need to consider the side effects of malloc/free unless compiling with -ffreestanding, because without -ffreestanding a user-defined malloc/free should be compatible with the standard library. That makes sense to me.

My point is that, in the standard library, malloc is allowed to return null if the allocation fails. How can the compiler know at compile time that it must succeed? I slightly modified the regression test:

define i1 @CanWeMallocWithSize(i32 %a) {
; CHECK-LABEL: @CanWeMallocWithSize(
; CHECK-NEXT: ret i1 false
  %m = call i8* @malloc(i32 %a)
  %z = icmp eq i8* %m, null
  call void @free(i8* %m)
  ret i1 %z
}

This function might be used to detect whether the runtime environment can allocate a block of memory of size a. It could also prompt the memory allocator to request a large block from the system up front, reducing the number of system calls made by the many small mallocs that follow. In some extreme situations the check fails, and the program can then show a decent error message and stop. So the problem is that this is not simply a malloc followed directly by a free: the pointer returned by malloc is compared with null, and that comparison determines the return value. So this optimization may change the original semantics.

Thanks,
Kevin

A program cannot rely on a prior call to a malloc/free pair to suggest
that a subsequent call to malloc will succeed. In fact, a valid
implementation of a debug malloc might unconditionally report that the nth
call to malloc fails, in order to help find bugs in a program.
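
To make the debug-malloc point concrete, here is a minimal sketch of a fault-injecting allocator in C. The names debug_malloc and debug_malloc_fail_at are hypothetical and not taken from any real library; the sketch only illustrates that an allocator may legally fail any given call regardless of what earlier calls did.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fault-injecting allocator; all names are illustrative. */
static unsigned long debug_call_count = 0;
static unsigned long debug_fail_at = 0; /* 0 disables fault injection */

/* Arrange for the nth subsequent call to debug_malloc to fail. */
void debug_malloc_fail_at(unsigned long n) {
    debug_call_count = 0;
    debug_fail_at = n;
}

/* Behaves like malloc, except that the configured call returns NULL. */
void *debug_malloc(size_t size) {
    debug_call_count++;
    if (debug_fail_at != 0 && debug_call_count == debug_fail_at) {
        return NULL; /* injected failure */
    }
    return malloc(size);
}
```

With debug_malloc_fail_at(2), the first call is forwarded to malloc and the second returns NULL, so a successful allocate-and-free probe beforehand says nothing about the next allocation.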

I don’t think this is related to the programmer’s intention in using malloc. The C standard clearly states that malloc may return a null pointer. This semantic must be preserved; otherwise it would be an ABI-level breakage.

Thanks,
-Jiangning

Hi Jiangning,

Sorry, I don’t buy that argument. I don’t see why the compiler statically emulating the behaviour of a well behaved malloc/free pair is any different to it inlining a version of strcmp() (the library may have a strcmp that just returns -1 - the standard says it’s allowed to), or doing constant propagation with well known library calls such as fabs().

The non -ffreestanding behaviour is that the compiler knows it is sitting on top of a C library and knows roughly how a C library behaves. Granted, malloc() is one of the few C library functions with side effects that the compiler can do something with, but removing it completely is certainly a good thing.

Consider:

int *my_useless_buffer = malloc(LOTS);
for (n : X) {
  my_useless_buffer[0] += n;
}
free(my_useless_buffer);

The compiler would be expected to reduce my_useless_buffer to a single int and remove the malloc. I agree with David that -ffreestanding is the way to inform the compiler that it shouldn’t make any assumptions about malloc/free/strcmp/memcpy/memset…

Cheers,

James
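
The reduction James describes can be sketched by hand in C as a before/after pair. This is an illustration under stated assumptions, not compiler output: the function names are hypothetical, 1024 stands in for LOTS, an array plus a count stands in for X, and the buffer element is initialized explicitly so that the code is well defined.

```c
#include <stdlib.h>

/* As written: a heap buffer of which only element 0 is ever used. */
int sum_with_buffer(const int *values, size_t count) {
    int *buffer = malloc(1024 * sizeof *buffer); /* 1024 stands in for LOTS */
    if (buffer == NULL) {
        abort();
    }
    buffer[0] = 0;
    for (size_t i = 0; i < count; ++i) {
        buffer[0] += values[i];
    }
    int total = buffer[0];
    free(buffer);
    return total;
}

/* Hand-reduced form: the buffer collapses to one int and the
   malloc/free pair disappears. */
int sum_reduced(const int *values, size_t count) {
    int total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```

Both functions return the same sum whenever the allocation succeeds; the reduced form simply has no allocation left that could fail.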

There was some previous discussion on this optimization back in 2008:

https://groups.google.com/forum/#!topic/llvm-dev/lV30rcmF0ss

I found John Regehr's explanation helpful:

"To say that LLVM *assumes* that malloc() succeeds or fails is misleading. This misstatement may be the root of people's ongoing problems in understanding this transformation and its validity.

The right way to think about it is: LLVM is supplying an alternate implementation of malloc that happens to run at compile time, and happens to succeed all the time. This implementation is -- as I understand the C standard -- a perfectly legal one. It has nothing to do with the version of malloc() found in libc except that the two implementations share a common API."
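
As a rough source-level picture of that explanation, the sketch below shows a C function shaped like the @foo regression test next to what the folded result amounts to. The function names are hypothetical, and the second function is written by hand to mirror the test's expected "ret i1 false"; it is not actual compiler output.

```c
#include <stdlib.h>

/* Source-level shape of the @foo test: allocate, compare with NULL, free. */
int alloc_failed_runtime(void) {
    void *m = malloc(1);
    int failed = (m == NULL);
    free(m); /* free(NULL) is a no-op, so this is safe either way */
    return failed;
}

/* What the fold amounts to: the "compile-time malloc" always succeeds,
   so the comparison is false and no allocation call remains. */
int alloc_failed_folded(void) {
    return 0;
}
```

The runtime version may return 1 under a failing allocator, while the folded version always returns 0, which is exactly the behaviour the thread is debating.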