Is dropping new[] a valid optimization?

Hello!

While trying to build the ACE framework [1] using clang 3.2 with -O2 optimization, the configure script hung while checking whether operator new[] throws a std::bad_alloc exception (or just returns NULL) on failure.

$ cat a.cpp
#include <stdexcept>

int main()
{
     for (;;) {
         try {
             char *p = new char[1024];
             if (p == NULL) {
                 return 1; // bad
             }
         }
         catch (std::bad_alloc&) {
             return 0; // good
         }
     }
     return 0;
}

$ /opt/clang-clang/bin/clang++ -v -O2 a.cpp
clang version 3.2 (branches/release_32 174320) (llvm/branches/release_32 174317)
Target: x86_64-unknown-linux-gnu
Thread model: posix
  "/opt/clang-clang/bin/clang" -cc1 -triple x86_64-unknown-linux-gnu -emit-obj -disable-free -disable-llvm-verifier -main-file-name a.cpp -mrelocation-model static -fmath-errno -masm-verbose -mconstructor-aliases -munwind-tables -target-cpu x86-64 -target-linker-version 2.17.50.0.6 -momit-leaf-frame-pointer -v -resource-dir /opt/clang-clang/bin/../lib/clang/3.2 -fmodule-cache-path /var/tmp/clang-module-cache -internal-isystem /usr/lib/gcc/x86_64-redhat-linux6E/4.4.6/../../../../include/c++/4.4.6 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux6E/4.4.6/../../../../include/c++/4.4.6/x86_64-redhat-linux6E -internal-isystem /usr/lib/gcc/x86_64-redhat-linux6E/4.4.6/../../../../include/c++/4.4.6/backward -internal-isystem /usr/local/include -internal-isystem /opt/clang-clang/bin/../lib/clang/3.2/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -O2 -fdeprecated-macro -fdebug-compilation-dir /tb/builds/thd/sbn/2.6/src/share/package -ferror-limit 19 -fmessage-length 237 -mstackrealign -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -o /tmp/a-K2ND4b.o -x c++ a.cpp
clang -cc1 version 3.2 based upon LLVM 3.2svn default target x86_64-unknown-linux-gnu
ignoring nonexistent directory "/include"
#include "..." search starts here:
#include <...> search starts here:
  /usr/lib/gcc/x86_64-redhat-linux6E/4.4.6/../../../../include/c++/4.4.6
  /usr/lib/gcc/x86_64-redhat-linux6E/4.4.6/../../../../include/c++/4.4.6/x86_64-redhat-linux6E
  /usr/lib/gcc/x86_64-redhat-linux6E/4.4.6/../../../../include/c++/4.4.6/backward
  /usr/local/include
  /opt/clang-clang/bin/../lib/clang/3.2/include
  /usr/include
End of search list.
  "/usr/bin/ld" --hash-style=gnu --no-add-needed --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o a.out /usr/lib/gcc/x86_64-redhat-linux6E/4.4.6/../../../../lib64/crt1.o /usr/lib/gcc/x86_64-redhat-linux6E/4.4.6/../../../../lib64/crti.o /usr/lib/gcc/x86_64-redhat-linux6E/4.4.6/crtbegin.o -L/usr/lib/gcc/x86_64-redhat-linux6E/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux6E/4.4.6/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/gcc/x86_64-redhat-linux6E/4.4.6/../../.. -L/lib -L/usr/lib /tmp/a-K2ND4b.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /usr/lib/gcc/x86_64-redhat-linux6E/4.4.6/crtend.o /usr/lib/gcc/x86_64-redhat-linux6E/4.4.6/../../../../lib64/crtn.o

$ objdump -d a.out | c++filt | grep -A 3 '<main>'
00000000004004c0 <main>:
   4004c0: eb fe jmp 4004c0 <main>
   4004c2: 90 nop
   4004c3: 90 nop
   4004c4: 90 nop

So it optimized the program into an infinite loop, removing calls to a function (operator new[]()) that has side effects.
Or am I missing something?

Thanks!

[1] http://www.cs.wustl.edu/~schmidt/ACE.html

The optimizer is making not-strictly-standard assumptions about the
behavior of global operator new[] and the merits of intentionally triggering
an out-of-memory condition with a leak vs. promoting leaked allocations
to the stack. You can disable these assumptions with -fno-builtin, avoid
them by compiling at -O0, or work around them by assigning each 'p' in
turn to a volatile global variable, which will stop the compiler from realizing
that they leak.
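
The suggested workaround can be sketched like this (a hypothetical reworking of the configure probe, not the actual ACE test):

```cpp
#include <cstddef>
#include <new>

// Volatile global sink: because the pointer object itself is volatile,
// each store to it is an observable side effect, so the compiler cannot
// prove the allocations are dead and drop them.
char *volatile g_sink;

// Probe whether operator new[] throws std::bad_alloc on failure
// (returns 0), returns NULL (returns 1), or simply succeeds (returns 2).
int probe_new_array() {
    try {
        char *p = new char[1024];
        g_sink = p;             // observable store keeps the allocation alive
        if (g_sink == NULL)
            return 1;           // pre-standard new[]: returned NULL
        delete[] p;             // freed so this sketch itself does not leak
    } catch (std::bad_alloc &) {
        return 0;               // conforming new[]: throws on failure
    }
    return 2;                   // allocation succeeded
}
```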

John.

Yes, making p a 'static volatile char *' fixed the problem, thank you!

Do I understand you correctly that the compiler did the following:
1. replaced 'new char[1024]' with something like 'alloca(1024)'
2. since alloca() cannot fail, it removed the try/catch and if() statements.
3. since the result isn't used, the 'alloca()' call was removed as well.

Well, for 1024 bytes I think that's reasonable, but I can see the same behavior when changing 1024 to something bigger, like 2^30, which makes this transformation wrong from my point of view.

BTW, I used to think that -O2 was a more or less safe optimization level.

>> The optimizer is making not-strictly-standard assumptions about the
>> behavior of global operator new[] and the merits of intentionally triggering
>> an out-of-memory condition with a leak vs. promoting leaked allocations
>> to the stack. You can disable these assumptions with -fno-builtin, avoid
>> them by compiling at -O0, or work around them by assigning each 'p' in
>> turn to a volatile global variable, which will stop the compiler from realizing
>> that they leak.

> Yes, making p a 'static volatile char *' fixed the problem, thank you!

Making it "char * volatile" would be a more stable workaround; it should
prevent the optimizer from reasoning about the store at all.

> Do I understand you correctly that the compiler did the following:
> 1. replaced 'new char[1024]' with something like 'alloca(1024)'
> 2. since alloca() cannot fail, it removed the try/catch and if() statements.
> 3. since the result isn't used, the 'alloca()' call was removed as well.

Something approximately like this, although I don't know the details.

> Well, for 1024 bytes I think that's reasonable, but I can see the same behavior when changing 1024 to something bigger, like 2^30, which makes this transformation wrong from my point of view.

Yes, I certainly agree that arbitrary allocations should not be moved to the stack. I don't know the limits on this optimization. It may be that it wouldn't normally turn the heap allocation into a stack allocation at all, and that it removes the allocation entirely only because the result is obviously unused.

John.

Quoting Dmitri Shubin <sbn@tbricks.com>:

>> The optimizer is making not-strictly-standard assumptions about the
>> behavior of global operator new[] and the merits of intentionally triggering
>> an out-of-memory condition with a leak vs. promoting leaked allocations
>> to the stack. You can disable these assumptions with -fno-builtin, avoid
>> them by compiling at -O0, or work around them by assigning each 'p' in
>> turn to a volatile global variable, which will stop the compiler from realizing
>> that they leak.

> Yes, making p a 'static volatile char *' fixed the problem, thank you!

> Do I understand you correctly that the compiler did the following:
> 1. replaced 'new char[1024]' with something like 'alloca(1024)'
> 2. since alloca() cannot fail, it removed the try/catch and if() statements.
> 3. since the result isn't used, the 'alloca()' call was removed as well.

That's not really what's happening.
The call to 'new' (or malloc or any other memory allocation function) is removed if the only uses of the return value are the following:
  - comparisons with NULL
  - calls to free() / delete

There is an optimization that can promote a heap allocation to a stack allocation, but only for small allocation sizes, of course.
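
The rule above can be illustrated side by side (an illustrative sketch; the function names are invented):

```cpp
#include <new>

// Only uses of the return value are a null comparison and delete[]:
// by the rule above, the optimizer is allowed to remove the
// allocation call entirely (assuming global operator new[] is not replaced).
bool alloc_check_only() {
    char *p = new (std::nothrow) char[64];
    bool ok = (p != 0);
    delete[] p;
    return ok;
}

// Here the pointer escapes through a volatile object, so the store is an
// observable side effect and the allocation must be kept.
char *volatile escape_sink;

bool alloc_escapes() {
    char *p = new (std::nothrow) char[64];
    escape_sink = p;
    bool ok = (escape_sink != 0);
    delete[] p;
    return ok;
}
```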

Nuno