Global constructors "get lost" when transforming bitcode files

Hello,

A strange problem appears when upgrading from release_34 to testing. Some transformations to bitcode files cause registered global_ctors to not be called. Here’s an example (I’ve also attached the complete example and pasted it below):

This works:

clang -fsanitize=address -flto -c -o sum.o sum.c
clang -fsanitize=address -o sum sum.o

This doesn’t work:

clang -fsanitize=address -flto -c -o sum.o sum.c
llvm-dis sum.o
llvm-as sum.o.ll -o sum.o
clang -fsanitize=address -o sum sum.o

The second version segfaults when accessing shadow memory: the shadow memory was never initialized, because __asan_init* was never called. This is surprising, because the global constructor still shows up in the llvm-dis output.

The llvm-dis/llvm-as round trip should be a no-op, yet the global_ctors get lost in the process. This also happens with other operations that affect global_ctors, e.g., with “opt -insert-gcov-profiling”.
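A smaller check that does not depend on ASan may help isolate this. The sketch below is not part of the attached testcase, and the names ctor_ran, mark_ctor_ran and ctor_was_called are made up for illustration. It relies on the GCC/Clang constructor attribute, which Clang lowers to an entry in llvm.global_ctors, so the flag stays 0 if that entry is dropped on the way to the final binary.

```c
/* Set by the constructor below; stays 0 if constructors are skipped. */
static int ctor_ran = 0;

/* Runs before main() when the llvm.global_ctors entry is honored. */
__attribute__((constructor))
static void mark_ctor_ran(void) {
    ctor_ran = 1;
}

/* Returns 1 if the constructor has run, 0 otherwise. */
int ctor_was_called(void) {
    return ctor_ran;
}
```

Calling ctor_was_called() from main and printing the result, then building with the same -flto / llvm-dis / llvm-as steps as above, should show whether the constructor survived the round trip.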

The problem does not occur in the release_34 branch, but I have seen it on both testing and master.

Any idea where this could come from would be much appreciated!
Jonas

testcase:

cat >sum.c <<EOF
#include <stdio.h>
#include <assert.h>

int main() {
  const int MAX_SIZE = 100;
  int a[MAX_SIZE];

  for (int i = 0; i < MAX_SIZE; ++i) {
    a[i] = i * i + 4;
  }

  int n_numbers;
  printf("How many numbers should I sum up? ");
  scanf("%d", &n_numbers);

  int sum = 0;
  for (int i = 0; i < n_numbers; ++i) {
    sum += a[i];
  }

  assert(sum >= 0);
  printf("The sum is: %d\n", sum);
  return 0;
}
EOF

set -ex
rm -f *.o *.ll sum

clang -fsanitize=address -flto -c -o sum.o sum.c

llvm-dis sum.o
llvm-as sum.o.ll -o sum.o

clang -fsanitize=address -o sum sum.o

echo 22 | ./sum

test_ctors.sh (638 Bytes)

Hello,

I’ve narrowed the issue down to revision 209015: Add comdat key field to llvm.global_ctors and llvm.global_dtors.

This revision introduces an additional field to global constructors which, when non-null, can prevent constructors from being called. In my case, this field is always set to null, yet global constructors are not called.
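As a rough illustration of the rule described above, here is a C model of the two entry shapes. This is only a sketch: the real entries are LLVM IR structs, and the names ctor_entry_2, ctor_entry_3 and entry_should_run, as well as the data_discarded flag, are hypothetical and do not correspond to any LLVM API.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*ctor_fn)(void);

/* Old 2-field form: priority and initializer function only. */
struct ctor_entry_2 {
    int32_t priority;
    ctor_fn fn;
};

/* New 3-field form: adds a pointer to associated data (the comdat key). */
struct ctor_entry_3 {
    int32_t priority;
    ctor_fn fn;
    const void *data; /* NULL: the constructor is unconditional */
};

/*
 * Model of the rule: an entry with a NULL key always runs; an entry with
 * a non-NULL key runs only if its associated data was not discarded.
 * data_discarded is a stand-in for a decision the linker would make.
 */
int entry_should_run(const struct ctor_entry_3 *e, int data_discarded) {
    return e->data == NULL || !data_discarded;
}
```

With data set to NULL, as in the case described here, this model says the constructor should always run, which is why the observed behaviour is surprising.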

I’ve observed that this happens only on Mac, not on Linux. In the patch, I haven’t found anything OS-specific. However, the issue might be due to different linker implementations.

Does anybody have additional insights into this? Should I file a bug report?

Best regards,
Jonas

Probably a good idea.

Rafael and David were looking at this sort of thing when I checked last...

-eric

Reid added this feature, I bet he’d be interested.

Are you setting DYLD_LIBRARY_PATH when running on OS X? The linker on
OS X uses that to find libLTO.dylib.

At least for me setting DYLD_LIBRARY_PATH when running clang in the
linking stage fixes the problem.

I would still be curious to know how the old (system) libLTO.dylib can
read the bitcode output by clang but not llvm-as.

This isn't strictly true. Deserializing old bitcode will auto-upgrade it.
In this case, it will be auto-upgraded to the 3-field form of
llvm.global_ctors.

Hi,

I think I see what's going on: Clang will emit the new, 3-field form for
C++, but you're compiling C. That means there are no ctors until LLVM
runs, at which point it creates a 2-field form ctor array. It's likely
that C++ LTO with TOT clang and the system linker doesn't work right now,
but that's OK.