lld-link crash when build openssl with LTO

Hi Rui,

We hit an lld-link crash when building 32-bit OpenSSL 1.0 with LTO in UEFI firmware. We narrowed it down to a simple test case that reproduces the problem, shown below. Please advise. Thank you!

$ cat main.c
void TlsDriverEntryPoint ()
{
  unsigned char *ret = 0;
  const unsigned char cryptopro_ext[17] = {0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x01};
  int length = 17;
  const char *Source;
  Source = (void*)cryptopro_ext;
  while (length--) {
    *(ret++) = *(Source++);
  }
}

$ cat memcpy.c
typedef unsigned int size_t;
void *memcpy(void *dest, const void *src, size_t n)
{
  return 0;
}

$ cat makefile
CC_FLAGS = -Oz -flto -target i686-unknown-windows
CC = /home/jshi19/llvm/llvm-project/releaseinstall/bin/clang
DLINK_FLAGS = /MACHINE:X86 /DLL /ENTRY:TlsDriverEntryPoint
DLINK = /home/jshi19/llvm/llvm-project/releaseinstall/bin/lld-link
SLINK_FLAGS =
SLINK = /home/jshi19/llvm/llvm-project/releaseinstall/bin/llvm-lib
build:
	"$(CC)" $(CC_FLAGS) -c -o main.obj main.c
	"$(CC)" $(CC_FLAGS) -c -o memcpy.obj memcpy.c
	"$(SLINK)" $(SLINK_FLAGS) /OUT:main.lib main.obj
	"$(SLINK)" $(SLINK_FLAGS) /OUT:memcpy.lib memcpy.obj
	"$(DLINK)" /OUT:f.dll $(DLINK_FLAGS) main.lib memcpy.lib

$ make
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/clang" -Oz -flto -target i686-unknown-windows -c -o main.obj main.c
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/clang" -Oz -flto -target i686-unknown-windows -c -o memcpy.obj memcpy.c
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/llvm-lib" /OUT:main.lib main.obj
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/llvm-lib" /OUT:memcpy.lib memcpy.obj
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/lld-link" /OUT:f.dll /MACHINE:X86 /DLL /ENTRY:TlsDriverEntryPoint main.lib memcpy.lib
Stack dump:
  1. Program arguments: /home/jshi19/llvm/llvm-project/releaseinstall/bin/lld-link /OUT:f.dll /MACHINE:X86 /DLL /ENTRY:TlsDriverEntryPoint main.lib memcpy.lib
#0 0x000055f11ed8585a llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/jshi19/llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:498:0
#1 0x000055f11ed83684 llvm::sys::RunSignalHandlers() /home/jshi19/llvm/llvm-project/llvm/lib/Support/Signals.cpp:68:0
#2 0x000055f11ed837c2 SignalHandler(int) /home/jshi19/llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:357:0
#3 0x00007f172a5f2890 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12890)
#4 0x000055f11edd8025 lld::coff::DefinedRegular::getChunk() const /home/jshi19/llvm/llvm-project/lld/COFF/Symbols.h:176:0
#5 0x000055f11edd8025 operator() /home/jshi19/llvm/llvm-project/lld/COFF/MarkLive.cpp:46:0
#6 0x000055f11edd8025 lld::coff::markLive(llvm::ArrayRef<lld::coff::Chunk*>) /home/jshi19/llvm/llvm-project/lld/COFF/MarkLive.cpp:55:0
#7 0x000055f11edb763e std::vector<lld::coff::Chunk*, std::allocator<lld::coff::Chunk*> >::~vector() /usr/include/c++/7/bits/stl_vector.h:434:0
#8 0x000055f11edb763e lld::coff::LinkerDriver::link(llvm::ArrayRef<char const*>) /home/jshi19/llvm/llvm-project/lld/COFF/Driver.cpp:1840:0
#9 0x000055f11edb7d08 lld::coff::link(llvm::ArrayRef<char const*>, bool, llvm::raw_ostream&) /home/jshi19/llvm/llvm-project/lld/COFF/Driver.cpp:78:0
#10 0x000055f11ecfa044 main /home/jshi19/llvm/llvm-project/lld/tools/lld/lld.cpp:155:0
#11 0x00007f17290c9b97 __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/…/csu/libc-start.c:344:0
#12 0x000055f11ed555ba _start (/home/jshi19/llvm/llvm-project/releaseinstall/bin/lld-link+0x25a5ba)
Segmentation fault (core dumped)
makefile:12: recipe for target 'build' failed
make: *** [build] Error 139

Steven

Thanks

I’ve submitted a BZ for this issue as below:

Bug 42626 - lld-link crash when build openssl with LTO

https://bugs.llvm.org/show_bug.cgi?id=42626

Hi Steven,

One thing I noticed is that you are defining memcpy, and clang has an intrinsic with the same name. Can you try renaming it to some arbitrary name, like foobar, to see if the problem still exists?

Hi Rui,

For the test case in my previous email, if I rename memcpy to foobar in memcpy.c, lld-link reports a link error because it cannot find the _memcpy symbol, as shown below. In UEFI firmware, we have to implement these compiler intrinsic functions ourselves.
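
(Editor's sketch, not part of the original message.) For readers unfamiliar with freestanding builds, a working replacement for the stub in memcpy.c might look roughly like the following. The name fw_memcpy is hypothetical and chosen only so the sketch can be compiled next to a hosted C library; a firmware build would presumably export it as memcpy so that compiler-generated calls resolve to it. The regions are assumed not to overlap.

```c
#include <stddef.h>

/* Hypothetical name: a real firmware build would export this as `memcpy`. */
void *fw_memcpy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /* Plain byte-by-byte copy; assumes the two regions do not overlap. */
    while (n--) {
        *d++ = *s++;
    }

    /* Like the standard memcpy, return the destination pointer. */
    return dest;
}
```

An optimizer may in principle recognize a copy loop like this and turn it back into a memcpy call, so files that provide such routines are commonly built with -fno-builtin or -ffreestanding.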

jshi19@ub2-uefi-b01:~/llvm/wrongcode/lld-link3$ make
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/clang" -Oz -flto -target i686-unknown-windows -c -o main.obj main.c
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/clang" -Oz -flto -target i686-unknown-windows -c -o memcpy.obj memcpy.c
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/llvm-lib" /OUT:main.lib main.obj
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/llvm-lib" /OUT:memcpy.lib memcpy.obj
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/lld-link" /OUT:f.dll /MACHINE:X86 /DLL /ENTRY:TlsDriverEntryPoint main.lib memcpy.lib
lld-link: error: undefined symbol: _memcpy
referenced by lto.tmp:(_TlsDriverEntryPoint)
makefile:9: recipe for target 'build' failed
make: *** [build] Error 1

Thanks

Steven

lld should not crash in this case (so that’s a bug that needs fixing), but setting it aside, did you try adding -fno-builtin to clang so that clang doesn’t handle memcpy as a built-in function?

In my previous test case, after adding -fno-builtin to the clang flags and rebuilding, lld-link still crashes in the same way, as shown below:

$ make
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/clang" -Oz -flto -target i686-unknown-windows -fno-builtin -c -o main.obj main.c
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/clang" -Oz -flto -target i686-unknown-windows -fno-builtin -c -o memcpy.obj memcpy.c
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/llvm-lib" /OUT:main.lib main.obj
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/llvm-lib" /OUT:memcpy.lib memcpy.obj
"/home/jshi19/llvm/llvm-project/releaseinstall/bin/lld-link" /OUT:f.dll /MACHINE:X86 /DLL /ENTRY:TlsDriverEntryPoint main.lib memcpy.lib
Stack dump:
  1. Program arguments: /home/jshi19/llvm/llvm-project/releaseinstall/bin/lld-link /OUT:f.dll /MACHINE:X86 /DLL /ENTRY:TlsDriverEntryPoint main.lib memcpy.lib
#0 0x000056348d5c185a llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/jshi19/llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:498:0
#1 0x000056348d5bf684 llvm::sys::RunSignalHandlers() /home/jshi19/llvm/llvm-project/llvm/lib/Support/Signals.cpp:68:0
#2 0x000056348d5bf7c2 SignalHandler(int) /home/jshi19/llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:357:0
#3 0x00007f200467a890 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12890)
#4 0x000056348d614025 lld::coff::DefinedRegular::getChunk() const /home/jshi19/llvm/llvm-project/lld/COFF/Symbols.h:176:0
#5 0x000056348d614025 operator() /home/jshi19/llvm/llvm-project/lld/COFF/MarkLive.cpp:46:0
#6 0x000056348d614025 lld::coff::markLive(llvm::ArrayRef<lld::coff::Chunk*>) /home/jshi19/llvm/llvm-project/lld/COFF/MarkLive.cpp:55:0
#7 0x000056348d5f363e std::vector<lld::coff::Chunk*, std::allocator<lld::coff::Chunk*> >::~vector() /usr/include/c++/7/bits/stl_vector.h:434:0
#8 0x000056348d5f363e lld::coff::LinkerDriver::link(llvm::ArrayRef<char const*>) /home/jshi19/llvm/llvm-project/lld/COFF/Driver.cpp:1840:0
#9 0x000056348d5f3d08 lld::coff::link(llvm::ArrayRef<char const*>, bool, llvm::raw_ostream&) /home/jshi19/llvm/llvm-project/lld/COFF/Driver.cpp:78:0
#10 0x000056348d536044 main /home/jshi19/llvm/llvm-project/lld/tools/lld/lld.cpp:155:0
#11 0x00007f2003151b97 __libc_start_main /build/glibc-OTsEL5/glibc-2.27/csu/…/csu/libc-start.c:344:0
#12 0x000056348d5915ba _start (/home/jshi19/llvm/llvm-project/releaseinstall/bin/lld-link+0x25a5ba)
Segmentation fault (core dumped)
makefile:9: recipe for target 'build' failed
make: *** [build] Error 139

Thanks

Steven

Yeah, it crashes indeed. I can reproduce the problem locally. Let me see what is going on.

Teresa,

It looks like even if we compile the source files with -fno-builtin and one of them has a definition of memcpy, LTO uses the builtin memcpy instead of the user-supplied one. Is this intended behavior?

Usage of the builtin appears independent of LTO, see below.

With any of -fno-builtin, -fno-builtin-memcpy, and -ffreestanding, which are all typically used to prevent usage of memcpy calls, we still always get a memcpy builtin in TlsDriverEntryPoint(). I see this even without -flto (e.g. try with just -emit-llvm).

I guess it is because this memcpy is not coming from the original source, but rather from the initialization code created by clang for the cryptopro_ext local variable. The code that generates that must not honor -fno-builtin. I see this even when I remove -flto (and this gets converted to a call to _memcpy in the final assembly with or without -fno-builtin).
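
(Editor's sketch, not part of the original message.) The pattern described above can be reduced to a local array with a constant initializer. The function and variable names below are hypothetical; the assumption, following the explanation above, is that clang may materialize the initializer as a constant global and copy it into the stack slot with a memcpy builtin, even though the source contains no memcpy call.

```c
/* Hypothetical reduction of the cryptopro_ext pattern from main.c. */
unsigned char last_byte(void)
{
    /* Local array with a constant initializer: the compiler, not the
       source, is what may introduce a memcpy here. */
    const unsigned char table[17] = {
        0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00,
        0x00, 0x00, 0x00, 0x00,
        0x01
    };

    /* A volatile index discourages the optimizer from folding the
       array away entirely. */
    volatile int i = 16;
    return table[i];
}
```

Compiling a file like this with clang -S -emit-llvm, with and without -fno-builtin, is one way to check whether the copy shows up in the IR on a given clang version.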

I can’t do the full LTO link with these options (don’t have a windows linker I guess?), and have been unsuccessful in getting the failure with various other options I tried. I wanted to look at the merged LTO code with save-temps.

What happens to the builtin created by clang in LTO mode that causes lld to seg fault?

Teresa

I added some analysis to the bug. I think (but am not 100% confident) that we are lazily loading the memcpy bitcode file from an archive after LTO has already happened, leading to a crash later. I guess LLD should have some kind of check that ensures we don’t load LTO objects after LTO has already run.
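
(Editor's sketch, not part of the original message.) The kind of check suggested above can be illustrated with a toy model. This is not lld code: every type and function name here is hypothetical, and the model only captures the idea that a bitcode archive member should be rejected, rather than silently loaded, once the LTO step has finished.

```c
#include <stdbool.h>

/* Hypothetical classification of an archive member. */
enum member_kind { MEMBER_NATIVE, MEMBER_BITCODE };

/* Hypothetical linker state: only tracks whether LTO has already run. */
struct link_state {
    bool lto_done;
};

/* Returns true if the member may still be loaded. A bitcode member that
   is pulled in lazily after LTO can no longer be compiled, so the toy
   linker refuses it instead of crashing later. */
bool may_load_member(const struct link_state *st, enum member_kind kind)
{
    if (st->lto_done && kind == MEMBER_BITCODE) {
        return false;
    }
    return true;
}
```

In this model, the reference to _memcpy that appears only after LTO code generation would reach may_load_member with lto_done set and a bitcode member, and the linker could then report an error instead of continuing with a symbol that has no backing chunk.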

You don’t need a Windows dev environment to build the given program. On Unix, you can just build clang and lld normally and run clang-cl and lld-link with the same arguments as you’d give to the command on Windows, and they work fine (as long as your program doesn’t use Windows headers nor libraries).

Yeah I’m not sure what I did wrong. In any case, pcc updated the bug with some ELF patches that need porting to COFF.