Customize Standard C Library Using LLVM (to support llvm backend optimization)

Purpose:

I implemented a pass in the LLVM backend that changes the ARM assembly/binary output (e.g., it adds a jump at the end of each basic block to eliminate fall-through). By calling:

llc -march=arm somefile.bc

it generates the expected ARM assembly/binary, which runs properly on ARM GNU/Linux (I use qemu-arm and gem5 to simulate it). Now I want to apply the same transformation to the standard C library, but I have run into problems.
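To make the transformation concrete, here is a toy model of it in C. It is only a sketch and not LLVM code: the `toy_block` type and both function names are made up for illustration, and each block is reduced to a label plus its last instruction. The real pass would work on LLVM's machine-level representation rather than on strings.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: a "block" is a label plus its last instruction.
 * None of these types come from LLVM; they are hypothetical. */
struct toy_block {
    const char *label;
    const char *last_insn;   /* e.g. "add r0, r0, #1", "b .LBB0_3", "bx lr" */
};

/* Returns 1 if control can never fall off the end of the block.
 * Only "b <label>" and "bx lr" are recognized in this sketch. */
static int ends_in_unconditional_transfer(const struct toy_block *bb)
{
    return strncmp(bb->last_insn, "b ", 2) == 0 ||
           strcmp(bb->last_insn, "bx lr") == 0;
}

/* Prints the blocks, appending "b <next label>" to every block that would
 * otherwise fall through. Returns the number of branches added. */
static int emit_without_fallthrough(const struct toy_block *blocks, size_t n,
                                    FILE *out)
{
    int added = 0;
    assert(blocks != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        fprintf(out, "%s:\n\t%s\n", blocks[i].label, blocks[i].last_insn);
        if (i + 1 < n && !ends_in_unconditional_transfer(&blocks[i])) {
            fprintf(out, "\tb %s\n", blocks[i + 1].label);
            added++;
        }
    }
    return added;
}
```

Conditional branches such as `bne` and calls such as `bl` do not match the `"b "` prefix, so the sketch treats them as able to fall through and appends an explicit branch after them.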

Problems:

According to:

[http://article.gmane.org/gmane.comp.compilers.llvm.devel/77025](http://article.gmane.org/gmane.comp.compilers.llvm.devel/77025) 
[https://llvm.org/svn/llvm-project/llvm/tags/RELEASE_14/docs/OpenProjects.html#glibc](https://llvm.org/svn/llvm-project/llvm/tags/RELEASE_14/docs/OpenProjects.html#glibc)   

compiling glibc using llvm may not be a proper option. On the other hand, according to:

[http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-January/047088.html](http://lists.cs.uiuc.edu/pipermail/llvmdev/2012-January/047088.html)

llvm could be able to compile newlib, thus people consider newlib as an alternative. However, according to:

[http://www.embecosm.com/appnotes/ean9/ean9-howto-newlib-1.0.html#id2711887](http://www.embecosm.com/appnotes/ean9/ean9-howto-newlib-1.0.html#id2711887)

newlib is intended to support bare-metal (no OS) software. It implements only the hardware-independent parts (e.g., libc and libm) and leaves a stub for each hardware-dependent syscall (e.g., everything in libgloss).
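As a rough illustration of what such stubs look like, here is a minimal sketch in C. The names `stub_write` and `stub_sbrk` are hypothetical: a real newlib port would provide these hooks under names like `_write` and `_sbrk`, and a real `_sbrk` would usually grow the heap from a linker-provided symbol rather than from the fixed array assumed here. They are renamed so the sketch also compiles on a hosted system without clashing with the C library.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical names: a real newlib port would call these _write and _sbrk.
 * They are renamed here so the sketch also compiles on a hosted system. */

/* "Not implemented" stub: report ENOSYS, as a port with no output device
 * might do. */
int stub_write(int fd, const char *buf, int len)
{
    (void)fd;
    (void)buf;
    (void)len;
    errno = ENOSYS;
    return -1;
}

/* Minimal heap-growth hook backed by a fixed array (an assumption of this
 * sketch) instead of a linker symbol. */
static char toy_heap[4096];
static size_t toy_heap_used;

void *stub_sbrk(ptrdiff_t incr)
{
    assert(incr >= 0);   /* this sketch never shrinks the heap */
    if ((size_t)incr > sizeof toy_heap - toy_heap_used) {
        errno = ENOMEM;
        return (void *)-1;
    }
    void *prev = &toy_heap[toy_heap_used];
    toy_heap_used += (size_t)incr;
    return prev;
}
```

With stubs like the first one, the hardware-independent code links, but any call that reaches the stub fails at run time until the port supplies a real implementation.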

In fact, I tried compiling a simple "hello world" C program using arm-none-eabi-gcc, which was configured with the "--with-newlib" option, but the program ends with a segmentation fault on both qemu-arm and gem5.

Questions:

I'm not sure whether newlib is compatible with glibc. Could I use llvm to cross-compile the machine-independent parts of newlib (changing the ARM output format at the same time), use arm-none-linux-gnueabi-gcc to cross-compile the machine-dependent parts of glibc, and then put the two together to produce my own standard C library?

There might be mistakes/misunderstandings in my work. Are there any other possible methods that could add my changes to at least part of the standard c libraries, and make the program run on qemu-arm or gem5?

llvm could be able to compile newlib, thus people consider newlib as an
alternative. However, according to:

FWIW, I build baremetal newlib for arm-eabi using clang, and it works. I had to patch a few of the __attribute__((naked)) functions because they were using pre-UAL asm syntax, but for the most part it "just works".

[Howto: Porting newlib](http://www.embecosm.com/appnotes/ean9/ean9-howto-newlib-1.0.html#id2711887)

newlib intends to support binaries for bare metal (no OS) software. It
implements only the hardware independent parts (e.g libc and libm) and
leave a stub for each hardware dependent syscall (e.g everything in
libgloss).

Have you considered trying musl? It's supposed to be a full replacement for glibc.

In fact I tried to compile a simple "hello world" c program using
arm-none-eabi-gcc which was configured with "--with-newlib" option, the
program execution ends up with segmentation faults on both qemu-arm and
gem5.

Have you run it in a debugger to figure out *why* it is segfaulting? Have you tried building it without your special pass?

Cheers,

Jon

compiling glibc using llvm may not be a proper option.

Not an option /yet/, but it's a work in progress, by the sounds of it.
Kostya's been working on it recently & it sounds like there's one or two
things in the source code to tackle, then some build issues?

Have you sent this patch upstream? This is the kind of thing we have
to get rid of, for the sake of both Clang and GCC.

cheers,
--renato

Not yet :/

It's on my TODO list.

Jon

I've been successfully building glibc's libc.so with clang (and asan!) for
the past two weeks,
but the process is manual and still requires a few patches and ugly hacks.
https://sourceware.org/glibc/wiki/GlibcMeetsClang

FWIW, I build baremetal newlib for arm-eabi using clang, and it works. I
had to patch a few of the __attribute__((naked)) functions because they
were using pre-UAL asm syntax, but for the most part it "just works".

I built the baremetal newlib using arm-none-eabi-gcc as well, but after
linking it with the hello world program, the result failed to run on both
qemu-arm and gem5.

Have you considered trying musl? It's supposed to be a full replacement
for glibc.

It looks like a nice alternative, I'll certainly look into it right away.

Have you run it in a debugger to figure out *why* it is segfaulting?
Have you tried building it without your special pass?

Well, I didn't try clang. I used arm-none-eabi-gcc to build both newlib and
the hello world program. arm-none-eabi-gdb says "Don't know how to
run. Try "help target"."

I can confirm that musl builds and works correctly with clang/llvm. We are using musl as a libc for our architecture.
It has a much smaller code footprint than newlib or glibc.

Regards,

Richard

gdb doesn't know how to run your program because gdb only knows how to run programs natively. You need to run it in qemu, and attach gdb to that, then debug as if you were debugging natively.

I do something roughly like this (this is from memory, so the actual commands might be slightly different. ymmv):

    $ cat hw.c
    #include <stdio.h>
    int main() { printf("hello world!\n"); }
    $ clang -target arm-none-eabi -march=armv4t hw.c -T generic-hosted.ld
    $ qemu-system-arm -semihosting -M integratorcp -cpu arm1026 -kernel a.out -s -S &
    $ arm-none-eabi-gdb
    (gdb) target remote localhost:1234
    (gdb) file a.out
    (gdb) break main
    (gdb) c
    hello world!
    (gdb) q
    $

HTH,

Jon

I learned a lot from this, thanks!

I successfully cross-compiled musl-libc using clang, with the
configuration:

    CC=clang CFLAGS=--target=arm-none-linux-gnueabi\ --sysroot=/usr/local/arm-2009q\ -I\ /usr/local/arm-2009q3/arm-none-linux-gnueabi/libc/usr/include/ LIBCC= ./configure --target=arm

But how do I force clang to use musl-libc rather than its default libc? I
tried:

    clang --target arm-none-linux-gnueabi -nostdlib -static hello.c \
        ~/research/musl-1.1.6/lib/crt1.o ~/research/musl-1.1.6/lib/crti.o \
        ~/research/musl-1.1.6/lib/crtn.o -I ~/research/musl-1.1.6/include/ \
        -L ~/research/musl-1.1.6/lib/ \
        -L /usr/local/arm-2009q3/lib/gcc/arm-none-linux-gnueabi/4.4.1/ \
        -lc -lgcc -lgcc_eh

clang complained:

    /usr/local/arm-2009q3/bin/arm-none-linux-gnueabi-ld: warning: library search path "/lib/../lib" is unsafe for cross-compilation
    /usr/local/arm-2009q3/bin/arm-none-linux-gnueabi-ld: warning: library search path "/usr/lib/../lib" is unsafe for cross-compilation
    /usr/local/arm-2009q3/bin/arm-none-linux-gnueabi-ld: warning: library search path "/lib" is unsafe for cross-compilation
    /usr/local/arm-2009q3/bin/arm-none-linux-gnueabi-ld: warning: library search path "/usr/lib" is unsafe for cross-compilation
    /home/yanchao/research/musl-1.1.6/lib//libc.a(__libc_start_main.o): In function `__libc_start_main':
    src/env/__libc_start_main.c:(.text+0x30): undefined reference to `__aeabi_memset'
    /home/yanchao/research/musl-1.1.6/lib//libc.a(vfprintf.o): In function `vfprintf':
    src/stdio/vfprintf.c:(.text+0x28): undefined reference to `__aeabi_memset'
    /usr/local/arm-2009q3/lib/gcc/arm-none-linux-gnueabi/4.4.1//libgcc.a(_dvmd_lnx.o): In function `__aeabi_ldiv0':
    (.text+0x8): undefined reference to `raise'
    /usr/local/arm-2009q3/lib/gcc/arm-none-linux-gnueabi/4.4.1//libgcc_eh.a(unwind-arm.o): In function `unwind_phase2':
    unwind-arm.c:(.text+0xae4): undefined reference to `abort'
    /usr/local/arm-2009q3/lib/gcc/arm-none-linux-gnueabi/4.4.1//libgcc_eh.a(unwind-arm.o): In function `__gnu_Unwind_Resume':
    unwind-arm.c:(.text+0xbe8): undefined reference to `abort'
    unwind-arm.c:(.text+0xc10): undefined reference to `abort'
    /usr/local/arm-2009q3/lib/gcc/arm-none-linux-gnueabi/4.4.1//libgcc_eh.a(pr-support.o): In function `_Unwind_GetTextRelBase':
    pr-support.c:(.text+0x4): undefined reference to `abort'
    /usr/local/arm-2009q3/lib/gcc/arm-none-linux-gnueabi/4.4.1//libgcc_eh.a(pr-support.o): In function `_Unwind_GetDataRelBase':
    pr-support.c:(.text+0xc): undefined reference to `abort'

Regards,
Chao

2015-03-11 16:22 GMT-05:00 Richard Gorton <rcgorton@cognitive-electronics.com>:

    I can confirm that musl builds and works correctly with clang/llvm.
    We are using musl as a libc for our architecture.
    It has a much smaller code footprint than newlib or glibc.

You need to build a sysroot from the lib and include directories in ~/research/musl-1.1.6 combined with the same folders from /usr/local/arm-2009q, then use `--sysroot` instead of the `-I`s and `-L`s.

I copied everything from the lib directory in musl-1.1.6
to arm-2009q3/arm-none-linux-gnueabi/libc/usr/lib, but there is still an
error message:

src/env/__libc_start_main.c:(.text+0x40): undefined reference to `__aeabi_memset'
collect2: ld returned 1 exit status

It seems that I need an arm-none-linux-gcc-4.9 (the current version from
arm-2009q3 is gcc-4.4) to get the `__aeabi_memset' function?

Regards,
Chao

2015-03-12 10:49 GMT-05:00 Jonathan Roelofs <jonathan@codesourcery.com>:

    You need to build a sysroot from the lib and include directories in
    ~/research/musl-1.1.6 combined with the same folders from
    /usr/local/arm-2009q, then use `--sysroot` instead of the `-I`s and
    `-L`s.

This isn't a gcc support mailing list... If you have questions pertaining to llvm/clang, we'd be happy to answer them.

That being said, that symbol should be part of the compiler's runtime library (or maybe in the libc?). llvm provides that symbol as part of libclangrt/libcompiler_rt.
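For context on what that symbol is: the ARM run-time ABI declares `__aeabi_memset` with the size before the fill value, which is the reverse of the standard `memset`. The sketch below shows that calling convention under a hypothetical name, `sketch_aeabi_memset`, so it does not collide with a real runtime. It is only an illustration of the interface, not the implementation found in compiler-rt, libgcc, or any libc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name; the real symbol is __aeabi_memset and normally comes
 * from the compiler runtime or libc, not from application code.
 * Per the ARM run-time ABI, the size comes before the fill value. */
void sketch_aeabi_memset(void *dest, size_t n, int c)
{
    assert(dest != NULL);
    memset(dest, c, n);   /* note the swapped order of n and c */
}
```

The compiler can emit calls to this helper on its own when targeting ARM EABI, which is why the undefined reference shows up inside libc objects that never mention it in their source.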

Jon

2015-03-12 15:07 GMT-05:00 Jonathan Roelofs <jonathan@codesourcery.com>:

    This isn't a gcc support mailing list... If you have questions
    pertaining to llvm/clang, we'd be happy to answer them.

    That being said, that symbol should be part of the compiler's
    runtime library (or maybe in the libc?). llvm provides that symbol
    as part of libclangrt/libcompiler_rt.

The problem is basically caused by llvm. I cross-compiled musl-libc
to ARM binaries using clang. However, the cross-compiled musl-libc needs
runtime support from libcompiler_rt. The ARM version of this static
runtime library cannot be obtained by cross-compiling.

Where did you read that? Because it's plainly not true.

libcompiler_rt is *designed* to be cross-built, and building the ARM version of it from x86 Linux/Darwin is well exercised by several members of the community, myself included.

Do I need to build llvm on an arm host machine?

No. LLVM/Clang is a cross compiler. Building host binaries is just a special, easier case of that.

Jon

Thank you for helping, Jon. I'm still learning the llvm cross-compiling
stuff, so sometimes I misunderstand things. I tried "make clang_linux" in
the compiler_rt directory, but it only builds "libcompiler_rt.a" for
x86_64, which is my host machine's ISA. Could you please tell me how I can
build the ARM version? I tried modifying "make/platform/arm_linux.mk"
and "make/options.mk", but neither change worked.

Thanks,
Chao