dragonegg adding extra characters to function names

Hello,

I’m looking at compiling some pieces of the standard library with llvm but I’m running into problems with some functions being renamed by dragonegg. For example, when I compile the acos implementation with plain gcc I get:

$ nm acos.o
0000000000000000 r .LC1
0000000000000048 r .LC10
0000000000000050 r .LC11
0000000000000058 r .LC12
0000000000000060 r .LC13
0000000000000068 r .LC14
0000000000000070 r .LC15
0000000000000008 r .LC2
0000000000000010 r .LC3
0000000000000018 r .LC4
0000000000000020 r .LC5
0000000000000028 r .LC6
0000000000000030 r .LC7
0000000000000038 r .LC8
0000000000000040 r .LC9
0000000000000000 T __GI_acos
0000000000000000 T __ieee754_acos
                 U __ieee754_sqrt
0000000000000000 T acos

but when I compile with dragonegg, I get:

$ llvm-nm acos.bc.o
          T __GI_acos
          T acos
          T __ieee754_acos
          U __ieee754_sqrt

Why does LLVM do this? I assume it has something to do with LLVM treating the function “acos” as special and not wanting it to be redefined; is that correct? I’ve been running a post-processing pass that finds functions and function references with the extra leading character and removes it, but this kind of messes up my workflow. Is there any way to tell LLVM not to do this?

Thanks.

Hi Gregory,

I don't see any functions being renamed: the names seem to be the same just
in a different order...

Ciao, Duncan.

Hi Duncan,

Ah, non-unicode email… In the llvm output there should be a “1” character, i.e. (char) 0x01, prepended to acos and __GI_acos. I’m unable to get it on smaller things, but it happens when I try to compile uClibc with llvm. I’ve attached the .o and the .bc for comparison. The text file is the result after preprocessing (to avoid having to download a bunch of stuff).

Here is the compile line that I’m running for the llvm compilation:
llvm-gcc -emit-llvm -S -c pp.c -o /tmp/tmpU_bCHo.ll -c
llvm-as /tmp/tmpU_bCHo.ll -o=pp.bc.o
(note that llvm-gcc is gcc-4.5 set up to generate LLVM assembly when it gets -emit-llvm).

The non-llvm compilation is from:
gcc-4.5 -c pp.c -o pp.o

I thought this was something coded into llvm, but maybe it is just my scripts messing things up somehow…

Attachments: pp.bc.o (1.84 KB), pp.o (4.23 KB), pp.c (41.4 KB)

Hi Gregory,

This is normal, and the character should not turn up in the final assembler
output. Internally, GCC uses a leading "*" and LLVM uses a leading "\1" to
indicate that the symbol should appear in the assembler as is (without the
leading "*" or "\1"), and not with something prepended to it (on some
platforms, for example, an underscore would normally be prepended to all
symbol names). Since llvm-nm works on bitcode, it shows you the bitcode names,
including the leading "\1" (maybe this is an llvm-nm bug?). However, the code
generators should eliminate it.
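
As a rough illustration of that rule, here is a simplified Python sketch. It is
not LLVM's actual code: the function name assembler_name and the global_prefix
parameter are made up for the example, and real name mangling handles more
cases than this.

```python
def assembler_name(ir_name: str, global_prefix: str = "") -> str:
    """Simplified model of how a bitcode-level name maps to an assembler name.

    Hypothetical helper for illustration only; not part of LLVM or dragonegg.
    """
    # A leading "\1" (byte 0x01) means "emit this name exactly as written".
    if ir_name.startswith("\x01"):
        return ir_name[1:]
    # Otherwise the platform's usual prefix (e.g. "_" on some targets,
    # nothing on others) is prepended.
    return global_prefix + ir_name


if __name__ == "__main__":
    # repr() makes the otherwise invisible 0x01 character visible.
    for name in ["\x01acos", "\x01__GI_acos", "__ieee754_acos"]:
        print(repr(name), "->", assembler_name(name, global_prefix="_"))
```

With an empty prefix, which is assumed here to model a platform that prepends
nothing, the marked and unmarked names come out the same, which is why the
"\1" is only noticeable when looking at the bitcode names directly.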

Ciao, Duncan.

Thanks Duncan.