clang change function name

Hi,

I compile a .cpp file with these commands:
clang++ -emit-llvm -c -g -O0 -w pbzip2.cpp -o pbzip2.bc -lbz2
llvm-dis pbzip2.bc

One function in the .cpp file is consumer_decompress. However, when I look inside pbzip2.ll, the function name has changed: "define i8* @_Z19consumer_decompressPv(i8* %q) #0 {"

Why does clang add a "_Z19" prefix and a "Pv" suffix?

Thanks,

Clang mangles the name so that the symbol encodes both the function's name and its parameter types; this lets the linker tell overloaded functions apart and link C++ object files together correctly. See http://en.wikipedia.org/wiki/Name_mangling#Name_mangling_in_C.2B.2B for more details.

Regards,

John Criswell

Got it, thanks. But my pass uses the function name to locate functions. Can I disable mangling in clang?

Best,
Haopeng

No, but you can probably find a library that can either mangle the original name or demangle the name you're seeing in the LLVM bitcode.

As an FYI, on Unix, the c++filt program will demangle names (although sometimes you have to remove an extra '_' from the front of the name to get it to work).

Regards,

John Criswell

There's also a __cxa_demangle function, declared in:

http://llvm.org/svn/llvm-project/libcxxabi/trunk/include/cxxabi.h
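A minimal sketch of calling __cxa_demangle, assuming a toolchain that ships <cxxabi.h> (libstdc++ or libc++abi). The wrapper name demangle is made up for illustration; it is not part of LLVM or of the ABI library. The call returns a malloc'ed buffer that the caller has to free.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle
#include <cstdlib>    // std::free
#include <string>

// Hypothetical helper (not part of any library): returns the demangled form
// of `mangled`, or the input unchanged if demangling fails.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // Passing nullptr for the output buffer and length asks the library to
    // malloc() a buffer of the right size for us.
    char* buf = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || buf == nullptr) {
        return mangled;  // not a valid mangled name, or allocation failed
    }
    std::string result(buf);
    std::free(buf);      // the caller owns the returned buffer
    return result;
}
```

In a pass you could then compare demangle(F.getName().str()) against a string such as "consumer_decompress(void*)"; note that the demangled form includes the parameter list, not just the bare function name.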

Depending on what you want to achieve, one possibility is to declare
the function with extern "C", either individually:

   extern "C" void somefunction(void);

or as a block:

  extern "C" {
   void somefunction();
   int otherfunction();
  }

This leaves those function names unmangled. It is used when
interfacing between C and C++ functions, since the C compiler
does NOT mangle names.

This does, however, also affect some other aspects of the code: the
linker can't check that parameter types match, and you can't have more
than one function with the same name but different argument types, as
you can in C++ [in fact these are the two main reasons for using
name-mangling]. I can't remember whether it also affects, for example,
the ability to handle exceptions from C++.

Thanks for all your help.

"__cxa_demangle" can decode the mangled function name as expected.

Another question: given an unmangled function name, how can I get the corresponding mangled name in LLVM?

Best,
Haopeng

The unmangled function name is insufficient. Computing the mangled name requires the C++ context in which the function is defined (any namespaces, structs, etc.) as well as the types of all its parameters. See clang::MangleContext in clang, http://clang.llvm.org/doxygen/classclang_1_1MangleContext.html and the ABI document which defines this stuff in the first place: http://mentorembedded.github.io/cxx-abi/abi.html#mangling .

If what you have is guaranteed to be a plain function (not a template, constructor, operator, etc.) defined in the top-level (not inside any namespaces, structs, etc.) and the arguments are all going to be builtin types (int, char, etc.) then you can simplify the problem to:

   _Z <length of function name> <function name> <arguments...>

where the arguments are encoded as b for bool, c for char, i for int, j for unsigned int, l for long, m for unsigned long, and v for void. Prefix with 'P' to indicate a "pointer to". For example, _Z3foojPv is "foo(unsigned int, void*)".
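The simplified recipe above can be sketched in a few lines of C++. This is only an illustration under the same restrictions (plain top-level function, builtin or pointer-to-builtin parameters). The name mangle_simple and its interface are made up, and the caller passes the type codes directly ("j", "Pv", ...). Two things the sketch adds from the ABI document: an empty parameter list is encoded as a single v, and a repeated compound type such as void* is abbreviated by the ABI's substitution rules (foo(void*, void*) is _Z3fooPvS_, not _Z3fooPvPv), which this sketch does NOT implement.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds the mangled name of a plain, top-level function
// from its name and the ABI type codes of its parameters.
// Limitation: no namespaces, templates, or substitutions (S_, S0_, ...),
// so it is wrong when a pointer type appears more than once.
std::string mangle_simple(const std::string& name,
                          const std::vector<std::string>& param_codes) {
    // _Z <length of function name> <function name>
    std::string out = "_Z" + std::to_string(name.size()) + name;
    if (param_codes.empty()) {
        out += "v";  // an empty parameter list is encoded as (void)
    }
    for (const std::string& code : param_codes) {
        out += code;  // e.g. "i", "j", "Pv"
    }
    return out;
}
```

For example, mangle_simple("consumer_decompress", {"Pv"}) reproduces the _Z19consumer_decompressPv symbol from the original question. For anything beyond this restricted case, clang::MangleContext is the reliable route.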

Nick