C++ demangling in LLVM

Hello!

We want to implement an in-process symbolizer for the {Address,Thread}Sanitizer testing tools that would be based on LLVM libraries.
I’ve noticed that llvm-nm (as well as other tools) doesn’t demangle C++ names. Is it true that LLVM has no code capable of that, and if so, are there any plans to add it?
Depending on something like libiberty.a doesn’t seem like a good or portable solution.

There's code to demangle names in libcxxabi.

-- Marshall

Marshall Clow Idio Software <mailto:mclow.lists@gmail.com>

Indeed, thanks! I’ll take a closer look at it.

Yes, LLVM currently has no C++ demangler, and it needs one. Although I
have no idea where it should live. It would be nice if it could live
in clang next to the mangler, but clang doesn't even need a demangler.
llvm tools, lld, and compiler-rt do.

- Michael Spencer

llvm/Support?

It’s not clear how libcxxabi could be used in LLVM tools, since IIUC this library is built independently.
The demangler implementation there is 10 KLOC and rather far from LLVM style.

In the same way that the core LLVM libraries have support routines for DWARF, I think that both mangling and demangling should be provided as well. I suspect that the ‘Support’ library is the best we have, although eventually we need to split this library up a bit. That’s not really your problem though.

The bigger problem is that we don’t have any good way of sharing code between runtime libraries (such as libcxxabi, sanitizer runtimes, etc) and LLVM.

One somewhat interesting question, would the APIs exposed by libcxxabi be sufficient for the sanitizer runtimes?

AFAICT, the only other way forward is to:

  1. Move/re-implement the demangling (and hopefully the mangling as well) into llvm/Support, following proper style and writing proper unittests as we go.
  2. Write my object-file-scrubber so that we can statically link llvm/Support into a runtime without collision issues.
  3. Potentially split up llvm/Support (and other libraries) enough to have depending upon it not burden or bloat the runtime libraries unnecessarily.

All of these options seem… moderately painful. My inclination is to start paving the way for better code sharing in runtime libraries sooner rather than later. Other thoughts? Chris?

Yes, surely I want this to happen and would be happy to help if you (or someone else) give some advice or guidance.

One somewhat interesting question, would the APIs exposed by libcxxabi be sufficient for the sanitizer runtimes?

How could we use libcxxabi anyway? I mean, is it shipped with the compiler, so that the Clang driver could link the user program with
libcxxabi in the same way it links the sanitizer runtimes when a -fwhatever-sanitizer flag is present?

In the same way that the core LLVM libraries have support routines for
DWARF, I think that both mangling and demangling should be provided as
well. I suspect that the 'Support' library is the best we have, although
eventually we need to split this library up a bit. That's not really your
problem though.

The bigger problem is that we don't have any good way of sharing code
between runtime libraries (such as libcxxabi, sanitizer runtimes, etc) and
LLVM.

Yes, surely I want this to happen and would be happy to help if you (or
someone else) give some advice or guidance.

Did you see my proposal to llvmdev some time ago about how to do this? If
you have thoughts about that, we should move the discussion to that thread.

One somewhat interesting question, would the APIs exposed by libcxxabi be
sufficient for the sanitizer runtimes?

How could we use libcxxabi anyway? I mean, is it shipped with the compiler,
so that the Clang driver could link the user program with libcxxabi in the
same way it links the sanitizer runtimes when a -fwhatever-sanitizer flag
is present?

Essentially, it could be. It's more complicated than just that though, so I
was just curious if it would work.

Okay.

libcxxabi seems to be mostly irrelevant for us. It has a demangler, which can be useful for sanitizer and llvm tools (but not that essential, of course) … and that’s it.

Isn’t __cxa_demangle already available to all libc++/libstdc++ clients? Linking against one of these libraries should be enough to get it.
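As a rough illustration of that suggestion, here is a minimal sketch of calling __cxa_demangle directly. It assumes an Itanium-ABI toolchain (GCC or Clang with libstdc++ or libc++abi) where <cxxabi.h> is available; the helper name `demangle` is made up for this example and is not part of any library.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (Itanium C++ ABI runtimes)
#include <cstdlib>   // std::free
#include <string>

// Hypothetical helper: returns the demangled form of `mangled`, or the
// input unchanged when the runtime demangler rejects it.
std::string demangle(const char *mangled) {
  int status = 0;
  // With a null output buffer, __cxa_demangle allocates the result with
  // malloc, so the caller has to release it with free.
  char *buf = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || buf == nullptr) {
    std::free(buf);  // free(nullptr) is a no-op
    return mangled;
  }
  std::string result(buf);
  std::free(buf);
  return result;
}
```

Under these assumptions, `demangle("_ZN3foo3barEv")` yields `foo::bar()`. The result is only a string, which is the limitation raised elsewhere in the thread for tools that want a structured view of the name.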

In the same way that the core LLVM libraries have support routines for DWARF, I think that both mangling and demangling should be provided as well.

How would LLVM provide support for mangling? And what tools actually need it? I also wonder if we need more from a demangler than just a string. I know linker diagnostics would benefit from a deeper understanding of the name without having to parse C++ decls.

I suspect that the ‘Support’ library is the best we have, although eventually we need to split this library up a bit. That’s not really your problem though.

I don’t see anywhere to split Support. We already merged System and Support because of the circular deps. However I agree that demangling can be in another library.

Somewhat off the cuff, but I think it would be nice if commandline object
inspection tools could query for a particular symbol when the user
specified it in the C++, unmangled form.
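A rough sketch of what such a query could look like, assuming the tool already has the list of mangled symbol names and an Itanium-ABI runtime that provides __cxa_demangle. The function name `findSymbols` and the exact-match policy are assumptions made for illustration, not an existing LLVM interface.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle
#include <cstdlib>   // std::free
#include <string>
#include <vector>

// Hypothetical lookup: returns the mangled names whose demangled form is
// exactly equal to `query` (for example "foo::bar()").
std::vector<std::string> findSymbols(const std::vector<std::string> &mangledSymbols,
                                     const std::string &query) {
  std::vector<std::string> matches;
  for (const std::string &sym : mangledSymbols) {
    // Only names starting with "_Z" are Itanium-mangled C++ symbols;
    // anything else (C symbols and so on) is compared as-is.
    std::string readable = sym;
    if (sym.compare(0, 2, "_Z") == 0) {
      int status = 0;
      char *buf = abi::__cxa_demangle(sym.c_str(), nullptr, nullptr, &status);
      if (status == 0 && buf != nullptr)
        readable = buf;
      std::free(buf);  // free(nullptr) is a no-op
    }
    if (readable == query)
      matches.push_back(sym);
  }
  return matches;
}
```

Exact string comparison is the simplest policy. A real tool would probably want something more forgiving, such as matching without the parameter list, which is where a structured demangler result would help.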

Generally, it seems useful to unify the mangling and unmangling code if
only so that we constantly round-trip test both halves and don't end up
with divergences.
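To make the round-trip idea concrete, here is a small sketch. The mangler in it is a toy written for this example: it handles only functions that take no arguments and live in plain (non-std, non-template) namespaces or classes, and it is not Clang's mangler. The demangling half uses __cxa_demangle, assuming an Itanium-ABI runtime.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle
#include <cstddef>
#include <cstdlib>   // std::free
#include <string>
#include <vector>

// Toy mangler (illustrative only): {"foo", "bar"} -> "_ZN3foo3barEv",
// i.e. the function foo::bar() taking no arguments.
std::string toyMangle(const std::vector<std::string> &path) {
  std::string out = "_Z";
  if (path.size() > 1) out += "N";  // nested names are wrapped in N...E
  for (const std::string &part : path)
    out += std::to_string(part.size()) + part;  // length-prefixed identifier
  if (path.size() > 1) out += "E";
  return out + "v";  // "v" encodes an empty parameter list
}

// Round-trip check: mangle, demangle, and compare with the readable form
// built directly from the path ("foo::bar()").
bool roundTrips(const std::vector<std::string> &path) {
  std::string expected;
  for (std::size_t i = 0; i < path.size(); ++i)
    expected += (i ? "::" : "") + path[i];
  expected += "()";

  const std::string mangled = toyMangle(path);
  int status = 0;
  char *buf = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
  const bool ok = (status == 0 && buf != nullptr && expected == buf);
  std::free(buf);  // free(nullptr) is a no-op
  return ok;
}
```

With a real mangler and demangler living side by side, the same kind of check could run over every name the compiler produces in its test suite.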

Hi Chandler,

    In the same way that the core LLVM libraries have support routines for
    DWARF, I think that both mangling and demangling should be provided as well.

    How would LLVM provide support for mangling? And what tools actually need
    it? I also wonder if we need more from a demangler than just a string. I
    know linker diagnostics would benefit from a deeper understanding of the
    name without having to parse C++ decls.

Somewhat off the cuff, but I think it would be nice if commandline object
inspection tools could query for a particular symbol when the user specified it
in the C++, unmangled form.

This would be nice for Ada too, which also does name mangling (differently from
C++, of course), and presumably for lots of other languages as well.

Ciao, Duncan.

Hello!

We want to implement an in-process symbolizer for the {Address,Thread}Sanitizer
testing tools that would be based on LLVM libraries.
I've noticed that llvm-nm (as well as other tools) doesn't demangle C++
names. Is it true that LLVM has no code capable of that, and if so, are
there any plans to add it?
Depending on something like libiberty.a doesn't seem like a good or portable
solution.

--
Alexey Samsonov, MSK

Yes, LLVM currently has no C++ demangler, and it needs one. Although I
have no idea where it should live. It would be nice if it could live
in clang next to the mangler, but clang doesn't even need a demangler.
llvm tools, lld, and compiler-rt do.

libc++abi provides a full C++ demangler, along with an extended API that provides a tree-based representation of the demangled name (implemented for LLDB).
http://llvm.org/viewvc/llvm-project/libcxxabi/trunk/include/cxa_demangle.h?revision=HEAD&view=markup

Was there any resolution on whether this should be brought into the LLVM Support directory? I need to use it not just for demangling but also to verify the correctness of a mangled function name, and I can rely only on LLVM core.

Thanks,
Micah
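As a sketch of the "verify correctness" use case: the status code that __cxa_demangle reports can serve as a basic well-formedness check. This assumes an Itanium-ABI runtime with <cxxabi.h>; the function name `isValidMangledName` is hypothetical. It only shows that the demangler accepts the name, not that a compiler would actually produce it.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle
#include <cstdlib>   // std::free

// Hypothetical check: true if `name` starts with "_Z" and the runtime
// demangler accepts it (status 0; a status of -2 means an invalid name).
bool isValidMangledName(const char *name) {
  if (name == nullptr || name[0] != '_' || name[1] != 'Z')
    return false;
  int status = 0;
  char *buf = abi::__cxa_demangle(name, nullptr, nullptr, &status);
  std::free(buf);  // free(nullptr) is a no-op
  return status == 0;
}
```

For example, `_ZN3foo3barEv` passes, while the truncated `_ZN3foo` is rejected because the nested name is never closed.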

Micah,

Why can't you just call the standard __cxa_demangle function?

-Chris

Three reasons.
1) I need to modify the code to support extensions to the standard demangler.
2) GCC's version is GPL v3.
3) I need Windows support.

Micah

Three reasons.
1) I need to modify the code to support extensions to the standard demangler.
2) GCC's version is GPL v3.

And?

BTW, there is a BSD-licensed implementation of __cxa_demangle in libcxxrt.

From: Konstantin Tokarev [mailto:annulen@yandex.ru]
Sent: Wednesday, August 15, 2012 9:10 AM
To: Villmow, Micah
Cc: Chris Lattner; Dmitry Vyukov; LLVM Developers Mailing List
Subject: Re: [LLVMdev] C++ demangling in LLVM

> Three reasons.
> 1) I need to modify the code to support extensions to the standard demangler.
> 2) GCC's version is GPL v3.

And?

[Villmow, Micah] No GPL code allowed because of its viral nature.

BTW, there is a BSD-licensed implementation of __cxa_demangle in
libcxxrt.

[Villmow, Micah] Thanks, I didn't know about this one!
I am currently using the one from here:
http://libcxxabi.llvm.org/

Well, if you just call it from your code, you are not affected by this aspect of the GPL.
To implement your extensions, you can create a wrapper function.
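A minimal sketch of the wrapper approach, assuming an Itanium-ABI runtime that provides __cxa_demangle. The `__ext_` prefix and the `[ext]` marker are invented purely for illustration; the actual extensions Micah needs are not described in this thread.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle
#include <cstdlib>   // std::free
#include <string>

// Hypothetical wrapper: handles an invented "__ext_" prefix itself and
// passes everything else to the standard demangler unchanged.
std::string demangleWithExtensions(const std::string &name) {
  static const std::string kPrefix = "__ext_";  // invented for this example
  if (name.compare(0, kPrefix.size(), kPrefix) == 0)
    return "[ext] " + demangleWithExtensions(name.substr(kPrefix.size()));

  int status = 0;
  char *buf = abi::__cxa_demangle(name.c_str(), nullptr, nullptr, &status);
  std::string result = (status == 0 && buf != nullptr) ? std::string(buf) : name;
  std::free(buf);  // free(nullptr) is a no-op
  return result;
}
```

A wrapper like this works when an extension can be peeled off around a standard name. Extensions that change the grammar inside the mangled name would still require modifying the demangler itself, which is the first of the reasons Micah lists.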