RFC: Move parts of llvm-symbolizer tool implementation to LLVMSymbolize library

vonosmas · October 20, 2015, 8:54pm

Hi,

We have accumulated a lot of non-trivial logic in the implementation of the
llvm-symbolizer tool (tools/llvm-symbolizer/LLVMSymbolize.{h,cpp}), for instance:

* dynamic dispatch between DWARF and PDB debug info;
* building address->symbol_name mapping from object file (with special cases for PowerPC function descriptor section, and COFF export tables);
* finding debug info stored in separate files (.dSYM files on Darwin, ELF .gnu_debuglink section, etc.);
* demangling (with platform-specific implementations for Windows and Unix).
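
As a rough illustration of the address->symbol_name mapping mentioned in the list above, the following self-contained sketch keeps symbols in an ordered map keyed by start address and resolves an address to the symbol that contains it. `SymbolTable`, `add`, and `lookup` are hypothetical names chosen for this example and are not taken from LLVMSymbolize.{h,cpp}; the PowerPC function descriptor and COFF export table special cases are deliberately left out.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for an address -> symbol_name table built from an
// object file's symbol table. Not the actual LLVMSymbolize data structure.
class SymbolTable {
public:
  // Records a symbol covering the half-open range [start, start + size).
  void add(uint64_t start, uint64_t size, std::string name) {
    symbols_[start] = Entry{size, std::move(name)};
  }

  // Returns the name of the symbol containing `address`, or nullptr if the
  // address falls outside every recorded symbol.
  const std::string *lookup(uint64_t address) const {
    // First symbol that starts strictly after `address`.
    auto it = symbols_.upper_bound(address);
    if (it == symbols_.begin())
      return nullptr; // `address` precedes all symbols.
    --it; // Last symbol that starts at or before `address`.
    if (address - it->first >= it->second.size)
      return nullptr; // `address` lies in a gap after that symbol.
    return &it->second.name;
  }

private:
  struct Entry {
    uint64_t size;
    std::string name;
  };
  std::map<uint64_t, Entry> symbols_;
};
```

A caller would fill the table once per object file and then call `lookup` for every address to be symbolized; the ordered map makes each query logarithmic in the number of symbols.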

I propose to move this code into a separate library, LLVMSymbolize (stored under lib/DebugInfo/Symbolize), and make llvm-symbolizer a short and simple tool using it. This would allow us to:

* implement in-process symbolized stack trace printers (for the cases when it's possible to link a bunch of LLVM libraries into the executable);
* easily write more tools that can make use of symbolized code locations, such as coverage data visualizers;
* (at least sometimes) write unit tests instead of testing functionality by running the "llvm-symbolizer" executable on pre-built executables checked into the repository.

Any comments/objections?

Rafael_Avila_de_Espi · October 21, 2015, 2:35pm

I would say it is worth it if someone is actually planning on using
the library in something else.

Moving the code "just in case" or to create unit tests is not a good
reason IMHO.

echristo · October 21, 2015, 6:34pm

To create unit tests is a pretty good reason IMO

That said, I’d be a fan of trying to encapsulate all of this behind an interface. I like that most of the tools are exceptionally lightweight, and it makes it much more obvious what’s “wrapper” versus “functionality” in something like llvm-symbolizer. I’ll be interested to see the library design.

-eric

vonosmas · October 21, 2015, 7:17pm

We have an out-of-tree implementation of llvm-symbolizer-as-a-library, and I
still hope to upstream it one day (unfortunately, its build process is
really complicated).

Mike can probably comment on his plans for using symbolization in
coverage-related tools.

> To create unit tests is a pretty good reason IMO
>
> That said, I'd be a fan of trying to encapsulate all of this behind an
> interface. I like that most of the tools are exceptionally light weight and
> it makes it much more obvious what's "wrapper" versus "functionality" in
> something like llvm-symbolize. That said, I'll be interested to see the
> library design

Do you suggest designing it upfront, or are you fine with moving the
existing code first and gradually updating the interface afterwards?

echristo · October 21, 2015, 7:59pm

Don’t know; what does the interface look like now? Were you just going to copy LLVMSymbolize.[cpp,h] into the directory? That should be fine, I guess. I’d like to see the general ownership of objects separated out fairly explicitly from the rest of the code.

-eric

vonosmas · October 21, 2015, 8:21pm

> Don't know, what's the interface look like now? Were you just going to
> copy the LLVMSymbolize.[cpp,h] into the directory?

For a start, yes.

> That should be fine I guess. I'd like to see the general ownership of
> objects separated out fairly explicitly from the rest of the code.

I'm not sure what you mean by this.

echristo · October 21, 2015, 8:24pm

OK.

Is LLVMSymbolizer owning all of the files the right choice, or what was convenient at the time?

-eric

vonosmas · October 21, 2015, 9:18pm

> Is LLVMSymbolizer owning all of the files the right choice, or what was
> convenient at the time?

Yeah, I think it's correct to have LLVMSymbolizer own parsed object files
(although we might factor out a separate "cache" object that would be
responsible for it).
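
One way to picture the separate "cache" object floated above is a small class that owns every parsed object file and hands out non-owning pointers. This is only a sketch under assumptions: `ParsedObject`, `ObjectCache`, `Loader`, and `get` are hypothetical names invented for this example, the parsing step is injected as a callback rather than done through any LLVM API, and none of it reflects the design that was eventually committed.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a parsed object file plus its debug info.
struct ParsedObject {
  std::string path;
};

// Hypothetical cache that owns all parsed objects. Callers receive
// non-owning pointers that stay valid for the lifetime of the cache.
class ObjectCache {
public:
  using Loader =
      std::function<std::unique_ptr<ParsedObject>(const std::string &)>;

  explicit ObjectCache(Loader loader) : loader_(std::move(loader)) {}

  // Returns the object for `path`, invoking the loader only on the first
  // request. A failed load (null result) is cached as well, so a missing
  // file is not re-parsed on every query.
  const ParsedObject *get(const std::string &path) {
    auto it = objects_.find(path);
    if (it == objects_.end())
      it = objects_.emplace(path, loader_(path)).first;
    return it->second.get();
  }

  std::size_t size() const { return objects_.size(); }

private:
  Loader loader_;
  std::map<std::string, std::unique_ptr<ParsedObject>> objects_;
};
```

Keeping ownership in one place like this makes the lifetime rule explicit: the symbolizer (or whatever holds the cache) outlives every pointer it returns, which is the kind of separation the ownership question in this exchange is asking about.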

echristo · October 21, 2015, 9:19pm

OK. Sounds good to me. It can also be changed in the future if we decide to go a different way.

-eric

vonosmas · October 22, 2015, 10:24pm

See D13998: Move parts of llvm-symbolizer tool into LLVMSymbolize library.
