DebugInfo library

Hi,

I would like to extend the DebugInfo library so that it can be used in the AddressSanitizer/ThreadSanitizer run-time libraries.

The current interface is:

class DILineInfo {
  const char *FileName;
  uint32_t Line;
  uint32_t Column;
  ...
};

class DIContext {
  ...
  virtual DILineInfo getLineInfoForAddress(uint64_t address) = 0;
};

First, I would like to get the function name associated with the address.
Then, I would like to get inlined frames as well.
Then, I would like to be able to get the same info about global variables.
Finally, I would like to receive an explicit failure indicator.

I see the resulting interface along the lines of:

class DILineInfo {
  const char *Name;
  const char *FileName;
  uint32_t Line;
  uint32_t Column;
  ...
};

class DIContext {
  ...
  virtual bool getLineInfoForCode(uint64_t pc, SmallVector<DILineInfo> *res) = 0;
  virtual bool getLineInfoForData(uint64_t address, DILineInfo *res) = 0;
};
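
To make the proposed shape concrete, here is a minimal, self-contained sketch. It is only an illustration under stated assumptions: std::vector stands in for SmallVector so that it compiles without LLVM headers, the members are public, and TableDIContext (with its addCode/addData helpers) is a hypothetical table-backed stand-in for a real DWARF-backed context. The bool return value plays the role of the explicit failure indicator.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Mirrors the proposed DILineInfo; a struct so the sketch can fill it directly.
struct DILineInfo {
  const char *Name;
  const char *FileName;
  uint32_t Line;
  uint32_t Column;
};

class DIContext {
public:
  virtual ~DIContext() {}
  // Appends one entry per frame (inlined frames first) and returns true,
  // or returns false if nothing is known about pc.
  virtual bool getLineInfoForCode(uint64_t pc,
                                  std::vector<DILineInfo> *res) = 0;
  // Fills *res and returns true, or returns false if the address is unknown.
  virtual bool getLineInfoForData(uint64_t address, DILineInfo *res) = 0;
};

// Hypothetical stand-in for a DWARF-backed context: exact-address tables.
class TableDIContext : public DIContext {
  std::map<uint64_t, std::vector<DILineInfo>> Code;
  std::map<uint64_t, DILineInfo> Data;

public:
  void addCode(uint64_t pc, const std::vector<DILineInfo> &frames) {
    Code[pc] = frames;
  }
  void addData(uint64_t address, const DILineInfo &info) {
    Data[address] = info;
  }

  bool getLineInfoForCode(uint64_t pc,
                          std::vector<DILineInfo> *res) override {
    assert(res && "res must not be null");
    auto it = Code.find(pc);
    if (it == Code.end())
      return false;
    res->insert(res->end(), it->second.begin(), it->second.end());
    return true;
  }

  bool getLineInfoForData(uint64_t address, DILineInfo *res) override {
    assert(res && "res must not be null");
    auto it = Data.find(address);
    if (it == Data.end())
      return false;
    *res = it->second;
    return true;
  }
};
```

A caller would check the return value before touching the result, which is the point of the explicit failure indicator.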

Any comments/suggestions?

The public DebugInfo interface was designed around a very specific use case (extracting file/line/column triples from an object file), so not much thought went into it. However, the only user is llvm-objdump, so you are free to redesign and extend it.

I'd like to keep the DWARF stuff an implementation detail, so you'll likely end up with something like DIFunction, DIGlobalVariable, …, which are then filled in by the DWARF implementation of DIContext. Some operations require traversing the DIE tree, so it would be nice if the API only gives you what you ask for. Getting the File/Line/Column triple is a completely different operation in DWARF than fetching the function name, so it's better to keep them in separate objects.
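
A rough sketch of what "separate objects, separate queries" could look like follows. Everything in it is an assumption made for illustration: ToyContext, addRow, addFunction, getLineInfo and getFunction are hypothetical names, and two sorted maps stand in for the line table and the DIE tree. The point is only that a caller who wants the file/line/column triple never pays for the function lookup, and vice versa.

```cpp
#include <cstdint>
#include <iterator>
#include <map>

struct DILineInfo {
  const char *FileName;
  uint32_t Line;
  uint32_t Column;
};

struct DIFunction {
  const char *Name;
  uint64_t LowPC;   // first address of the function
  uint64_t HighPC;  // one past the last address
};

// Hypothetical context with one query per kind of information.
class ToyContext {
  std::map<uint64_t, DILineInfo> LineTable;  // keyed by row start address
  std::map<uint64_t, DIFunction> Functions;  // keyed by LowPC

public:
  void addRow(uint64_t address, DILineInfo row) { LineTable[address] = row; }
  void addFunction(DIFunction f) { Functions[f.LowPC] = f; }

  // Line lookup: the last row that starts at or before pc.
  // (A real line table also has end-of-sequence markers; ignored here.)
  bool getLineInfo(uint64_t pc, DILineInfo *res) const {
    auto it = LineTable.upper_bound(pc);
    if (it == LineTable.begin())
      return false;
    *res = std::prev(it)->second;
    return true;
  }

  // Function lookup: the function whose [LowPC, HighPC) range contains pc.
  bool getFunction(uint64_t pc, DIFunction *res) const {
    auto it = Functions.upper_bound(pc);
    if (it == Functions.begin())
      return false;
    const DIFunction &f = std::prev(it)->second;
    if (pc >= f.HighPC)
      return false;
    *res = f;
    return true;
  }
};
```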

One caveat of the DWARF parser in LLVM is that it doesn't understand relocations. This isn't a problem on OSX (it doesn't use them) and wasn't a problem on Linux at the time I wrote it, but clang now emits DWARF that makes use of relocations, so they have to be resolved before the buffers are passed to the DWARFContext. We should have the necessary stuff in LLVM's libObject, so it shouldn't be too hard.
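
As an illustration of "resolve first, then parse", the sketch below patches a copy of a section buffer before it would be handed to a parser. It is deliberately simplified and is not libObject's API: ToyReloc and applyRelocations are hypothetical, only one relocation kind is modelled (a 32-bit absolute value, symbol value plus addend), and values are written in host byte order.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified relocation record: store (SymbolValue + Addend)
// as a 32-bit value at byte offset Offset within the section.
struct ToyReloc {
  size_t Offset;
  uint32_t SymbolValue;
  int32_t Addend;
};

// Returns a patched copy of the section; the input buffer is left untouched.
// Relocations that would write past the end of the section are skipped.
std::vector<uint8_t> applyRelocations(const std::vector<uint8_t> &section,
                                      const std::vector<ToyReloc> &relocs) {
  std::vector<uint8_t> out = section;
  for (const ToyReloc &r : relocs) {
    if (out.size() < sizeof(uint32_t) ||
        r.Offset > out.size() - sizeof(uint32_t))
      continue;
    uint32_t value = r.SymbolValue + static_cast<uint32_t>(r.Addend);
    std::memcpy(&out[r.Offset], &value, sizeof(value));  // host byte order
  }
  return out;
}
```

The patched buffer, rather than the raw section contents, is what would then be passed on to the DWARF parser.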

- Ben


Hi Ben,

I see the point of fetching only what the user asks for. That will require an enum of flags: the user passes a set of flags saying what they want, and the function returns a set of flags describing what was actually fetched.
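
Roughly, the flag scheme could look like the sketch below. The names are made up for illustration (DIFetchFlags, ToyEntry and fetchLineInfo are hypothetical, and ToyEntry stands in for whatever the DWARF implementation can actually find for an address); only the request-mask-in, fetched-mask-out shape comes from the description above.

```cpp
#include <cstdint>

// Hypothetical flag names; one bit per kind of information.
enum DIFetchFlags : uint32_t {
  DIFetchNone = 0,
  DIFetchFileLine = 1u << 0,      // FileName, Line, Column
  DIFetchFunctionName = 1u << 1,  // Name
};

struct DILineInfo {
  const char *Name = nullptr;
  const char *FileName = nullptr;
  uint32_t Line = 0;
  uint32_t Column = 0;
};

// Toy record of what debug info exists for one address.
struct ToyEntry {
  const char *Name;      // nullptr if the function name is unknown
  const char *FileName;  // nullptr if there is no line-table row
  uint32_t Line;
  uint32_t Column;
};

// Fills only the requested fields of *res and returns the subset of
// 'wanted' that was actually found.
uint32_t fetchLineInfo(const ToyEntry &entry, uint32_t wanted,
                       DILineInfo *res) {
  uint32_t fetched = DIFetchNone;
  if ((wanted & DIFetchFileLine) && entry.FileName) {
    res->FileName = entry.FileName;
    res->Line = entry.Line;
    res->Column = entry.Column;
    fetched |= DIFetchFileLine;
  }
  if ((wanted & DIFetchFunctionName) && entry.Name) {
    res->Name = entry.Name;
    fetched |= DIFetchFunctionName;
  }
  return fetched;
}
```

Comparing the returned mask with the requested one tells the caller exactly which pieces are missing, so a separate failure code is not needed.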

How strongly do you feel about splitting it into DILineInfo/DIFunction/DIGlobalVariable? Right now I need only the symbol name for both functions and global vars, so I would prefer to keep it simple for now.

> One caveat of the DWARF parser in LLVM is that it doesn’t understand relocations. This isn’t a problem on OSX (it doesn’t use them) and wasn’t a problem on linux at the time I wrote it, but clang now emits DWARF that makes use of relocations so they have to be resolved before the buffers are passed to the DWARFContext. We should have the necessary stuff in LLVM’s libObject, so it shouldn’t be too hard.

I see. Perhaps it’s already fixed in lldb; lldb should work with llvm-generated binaries.


Not particularly strong. Feel free to use whatever fits best.


If lldb can extract function names on Linux, they are resolving relocations properly. It's not a very hard problem, just annoying to put all the pieces together.

- Ben

On Sun, May 6, 2012 at 8:36 PM, Benjamin Kramer wrote:

> If lldb can extract function names on linux they are resolving relocations properly. It’s not a very hard problem just annoying to put all the pieces together.
>
> - Ben

OK, thanks. I will send the change list this week or next.