"Cannot find DIE"

I recently started getting this error when I try to debug my LLVM-compiled program in GDB:

Dwarf Error: Cannot find DIE at 0x16769 referenced from DIE at 0x1713c [in module /home/talin/Projects/tart/build-eclipse/test/stdlib/ReflectionTest]

I’m not sure if it’s something I did or not. Is there any way to track down the cause of this error? The hex addresses in the error message are meaningless to me.

There is not much information here. The error says that one debug info entry (DIE) refers to another DIE that does not exist. This usually indicates that the DWARF generator in LLVM's codegen knew about the other DIE, but somehow gdb cannot find it. Maybe the code generator did not emit the DIE? If so, why? Otherwise, the DIE was dropped somewhere after the compiler generated the code.
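
To make that failure mode concrete, here is a small self-contained toy model (not LLVM or GDB code). Each DIE is recorded by its offset together with the offsets it references, and the check reports every reference whose target was never emitted. The `DieTable` type and the `findDanglingRefs` function are hypothetical names invented for this sketch; real DWARF would have to be parsed out of the debug sections first.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Toy model: maps the offset of each emitted DIE to the offsets it references.
// Hypothetical type for illustration only.
using DieTable = std::map<std::uint64_t, std::vector<std::uint64_t>>;

// Returns (referencing DIE, missing target) pairs, ordered by the offset of
// the referencing DIE.
std::vector<std::pair<std::uint64_t, std::uint64_t>>
findDanglingRefs(const DieTable &dies) {
  std::vector<std::pair<std::uint64_t, std::uint64_t>> dangling;
  for (const auto &entry : dies) {
    for (std::uint64_t target : entry.second) {
      if (dies.find(target) == dies.end())
        dangling.emplace_back(entry.first, target);
    }
  }
  return dangling;
}

// Usage sketch with the two offsets from the error message above.
void exampleFromErrorMessage() {
  DieTable dies;
  dies[0x1713c] = {0x16769}; // the referencing DIE exists, its target does not
  const auto dangling = findDanglingRefs(dies);
  assert(dangling.size() == 1);
  assert(dangling[0].second == 0x16769);
}
```

In this model the gdb message corresponds to one reported pair: the second offset in the message is the DIE that holds the reference, and the first is the target that could not be found.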

I’ve discovered a little bit more, but not enough to solve the problem.

Here’s what my code looks like for setting the debug location (some details removed for clarity):

    void CodeGenerator::setDebugLocation(const SourceLocation & loc) {
      if (loc != dbgLocation_ && dbgContext_.isScope()) {
        dbgLocation_ = loc;
        if (loc.file == NULL) {
          builder_.SetCurrentDebugLocation(llvm::DebugLoc());
        } else if (loc.file == module_->moduleSource()) {
          TokenPosition pos = tokenPosition(loc);
          // ** Comment out this line and the DIE errors disappear **
          builder_.SetCurrentDebugLocation(
              DebugLoc::get(pos.beginLine, pos.beginCol, dbgContext_));
        }
      }
    }

As noted in the comment above, if I comment out the line that calls builder_.SetCurrentDebugLocation(), then gdb no longer reports not being able to find the DIE. Of course, I don’t get any source-line debugging information either.

Note that even in this case, I’m still generating tons of DWARF info for data types, files, functions, and so on. This all apparently works (or at least, it doesn’t seem to generate this particular problem). It’s only the generating of source line information that causes the error. This is fortunate, since as you can see the code is quite trivial, whereas I was afraid that if the problem was in the data type code, it would take forever to locate the problem as that code is pretty complicated.

As far as why this code is causing the error, I can only guess that I am calling it wrong somehow. But I can’t think what I could do differently from what I am doing already.

OK some progress on this: Changing the ‘scope’ argument passed to DebugLoc::get() - from a DICompileUnit to a DIFile (I’m not sure which is right, the docs and comments aren’t real clear on this) - gets rid of the “Can’t find DIE” error. Instead I get:

“Line number -1 out of range;”

…when I try to examine a stack frame. Which is strange, because I know my line numbers are not -1. There’s even an assert for that. And the comments in my generated assembly language look perfectly valid to me:

    Ltmp26:
        subl $8, %esp ## Array.tart:103:11[ Array.tart:103:11 ]
    Ltmp27: ## Array.tart:103:11[ Array.tart:103:11 ]
        movl $0, 4(%esp) ## Array.tart:103:11[ Array.tart:103:11 ]
        movl $16, (%esp) ## Array.tart:103:11[ Array.tart:103:11 ]
        call _malloc ## Array.tart:103:11[ Array.tart:103:11 ]


– Talin

Well, I figured out why the line numbers were getting set to -1: My “add_custom_command” directive in my CMake file which was supposed to run dsymutil was silently failing:

    add_custom_command(
        OUTPUT "${EXE_FILE}.dSYM/Contents/Resources/DWARF/${EXE_FILE}"
        COMMAND ${DSYMUTIL} "${EXE_FILE}"
        MAIN_DEPENDENCY "${EXE_FILE}"
        COMMENT "Generating debug symbols for ${EXE_FILE}")

I don’t know why, but this directive does nothing and prints nothing. In any case, that’s not an LLVM problem so I’ll stop talking about it.

However, this leads to a new problem: Now when I manually run dsymutil and attempt to debug my executable, I get this:

DW_FORM_strp pointing outside of .debug_str section [in module /Users/talin/Projects/tart/build-eclipse/test/stdlib/ArrayListTest.dSYM/Contents/Resources/DWARF/ArrayListTest]

Once again, I have no idea what this means or how to go about debugging it. This is my biggest frustration with DIFactory - there’s absolutely no way to verify that the DWARF debugging information that I’ve emitted into my module is correct or even sensible. The only way to test it is to try and debug it with gdb, but all that will tell you is that something failed - it won’t tell you where or what. It’s not so much like looking for a needle in a haystack - more like looking for a particular needle in a needlestack.
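
For readers hitting the same message: a DW_FORM_strp attribute does not hold the string itself but an offset into the .debug_str section, so the error means that some offset points at or past the end of that section. Below is a minimal toy sketch of that bounds check, under the assumption that the section is available as a plain byte vector. `readStrp` is a hypothetical helper written for this note, not a gdb or LLVM function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Resolves a DW_FORM_strp-style offset against the raw bytes of a string
// section. Returns false when the offset lies outside the section or when no
// NUL terminator is found before the section ends. Hypothetical helper.
bool readStrp(const std::vector<char> &debugStr, std::size_t offset,
              std::string &out) {
  if (offset >= debugStr.size())
    return false; // the "pointing outside of .debug_str" situation
  std::string result;
  for (std::size_t i = offset; i < debugStr.size(); ++i) {
    if (debugStr[i] == '\0') {
      out = result;
      return true;
    }
    result.push_back(debugStr[i]);
  }
  return false; // ran off the end of the section without a terminator
}

// Usage sketch on a 9-byte section holding "int\0main\0".
void exampleStrp() {
  const std::vector<char> section = {'i', 'n', 't', '\0',
                                     'm', 'a', 'i', 'n', '\0'};
  std::string s;
  assert(readStrp(section, 4, s) && s == "main");
  assert(!readStrp(section, 9, s)); // offset equals the section size
}
```

The sketch only shows what the message is checking. It says nothing about which attribute carries the bad offset or why it was emitted that way.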

You can also try running "dwarfdump --verify" on the DWARF file that dsymutil produced (not on the dSYM bundle itself, but on the file inside the bundle). It sometimes gives a bit more information. Simply running "dwarfdump -a" can also help you spot wrong references (e.g., if the type info is structurally invalid, dwarfdump may simply stop dumping at that point, so you know where the error is).

Jonas

Ah OK thanks, that’s helpful.

I tried what you suggested, and it prints out about 4000 lines and then segfaults. The last lines that it prints out are:

    .debug_frame contents:

    0x00000000: CIE
        length: 0x00000010
        CIE_id: 0xffffffff
        version: 0x01
        augmentation: ""
        code_align: 1
        data_align: -4
        ra_register: 0x08
        DW_CFA_def_cfa (4 (esp), 4)
        DW_CFA_offset (8 (eip), 0)
        DW_CFA_nop
        DW_CFA_nop
        Instructions: Init State: CFA=esp+4 eip=[CFA]

    0x00000014: FDE
        length: 0x00000028
        CIE_pointer: 0x00000000

I’m not sure how the contents of this structure relate to the various LLVM API calls used in creating debugging info…
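
One thing the dump does show is how the entries are chained. In the 32-bit DWARF format the length field is 4 bytes and does not count itself, so the next entry starts at the current offset plus 4 plus the length. A minimal sketch of that arithmetic, checked against the offsets printed above (`nextEntryOffset` is a hypothetical helper, and the 32-bit format is an assumption based on the 4-byte length values in the dump):

```cpp
#include <cassert>
#include <cstdint>

// Offset of the entry that follows a CIE or FDE in .debug_frame, assuming the
// 32-bit DWARF format: a 4-byte length field that excludes its own size.
std::uint32_t nextEntryOffset(std::uint32_t entryOffset, std::uint32_t length) {
  // A length of 0xffffffff introduces the 64-bit format, which this sketch
  // does not handle.
  assert(length != 0xffffffffu);
  return entryOffset + 4u + length;
}

// Usage sketch with the values from the dump.
void exampleFromDump() {
  // The CIE at 0x00000000 with length 0x10 is followed by the FDE at 0x14.
  assert(nextEntryOffset(0x00000000u, 0x00000010u) == 0x00000014u);
}
```

By the same rule, the FDE at 0x14 with length 0x28 would end at 0x40, which is past the point where the dump stops.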

OK I figured it out, something is wrong with my descriptors for local variables…

Thanks again! :)

Hello,

May I ask what the issue was and how you solved it?

I get the same error just by emitting the compile unit di.

My code to create the compile unit di:
    di.createCompileUnit(DW_LANG_Poke,
            file == NULL ? "_nofile_" : file->getName(),
            file == NULL ? "" : file->getParent().getPath(),
            POKE_NAME_AND_VERSION,
            false, "", 0);

This is the only debug info generation I do. Without it gdb will accept the
resulting executable without complaints, but with it I get the "DW_FORM_strp
pointing outside of .debug_str section" error same as you did.

The executable is compiled from the following files by first running my
compiler, then llc on the output and finally clang on the assembler code:
llc -filetype=asm test2.pk.ll && clang *.s

Source code in my language: http://emil.djupfeldt.se/poke/test2.pk
Generated llvm code: http://emil.djupfeldt.se/poke/test2.pk.ll
Generated x86 assembler code: http://emil.djupfeldt.se/poke/test2.pk.s

regards,
Emil Djupfeldt