More DIFactory questions

Here are some issues that I am unclear about. What would be great is if the answers could be incorporated into the comments and documentation for DIFactory and DebugInfo.h:

  1. What types of DIScope are valid arguments for DebugLoc::get()? The method takes an MDNode* argument, so looking at the function signature is no help.

For example, DIFile is a subtype of DIScope; however, looking at DwarfDebug::recordSourceLine, I see that DIFile scopes are not valid for source lines.

  2. What is the proper DIScope for executable code that is not part of a source-level function? For example, suppose I have code that is used to generate the initialization expression for a static variable. A similar example would be a default constructor, or any of the various implicit casting functions used when dealing with dynamic types. Now, although that code is in an LLVM function, that function was synthesized by the compiler and was never defined in any source file, so even if I were to create a DISubprogram scope for it, there’s no valid source location information for it.

If DIFile were allowed as a scope, then the issue would be straightforward, I would just declare the scope to be the source file.

  3. Similarly, suppose I have synthesized code that is taken from a different module. An example is a static variable contained within a template defined in a different source file. Again, if DIFile were allowed as a scope, then I’d simply use the file which originally defined the template. I don’t think I can use DICompileUnit, since (I’m guessing) only one of those is allowed per module.

(Side note: I’ve never understood the relationship between DICompileUnit and DIFile. I’m guessing, however, that DICompileUnit acts like a container for all of the DIDescriptors within a module - that is, even if the DIDescriptor is referring to an external symbol, the compile unit for that descriptor is the module containing the reference, not the module of the target of the reference. DIFile, on the other hand, is I think the target. If this is not the case, then why have both?)

If my assumptions are correct, then, you can’t have more than one DICompileUnit per module, and the only way for code that was inlined from another module to indicate where it came from is to have a scope which is a DISubprogram, where that subprogram’s DIFile is the file from where the code was defined. However, if the code didn’t come from a function but was synthesized by the compiler, then I’m not sure how to generate a valid DISubprogram.

  4. What is the meaning of the “inlinedAt” argument for DebugLoc::get()? Does it mean the location where the inlined code was defined, or the location where it was expanded?

1) What types of DIScope are valid arguments for DebugLoc::get()? The method
takes an MDNode* argument, so looking at the function signature is no help.
For example, DIFile is a subtype of DIScope, however looking
at DwarfDebug::recordSourceLine, I see that DIFile scopes are not valid for
source lines.

Hi Talin,

I saw the same thing and I think it's an error in the code. There
shouldn't be any MDNode as arguments to DIFactory or DebugLoc or
anything related to creating debug information, otherwise, there is no
point in having those helpers. DIDescriptor has methods (isType,
isVariable, etc.) to help, but MDNode doesn't (nor should it).

2) What is the proper DIScope for executable code that is not part of a
source-level function? For example, suppose I have code that is used to
generate the initialization expression for a static variable. A similar
example would be a default constructor, or any of the various implicit
casting functions used when dealing with dynamic types. Now, although that
code is in an LLVM function, that function was synthesized by the compiler
and was never defined in any source file, so even if I were to create a
DISubprogram scope for it, there's no valid source location information for
it.
If DIFile were allowed as a scope, then the issue would be straightforward,
I would just declare the scope to be the source file.

I'm always passing DIDescriptor to DebugLoc::get(), which can be any
of DIFile, DISubprogram or DILexicalBlock...

Probably I'm bypassing the check, since the type now is DIDescriptor,
but I agree with you that DIFile should be accepted.

Another problem of not having real polymorphism... :confused:

If my assumptions are correct, then, you can't have more than one
DICompileUnit per module,

AFAIK, that is correct. Clang always sets the "main_unit" to true.

and the only way for code that was inlined from
another module to indicate where it came from is to have a scope which is a
DISubprogram, where that subprogram's DIFile is the file from where the code
was defined. However, if the code didn't come from a function but was
synthesized by the compiler, then I'm not sure how to generate a valid
DISubprogram.

I don't understand why you are generating Dwarf symbols for compiler
generated code. If you can't represent that instruction in a source
line, why would you want to generate line information for it?

On "next", debuggers normally step to the next instruction with line
information that is not the same as the current, so there is no
problem in not issuing line info for compiler generated code. Also, on
"step", debuggers tend to stay in the same line until another
instruction with debug information is being processed, again,
unaffected by instructions without debug info.
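
To make the stepping argument concrete, here is a toy model of a line-based "next". It is only a sketch of the behaviour described above, not how any particular debugger is implemented; `Inst` and `nextStop` are hypothetical names, and a line of 0 is assumed to mean "no line information".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: one entry per machine instruction. line == 0 is assumed to
// mean the instruction carries no line information (e.g. compiler-generated
// code).
struct Inst {
  unsigned line;
};

// Hypothetical helper: index of the instruction where a line-based "next"
// starting at `start` would stop, or insts.size() if it runs off the end.
std::size_t nextStop(const std::vector<Inst> &insts, std::size_t start) {
  assert(start < insts.size());
  const unsigned current = insts[start].line;
  for (std::size_t i = start + 1; i < insts.size(); ++i) {
    // Skip instructions without line info and those still on the same line.
    if (insts[i].line != 0 && insts[i].line != current)
      return i;
  }
  return insts.size();
}
```

In this model, instructions without line info never become a stopping point, so leaving them unannotated does not change where the user lands.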

The only thing I can think of doing with those is to print some
internal variables (VTT, TypeInfo, etc), and for that you can always
get the line info of the declaration (of the class, template, or
whatever).

$ cat foo.h
int bar() { return 42; }
$ cat foo.c
#include "foo.h"
void foo() { bar(); }

$ cat foo2.c
void foo2() { bar(); }

$ clang -c -g foo.c

Here one compile unit is created for foo.c. This compile unit will have two DIFile nodes, one for foo.c and one for foo.h.

$ clang -c -g foo2.c

Here one compile unit is created for foo2.c with one DIFile node for foo2.c.
However, if you do

$ clang -c -g foo.c -o foo.o
$ clang -c -g foo2.c -o foo2.o
$ llvm-ld foo.o foo2.o

then the generated bitcode file, treated as one module by the optimizer, will have two compile_units and three DIFile nodes.
I hope this helps.
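
The same counting can be sketched with a toy model. These structs are hypothetical stand-ins, not the real DICompileUnit and DIFile classes, and linking is assumed to simply concatenate the compile units of both inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a compile unit: the primary source file plus every file
// (including headers) that contributed code to it.
struct CompileUnit {
  std::string mainFile;
  std::set<std::string> files;
};

// Toy stand-in for a module: just its list of compile units.
struct Module {
  std::vector<CompileUnit> units;
};

// Linking is modeled as concatenating the compile units of both inputs.
Module linkModules(const Module &a, const Module &b) {
  Module out = a;
  out.units.insert(out.units.end(), b.units.begin(), b.units.end());
  return out;
}

// Count the distinct files referenced across all compile units.
std::size_t fileCount(const Module &m) {
  std::set<std::string> all;
  for (const CompileUnit &cu : m.units)
    all.insert(cu.files.begin(), cu.files.end());
  return all.size();
}
```

Feeding it one unit for foo.c (files foo.c and foo.h) and one for foo2.c (file foo2.c) gives a linked module with two compile units and three files, matching the llvm-ld case above.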

You can just create DISubprogram and set isArtificial flag to indicate that this is a compiler generated function. You can pick your main source file as the file and 0 for the line number here.

the location where it was expanded
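
As a rough illustration of that answer, here is a toy location record in which inlinedAt points at the call site the code was expanded into. `Loc` and `outermostCallSite` are hypothetical names for this sketch and are not the real DebugLoc API.

```cpp
#include <cassert>
#include <string>

// Toy source location, loosely modeled on a (line, scope, inlinedAt) triple.
struct Loc {
  unsigned line;
  std::string scope;     // name of the enclosing function
  const Loc *inlinedAt;  // call site this code was expanded into, or null
};

// Walk the inlinedAt chain outward to the outermost, non-inlined location,
// i.e. the place in the caller where the code ended up.
const Loc &outermostCallSite(const Loc &loc) {
  const Loc *cur = &loc;
  while (cur->inlinedAt != nullptr)
    cur = cur->inlinedAt;
  return *cur;
}
```

The line and scope fields still describe where the inlined code was defined; only the inlinedAt link describes where it was expanded.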

OK here’s another question along these lines: According to the LLVM source level debugging manual:

The first member of subroutine (tag = DW_TAG_subroutine_type) type elements is the return type for the subroutine. The remaining elements are the formal arguments to the subroutine.

Now, when I read “formal arguments” I’m assuming we’re talking about DIEs of type DW_TAG_formal_parameter. However, when I look in the code in CGDebugInfo.cpp in clang, I see that the arguments are in fact the bare types, not the formal parameter declarations.

Here’s what my code looks like:

const ParameterList & params = type->params();
for (ParameterList::const_iterator it = params.begin(); it != params.end(); ++it) {
  const ParameterDefn * param = *it;
  DIType ptype = genDIParameterType(param->type());
  ptype = dbgFactory_.CreateDerivedTypeEx(
      dwarf::DW_TAG_formal_parameter,
      dbgCompileUnit_,
      param->name() != NULL ? param->name() : "",
      genDIFile(param),
      getSourceLineNumber(param->location()),
      getSizeOfInBits(param->internalType()->irParameterType()),
      getInt64Val(0),
      getInt64Val(0), 0,
      ptype);
  DASSERT(ptype.Verify());
  args.push_back(ptype);
}

However, if I go by what’s in clang, it seems that the DW_TAG_formal_parameter is unnecessary. Is this correct?

And I’d still like to see some of these questions addressed in the actual HTML documentation, as opposed to just responding here on the mailing list. :slight_smile: