debug metadata incomplete for array arguments to functions?

Hello,

Consider the following two functions:

void foo(int* arg_ptr) {

}

void bar(int arg_arr[42]) {

}

In the debug metadata Clang emits, arg_arr is described as a pointer to int, exactly like arg_ptr.

This reflects the compiler’s view of things correctly, but is problematic for a debugger. The debugger should know that arg_arr refers to a 42-element array and isn’t just a pointer into a buffer of unspecified length. This is what the user would expect.
On the other hand, the debugger also needs to know that arg_arr is actually a pointer in order to read its values correctly.
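
To make the compiler's view concrete, here is a minimal C++ sketch. It only illustrates the language rule that an array parameter is adjusted to a pointer; it says nothing about how the debug metadata is built. The function bodies and the helper name param_is_pointer are made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

void foo(int* arg_ptr) { (void)arg_ptr; }
void bar(int arg_arr[42]) { (void)arg_arr; }

// The array parameter is adjusted to int*, so both functions have the same type.
static_assert(std::is_same<decltype(foo), decltype(bar)>::value,
              "foo and bar have the same function type");
static_assert(std::is_same<decltype(bar), void(int*)>::value,
              "bar really takes an int*");

// Hypothetical helper: reports whether the parameter, as seen inside the
// function, is a plain int*. The written bound of 42 is not part of its type.
bool param_is_pointer(int arg_arr[42]) {
    (void)arg_arr;
    return std::is_same<decltype(arg_arr), int*>::value;
}
```

Calling param_is_pointer with an array of any length compiles and returns true, which is exactly the information the debugger ends up with.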

Is there a way out?

The only way out is to figure out how to distinguish between these two in front-end AST nodes while emitting LLVM IR. The part of front end that emits debug info for an argument is seeing arg_arr as a pointer to int.

If you manually patch the metadata in LLVM IR, you’ll get the debug info you expect.

Suppose one were to start writing a patch to Clang to rectify this: how would this information be encoded in the debug metadata, given the dual nature of the arg_arr argument? Is there a mechanism to support it, or is an extension required?

Thanks in advance
Eli

Hi Eli,

The first thing is to make sure it actually makes any difference. I'd
take IR that is valid and can be debugged (but prints the wrong
declaration) and start by manually changing the metadata in the IR
until it achieves what you want (it works AND prints the correct
declaration). Only then would I start changing Clang...

DWARF is too generic and debugger support is too ad hoc for anyone to
speak of a "correct implementation". I wouldn't assume anything before seeing
it work on a number of debuggers. DWARF can be syntactically
correct and still not be understood by some debuggers, but it must never be
syntactically incorrect, and I gather from your comment that yours
isn't.

You can also see what other tool-chains generate from your example. It
may be the quickest path to get the right answer.

cheers,
--renato

Renato, I’m not sure I understand what you mean.
IIUC, debug metadata in IR is the source from which DWARF is eventually generated. If the debug metadata format doesn’t support such a construct (providing an alternative “type view” for a pointer), there isn’t much that can be done to get it code-generated into DWARF. So I asked Devang whether, to his knowledge, such a mechanism exists in the current debug metadata format or could be implemented with existing fields.

If yes, all I need is to improve debug metadata generation from clang for this case.

If not, then perhaps an extension to the debug metadata format is required to support this.

Are you implying that DWARF doesn’t support this construct, so this is a “lost cause”?

Thanks,
Eli

Absolutely not!

I was just giving you some hints on how to investigate *how* this
construct needs to be represented in IR/Dwarf.

Using other tool-chains to produce DWARF that is both correct and
reflects the code as written will help you work out what the IR should
look like.

I was also suggesting, as Devang said, that you start by changing
the IR by hand before trying more adventurous changes in Clang.

*If* it's not possible to represent that in IR (I have no idea), then
LLVM needs to be changed first.

Only when you're sure that there is a clear representation of your
problem in IR should you go on to change Clang.

cheers,
--renato

Aha… dual mode. I am not sure there is a straightforward way in DWARF to explicitly say this type is int[42] but treat it as an int*.

One possible approach is to use the array type in the subprogram declaration and int* in the subprogram definition,
and teach the debugger to understand what you are trying to say.

That's the hard part.

--renato

Can't you just encode it as if it were a reference to a static array?
In other words, generate debug info for
  void bar(int arg_arr[42]) {
  ...
  }
as if it said
  void bar(int (&arg_arr)[42]) {
  ...
  }
instead.

A reference should be equivalent to a pointer for codegen purposes,
and this should tell the debugger that the parameter is actually an
array according to the source code.

I don't know whether this would lead to the debugger showing the
implicit '&' when printing the parameter, but I don't think
that would be a big problem.
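
As a rough illustration of what the reference form would preserve, the C++ sketch below compares the array bound visible in each parameter's type. It is a language-level analogy only, not a statement about what any debugger does with the resulting debug info. The helper names extent_seen_as_written and extent_seen_as_reference are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Written form: the parameter type is int*, so no array bound survives.
std::size_t extent_seen_as_written(int arg_arr[42]) {
    (void)arg_arr;
    return std::extent<std::remove_reference<decltype(arg_arr)>::type>::value;
}

// Reference form: the parameter type is int (&)[42], so the bound is kept.
std::size_t extent_seen_as_reference(int (&arg_arr)[42]) {
    (void)arg_arr;
    return std::extent<std::remove_reference<decltype(arg_arr)>::type>::value;
}
```

Given an int[42] argument, the first helper reports an extent of 0 (std::extent of a non-array type) and the second reports 42.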

That would lead to &arg_arr having the wrong type and (furthermore)
would prevent users from calling the function with an array not of size 42.
You don't want to go down this road.
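
Both objections can be checked at the source level with a short C++ sketch. The names bar_written, bar_as_ref, callable_with and the two address helpers are made up for illustration, and the sketch assumes only C++11.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

void bar_written(int arg_arr[42]) { (void)arg_arr; }    // the original signature
void bar_as_ref(int (&arg_arr)[42]) { (void)arg_arr; }  // the proposed encoding

// Hypothetical detector: true if calling an F with an Arg is well-formed.
template <typename F, typename Arg, typename = void>
struct callable_with : std::false_type {};

template <typename F, typename Arg>
struct callable_with<F, Arg,
                     decltype(void(std::declval<F>()(std::declval<Arg>())))>
    : std::true_type {};

using Written = decltype(&bar_written);
using AsRef = decltype(&bar_as_ref);

// The written signature accepts arrays of any size; the reference form does not.
static_assert(callable_with<Written, int (&)[10]>::value, "any size is accepted");
static_assert(callable_with<Written, int (&)[42]>::value, "42 is accepted");
static_assert(!callable_with<AsRef, int (&)[10]>::value, "size 10 is rejected");
static_assert(callable_with<AsRef, int (&)[42]>::value, "only 42 is accepted");

// &arg_arr is int** in the written form...
bool addr_is_ptr_to_ptr(int arg_arr[42]) {
    return std::is_same<decltype(&arg_arr), int**>::value;
}

// ...but int (*)[42] in the reference form.
bool addr_is_ptr_to_array(int (&arg_arr)[42]) {
    return std::is_same<decltype(&arg_arr), int (*)[42]>::value;
}
```

So a debugger that took the reference encoding at face value would evaluate &arg_arr with a type that does not match the source, and would consider calls with differently sized arrays ill-formed.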

The interesting question here is whether there's any way to represent
the "written" type in DWARF. If not, we could certainly introduce an
extension, and then hack various debuggers to support it, but I'm not
sure this problem really merits that.

John.