Question WRT llvm.dbg.value

Hello Everyone,

I have a general question WRT the semantics of the llvm.dbg.value intrinsic.

Under what circumstances should a frontend choose to emit llvm.dbg.value for a local variable at -O0 (no optimization)?

I saw some debuginfo code in flang (the older one) that emits llvm.dbg.value for every load of a local variable, even though, as the IR snippet below shows, it has already emitted llvm.dbg.declare for that variable.

IR snippet of a subprogram from flang:

Hi Sourabh,

> Under what circumstances should a frontend choose to emit(at -O0(No optimization)) llvm.dbg.value for a local variable.
>
> I saw some debuginfo code in flang(older one), sort of it choose to emit *llvm.dbg.value* for *every load operation* happening on a *local variable*. And as noted below in IR snippet it has already emitted *llvm.dbg.declare* for the local variable.

[...]

> call void @llvm.dbg.declare(metadata i32* %foo, metadata !9, metadata !DIExpression()), !dbg !11
> %0 = load i32, i32* %foo, align 4, !dbg !13
> call void @llvm.dbg.value(metadata i32 %0, metadata !9, metadata !DIExpression()), !dbg !11

My understanding is that this isn't correct: dbg.declare specifies the
memory address of a variable for the whole lifetime of the function,
whereas dbg.value (and dbg.addr) specify the value/address until the
next debug intrinsic. Mixing these two kinds of intrinsics creates
ambiguity over where the variable location is at different positions
in the code.

If dbg.value intrinsics are to be used and the variable can be located
in memory too, then the producer needs to specify where the location
switches from a value to an address (and vice versa) with dbg.value /
dbg.addr. Awkwardly, I think there are some issues with dbg.addr at -O0
that Brian ran into here [0, 1], which might need addressing.

[0] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139500.html
[1] http://lists.llvm.org/pipermail/llvm-dev/2020-February/139511.html

There are also some issues with how SelectionDAG places the resulting DBG_VALUE
instructions for dbg.addr: https://bugs.llvm.org/show_bug.cgi?id=35318

Best regards,
David
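
The rule described above (dbg.declare gives an address for the whole function, dbg.value gives a value until the next debug intrinsic) can be illustrated with a small standalone sketch. It is a toy model only: `DbgKind`, `DbgRecord` and `locationAt` are hypothetical names that do not exist in LLVM, and returning an empty string for the mixed case is a modelling choice made here to show the ambiguity, not a description of what LLVM actually does.

```cpp
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and are not LLVM classes.
enum class DbgKind { Declare, Value };

struct DbgRecord {
    DbgKind kind;
    int position;          // index of the instruction where the intrinsic appears
    std::string location;  // e.g. "mem:%foo" or "val:%0"
};

// Returns the location of ONE variable at `position`, or "" if unknown.
// - Only dbg.declare records: the address holds for the whole function.
// - Only dbg.value records: the latest record at or before `position` wins.
// - Both kinds present: treated as ambiguous here, so "" is returned.
std::string locationAt(const std::vector<DbgRecord>& records, int position) {
    bool hasDeclare = false;
    bool hasValue = false;
    for (const DbgRecord& r : records) {
        if (r.kind == DbgKind::Declare) {
            hasDeclare = true;
        } else {
            hasValue = true;
        }
    }
    if (hasDeclare && hasValue) {
        return "";  // assumption of this sketch: mixing is reported as ambiguous
    }

    const DbgRecord* best = nullptr;
    for (const DbgRecord& r : records) {
        if (r.kind == DbgKind::Declare) {
            return r.location;  // position is irrelevant for a declare
        }
        if (r.position <= position && (!best || r.position >= best->position)) {
            best = &r;
        }
    }
    return best ? best->location : "";
}
```

With the flang snippet quoted above, the variable would have one `Declare` record and one `Value` record, which this model reports as having no unambiguous location.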

Hi Sourabh,

> > Under what circumstances should a frontend choose to emit(at -O0(No optimization)) llvm.dbg.value for a local variable.
> >
> > I saw some debuginfo code in flang(older one), sort of it choose to emit llvm.dbg.value for every load operation happening on a local variable. And as noted below in IR snippet it has already emitted llvm.dbg.declare for the local variable.

[…]

> > call void @llvm.dbg.declare(metadata i32* %foo, metadata !9, metadata !DIExpression()), !dbg !11
> > %0 = load i32, i32* %foo, align 4, !dbg !13
> > call void @llvm.dbg.value(metadata i32 %0, metadata !9, metadata !DIExpression()), !dbg !11

> My understanding is that this isn’t correct: dbg.declare specifies the
> memory address of a variable for the whole lifetime of the function,
> whereas dbg.value (and dbg.addr) specify the value/address until the
> next debug intrinsic. Mixing these two kinds of intrinsics creates
> ambiguity over where the variable location is at different positions
> in the code.

Correct, you should not be mixing dbg.declare and other intrinsics for the same variable.

See also https://llvm.org/docs/SourceLevelDebugging.html#llvm-dbg-declare

– adrian

> > My understanding is that this isn’t correct: dbg.declare specifies the
> > memory address of a variable for the whole lifetime of the function,
> > whereas dbg.value (and dbg.addr) specify the value/address until the
> > next debug intrinsic. Mixing these two kinds of intrinsics creates
> > ambiguity over where the variable location is at different positions
> > in the code.
>
> Correct, you should not be mixing dbg.declare and other intrinsics for the same variable


How about patching LLVM to enforce this? Currently the IR shown above is valid and is processed by LLVM for target code generation.
Should we go ahead and invalidate this behavior, as in "declare and value intrinsics can't be specified for the same local variable"?


That way no FE would generate this sort of thing in the first place. Clang doesn't do that, so this change should not affect Clang.


Thanks,
Sourabh.

Hi!

> > > My understanding is that this isn't correct: dbg.declare specifies the
> > > memory address of a variable for the whole lifetime of the function,
> > > whereas dbg.value (and dbg.addr) specify the value/address until the
> > > next debug intrinsic. Mixing these two kinds of intrinsics creates
> > > ambiguity over where the variable location is at different positions
> > > in the code.
> >
> > Correct, you should not be mixing dbg.declare and other
> > intrinsics for the same variable
>
> How about patching up llvm for the same, currently the IR showed above is
> valid and can be processed by llvm for target code generation.
> Should we move ahead invalidate this behavior as in "declare and value
> intrinsic can't be specified for same local variable". ?

Do you mean documenting the desired frontend behavior, or adding some verifier in
LLVM? A warning for the latter is that SROA may currently emit IR that contains a
mix of declares and values for different fragments of an aggregate variable, so I
assume that is something that would need to be fixed then.

Here is a ticket for that: https://bugs.llvm.org/show_bug.cgi?id=39314

In that case LLVM incorrectly emits a single location description using the
information from the declares, and ignores the values, which would have produced
a location list.

Best regards,
David

Adding this to the Verifier sounds like a good idea to me. It may be possible that this uncovers existing bugs in the current flow, but that would be a good thing.

-- adrian
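
The kind of check being discussed can be sketched as follows. This is a toy, standalone model and not the LLVM Verifier API: `IntrinsicKind`, `DebugUse` and `findMixedVariables` are hypothetical names, and variables are identified by a plain integer standing in for the `DILocalVariable` node (for example 15 for `!15`). The sketch deliberately ignores fragments, so the SROA output mentioned above (a declare for one fragment and a value for another fragment of the same variable) would be flagged as well.

```cpp
#include <map>
#include <utility>
#include <vector>

// Toy model only: hypothetical types, not part of LLVM.
enum class IntrinsicKind { Declare, Value };

struct DebugUse {
    int variableId;               // stands in for the DILocalVariable node
    IntrinsicKind kind;           // which intrinsic describes the variable
    unsigned fragmentOffsetBits;  // DW_OP_LLVM_fragment offset (unused by the check)
    unsigned fragmentSizeBits;    // DW_OP_LLVM_fragment size, 0 = whole variable
};

// Returns the ids of all variables that are described by both a dbg.declare
// and a dbg.value somewhere in the function, in ascending order.
std::vector<int> findMixedVariables(const std::vector<DebugUse>& uses) {
    // first = seen in a declare, second = seen in a value
    std::map<int, std::pair<bool, bool> > seen;
    for (const DebugUse& use : uses) {
        std::pair<bool, bool>& flags = seen[use.variableId];
        if (use.kind == IntrinsicKind::Declare) {
            flags.first = true;
        } else {
            flags.second = true;
        }
    }

    std::vector<int> mixed;
    for (const auto& entry : seen) {
        if (entry.second.first && entry.second.second) {
            mixed.push_back(entry.first);
        }
    }
    return mixed;
}
```

A check that is meant to tolerate the current SROA behaviour would have to compare the fragment ranges instead of ignoring them; whether that is desirable is exactly the open question in this thread.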

> Do you mean documenting the desired frontend behavior, or adding some verifier in
> LLVM? A warning for the latter is that SROA may currently emit IR that contains a
> mix of declares and values for different fragments of an aggregate variable, so I
> assume that is something that would need to be fixed then.

I had a quick look at that PR (thanks for sharing). A verifier check, if implemented at the LLVM level, would invalidate this too. Would that be good?
That brings up one other question: after SROA (as in the present case) there can be a mix of dbg.declare and dbg.value left for the same variable.
Snippet from the PR:
[...]

call void @llvm.dbg.declare(metadata i64* %arr.sroa.0, metadata !15, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 64)), !dbg !23

store i64 0, i64* %arr.sroa.0, align 16, !dbg !23
call void @llvm.dbg.value(metadata i64 0, metadata !15, metadata !DIExpression(DW_OP_LLVM_fragment, 64, 64)), !dbg !23

[...]

Does the presence of this *dbg.value* suggest that the location described earlier by *dbg.declare* is invalidated from this point on? I suppose not, since the docs say *it's valid for the entire lifetime*. Then why mix them in the first place?

The documentation describes our intention and reality often lags behind. I think that we should fix the code to convert the dbg.declare into a dbg.value (or dbg.addr?) that points into the alloca as well. I’m guessing that it will not be easy to do this (we can call Local.cpp’s LowerDbgDeclare functionality beforehand, but that may actually make SROA’s job harder because it needs to find and update more values that may be difficult to associate in reverse…), but having the verifier should help in the process.

– adrian

This post continues the discussion on checking the validity of the IR (debug info) emitted by different FEs.

The flang FE at -O0 (no optimization) sometimes emits a DILocalVariable without a name, for example:

[…]
!15 = !DILocalVariable(scope: !5, file: !3, type: !8, flags: DIFlagArtificial)

This is valid IR, since the name field is optional during parsing and validation.

This eventually ends up in DWARF as:
0x0000006d: DW_TAG_variable
DW_AT_location (DW_OP_fbreg -12)
DW_AT_type (0x00000093 "integer")
DW_AT_artificial (0x01)

[…]

Is this information useful to a debugger? The variable doesn't have any name.

While looking for an answer I did some digging into clang/llvm; at first glance it doesn't seem to check for this.
Snippet from DebugInfoMetadata.cpp:
[…]
DILocalVariable *DILocalVariable::getImpl(LLVMContext &Context, Metadata *Scope,

assert(isCanonical(Name) && "Expected canonical MDString");

[…]

Is this functionality desired/needed, i.e. having the verifier validate anonymous DILocalVariables?

Does an anonymous DILocalVariable, and the resulting anonymous DW_TAG_variable in the DWARF, have some other use that I'm missing at this point?

Can anybody provide some clarity WRT this?

Thank You!
Sourabh.

> This post is in continuation of checking the validity of IR(debug-info) emitted by different FE’s.
>
> flang FE at -O0(No Opt) level sometime emits – DILocalVariable without name as in a sense –
>
> […]
> !15 = !DILocalVariable(scope: !5, file: !3, type: !8, flags: DIFlagArtificial) -- Valid IR, since name(field) is optional while parsing and validation.
>
> Which eventually turned out in dwarf as –
> 0x0000006d: DW_TAG_variable ---------- Is this information useful for debugger ? Variable doesn’t have any name.
> DW_AT_location (DW_OP_fbreg -12)
> DW_AT_type (0x00000093 "integer")
> DW_AT_artificial (0x01)
>
> […]

Since the variable is also marked as artificial, I am guessing that it might be a helper variable, for example one that is used to specify dynamic array bounds in an array type. Clang does something similar for variable-length arrays, though Clang gives these variables a generic name.

> While looking for answer did some digging into clang/llvm , initially seems like it doesn’t --
> Snippet from DebugInfoMetadata.cpp
> […]
> DILocalVariable *DILocalVariable::getImpl(LLVMContext &Context, Metadata *Scope,
>
> assert(isCanonical(Name) && "Expected canonical MDString");
>
> […]
>
> Is this functionality desired/needed ? To have verifier validate anonymous DILocalVariable.
>
> Is this anonymous DILocalVariable and subsequent anonymous DW_TAG_variable dwarf has some other use, that I’m missing out at this point.
>
> Can anybody provide some clarity WRT this.

If this is indeed a helper variable referenced from a type, I think this behavior should be allowed.

– adrian
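
One possible reading of the position above, written as a standalone sketch: a frontend-side or verifier-side check could accept an unnamed local variable only when it is an artificial helper referenced from a type. This is an assumed policy for illustration, not something LLVM implements; `LocalVariableInfo`, its fields and `isAcceptableLocalVariable` are hypothetical names.

```cpp
#include <string>

// Toy model only: a hypothetical summary of a DILocalVariable.
struct LocalVariableInfo {
    std::string name;         // empty when the variable has no name
    bool isArtificial;        // mirrors DIFlagArtificial
    bool referencedFromType;  // e.g. used as a dynamic array bound by a type
};

// Assumed policy: named variables are always fine; unnamed ones are accepted
// only if they are artificial helpers that some type refers to.
bool isAcceptableLocalVariable(const LocalVariableInfo& var) {
    if (!var.name.empty()) {
        return true;
    }
    return var.isArtificial && var.referencedFromType;
}
```

Under this policy the flang variable `!15` shown earlier would be accepted only if a type actually refers to it, which the quoted IR does not show.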