Question WRT llvm.dbg.value

Sourabh_Singh_Tomar · March 30, 2020, 7:08am

Hello Everyone,

I have a general question about the semantics of the llvm.dbg.value intrinsic.

Under what circumstances should a frontend emit llvm.dbg.value for a local variable at -O0 (no optimization)?

I saw some debug-info code in the older flang that appears to emit llvm.dbg.value for every load of a local variable. As the IR snippet below shows, it has already emitted llvm.dbg.declare for that variable.

IR snippet of a subprogram from flang -

Jeremy_Morse · March 30, 2020, 11:13am

Hi Sourabh,

> Under what circumstances should a frontend emit llvm.dbg.value for a local variable at -O0 (no optimization)?
>
> I saw some debug-info code in the older flang that appears to emit *llvm.dbg.value* for *every load* of a *local variable*. As the IR snippet below shows, it has already emitted *llvm.dbg.declare* for that variable.
>
> [...]
>
> call void @llvm.dbg.declare(metadata i32* %foo, metadata !9, metadata !DIExpression()), !dbg !11
> %0 = load i32, i32* %foo, align 4, !dbg !13
> call void @llvm.dbg.value(metadata i32 %0, metadata !9, metadata !DIExpression()), !dbg !11

My understanding is that this isn't correct: dbg.declare specifies the
memory address of a variable for the whole lifetime of the function,
whereas dbg.value (and dbg.addr) specify the value/address until the
next debug intrinsic. Mixing these two kinds of intrinsics creates
ambiguity over where the variable location is at different positions
in the code.

If dbg.value intrinsics are to be used and the variable can be located
in memory too, then the producer needs to specify where the location
switches from a value to an address (and vice versa) with dbg.value /
dbg.addr. Awkwardly, I think there are some issues with dbg.addr at -O0
that Brian ran into here [0, 1], which might need addressing.

[0] [llvm-dev] Why is lldb telling me "variable not available"?
[1] [llvm-dev] Why is lldb telling me "variable not available"?

David_Stenberg · March 30, 2020, 2:38pm

Hi!

> Hi Sourabh,
>
> > Under what circumstances should a frontend emit llvm.dbg.value for a
> > local variable at -O0 (no optimization)?
> >
> > I saw some debug-info code in the older flang that appears to emit
> > *llvm.dbg.value* for *every load* of a *local variable*. As the IR
> > snippet below shows, it has already emitted *llvm.dbg.declare* for
> > that variable.
>
> [...]
>
> > call void @llvm.dbg.declare(metadata i32* %foo, metadata !9, metadata !DIExpression()), !dbg !11
> > %0 = load i32, i32* %foo, align 4, !dbg !13
> > call void @llvm.dbg.value(metadata i32 %0, metadata !9, metadata !DIExpression()), !dbg !11
>
> My understanding is that this isn't correct: dbg.declare specifies the
> memory address of a variable for the whole lifetime of the function,
> whereas dbg.value (and dbg.addr) specify the value/address until the
> next debug intrinsic. Mixing these two kinds of intrinsics creates
> ambiguity over where the variable location is at different positions
> in the code.
>
> If dbg.value intrinsics are to be used and the variable can be located
> in memory too, then the producer needs to specify where the location
> switches from a value to an address (and vice versa) with dbg.value /
> dbg.addr. Awkwardly, I think there are some issues with dbg.addr at -O0
> that Brian ran into here [0, 1], which might need addressing.

There are also some issues with how SelectionDAG places the resulting DBG_VALUE
instructions for dbg.addr: 35318 – dbg.addr location lists are wrong

Best regards,
David

adrian.prantl · March 30, 2020, 8:54pm

> Hi Sourabh,
>
> > Under what circumstances should a frontend emit llvm.dbg.value for a local variable at -O0 (no optimization)?
> >
> > I saw some debug-info code in the older flang that appears to emit llvm.dbg.value for every load of a local variable. As the IR snippet below shows, it has already emitted llvm.dbg.declare for that variable.
>
> […]
>
> > call void @llvm.dbg.declare(metadata i32* %foo, metadata !9, metadata !DIExpression()), !dbg !11
> > %0 = load i32, i32* %foo, align 4, !dbg !13
> > call void @llvm.dbg.value(metadata i32 %0, metadata !9, metadata !DIExpression()), !dbg !11
>
> My understanding is that this isn't correct: dbg.declare specifies the
> memory address of a variable for the whole lifetime of the function,
> whereas dbg.value (and dbg.addr) specify the value/address until the
> next debug intrinsic. Mixing these two kinds of intrinsics creates
> ambiguity over where the variable location is at different positions
> in the code.

Correct, you should not be mixing dbg.declare and other intrinsics for the same variable.

See also https://llvm.org/docs/SourceLevelDebugging.html#llvm-dbg-declare

– adrian

Sourabh_Singh_Tomar · March 31, 2020, 6:57am

> > My understanding is that this isn't correct: dbg.declare specifies the
> > memory address of a variable for the whole lifetime of the function,
> > whereas dbg.value (and dbg.addr) specify the value/address until the
> > next debug intrinsic. Mixing these two kinds of intrinsics creates
> > ambiguity over where the variable location is at different positions
> > in the code.
>
> Correct, you should not be mixing dbg.declare and other intrinsics for the same variable.

How about patching LLVM to enforce this? Currently the IR shown above is valid and LLVM can process it for target code generation.
Should we go ahead and invalidate this behavior, i.e. "declare and value intrinsics can't be specified for the same local variable"?

That way no frontend would generate this sort of thing in the first place. Clang doesn't do this, so the change should not affect Clang.

Thanks,
Sourabh.


David_Stenberg · March 31, 2020, 8:58am

Hi!

> > > My understanding is that this isn't correct: dbg.declare specifies the
> > > memory address of a variable for the whole lifetime of the function,
> > > whereas dbg.value (and dbg.addr) specify the value/address until the
> > > next debug intrinsic. Mixing these two kinds of intrinsics creates
> > > ambiguity over where the variable location is at different positions
> > > in the code.
> >
> > Correct, you should not be mixing dbg.declare and other
> > intrinsics for the same variable.
>
> How about patching LLVM to enforce this? Currently the IR shown above is
> valid and LLVM can process it for target code generation.
> Should we go ahead and invalidate this behavior, i.e. "declare and value
> intrinsics can't be specified for the same local variable"?

Do you mean documenting the desired frontend behavior, or adding some verifier in
LLVM? A warning for the latter is that SROA may currently emit IR that contains a
mix of declares and values for different fragments of an aggregate variable, so I
assume that is something that would need to be fixed then.

Here is a ticket for that: 39314 – [DebugInfo] Location list for variable is missing due to mix of dbg.declare and dbg.value

In that case LLVM incorrectly emits a single location description using the
information from the declares, and ignores the values, which would have produced
a location list.

Best regards,
David

adrian.prantl · March 31, 2020, 3:03pm

Adding this to the Verifier sounds like a good idea to me. It may be possible that this uncovers existing bugs in the current flow, but that would be a good thing.

-- adrian
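
The rule proposed in this thread can be illustrated outside LLVM with a toy check over textual IR. This is only a rough Python sketch, not LLVM's Verifier: the function name `mixed_debug_variables` and the regular expression are assumptions made for illustration, and the pattern only handles the simple operand shapes quoted in this thread (it would miss a first operand that contains a comma). It groups debug intrinsic calls by their variable metadata operand and reports variables that have a dbg.declare together with a dbg.value or dbg.addr.

```python
import re
from collections import defaultdict

# Matches calls such as:
#   call void @llvm.dbg.value(metadata i32 %0, metadata !9, metadata !DIExpression()), ...
# Group 1 is the intrinsic kind, group 2 the variable's metadata id.
# Assumption: the first operand contains no comma.
DBG_CALL = re.compile(
    r"@llvm\.dbg\.(declare|value|addr)\(metadata [^,]+, metadata (!\d+),"
)


def mixed_debug_variables(ir_text):
    """Return {variable id: sorted kinds} for variables that have a
    dbg.declare together with a dbg.value or dbg.addr."""
    kinds = defaultdict(set)
    for kind, var in DBG_CALL.findall(ir_text):
        kinds[var].add(kind)
    return {
        var: sorted(ks)
        for var, ks in kinds.items()
        if "declare" in ks and len(ks) > 1
    }


# The flang snippet quoted earlier in the thread.
FLANG_SNIPPET = """
call void @llvm.dbg.declare(metadata i32* %foo, metadata !9, metadata !DIExpression()), !dbg !11
%0 = load i32, i32* %foo, align 4, !dbg !13
call void @llvm.dbg.value(metadata i32 %0, metadata !9, metadata !DIExpression()), !dbg !11
"""

print(mixed_debug_variables(FLANG_SNIPPET))  # {'!9': ['declare', 'value']}
```

A real check would live in the Verifier and work on the in-memory IR rather than on text; the sketch only shows which property would be rejected.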

Sourabh_Singh_Tomar · April 1, 2020, 9:56am

> Do you mean documenting the desired frontend behavior, or adding some verifier in
> LLVM? A warning for the latter is that SROA may currently emit IR that contains a
> mix of declares and values for different fragments of an aggregate variable, so I
> assume that is something that would need to be fixed then.

I had a quick look at that PR (thanks for sharing). A verifier, if implemented at the LLVM level, would invalidate this too. Would that be good?
That brings up another question: after SROA (as in the present case), a mix of dbg.declare and dbg.value for the same variable can be left behind.
Snippet from the PR:
[...]

call void @llvm.dbg.declare(metadata i64* %arr.sroa.0, metadata !15, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 64)), !dbg !23

adrian.prantl · April 1, 2020, 4:49pm

> > Do you mean documenting the desired frontend behavior, or adding some verifier in
> > LLVM? A warning for the latter is that SROA may currently emit IR that contains a
> > mix of declares and values for different fragments of an aggregate variable, so I
> > assume that is something that would need to be fixed then.
>
> I had a quick look at that PR (thanks for sharing). A verifier, if implemented at the LLVM level, would invalidate this too. Would that be good?
> That brings up another question: after SROA (as in the present case), a mix of dbg.declare and dbg.value for the same variable can be left behind.
> Snippet from the PR:
> [...]
> call void @llvm.dbg.declare(metadata i64* %arr.sroa.0, metadata !15, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 64)), !dbg !23
> store i64 0, i64* %arr.sroa.0, align 16, !dbg !23
> call void @llvm.dbg.value(metadata i64 0, metadata !15, metadata !DIExpression(DW_OP_LLVM_fragment, 64, 64)), !dbg !23
> [...]
>
> Does the presence of *dbg.value* suggest that the previous location/value described by *dbg.declare* is invalidated from this point on? I suppose not, since the docs say *it's valid for the entire lifetime*. Then why mix them in the first place?

The documentation describes our intention and reality often lags behind. I think that we should fix the code to convert the dbg.declare into a dbg.value (or dbg.addr?) that points into the alloca as well. I’m guessing that it will not be easy to do this (we can call Local.cpp’s LowerDbgDeclare functionality beforehand, but that may actually make SROA’s job harder because it needs to find and update more values that may be difficult to associate in reverse…), but having the verifier should help in the process.

– adrian
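
To make the SROA case concrete, here is a small Python sketch over textual IR. It is an illustration under assumptions, not LLVM code: the helper names `kinds_by_fragment` and `kinds_by_variable` are hypothetical, and the regular expressions only cover the call shapes quoted in this thread. It shows that the SROA output does not mix intrinsics within any single fragment, yet does mix them for the variable as a whole, which is what a per-variable verifier rule would reject.

```python
import re
from collections import defaultdict

# Group 1: intrinsic kind, group 2: variable metadata id,
# group 3: the operands inside !DIExpression(...).
# Assumption: the first operand contains no comma.
DBG_CALL = re.compile(
    r"@llvm\.dbg\.(declare|value|addr)\("
    r"metadata [^,]+, metadata (!\d+), metadata !DIExpression\(([^)]*)\)"
)
FRAGMENT = re.compile(r"DW_OP_LLVM_fragment, (\d+), (\d+)")


def kinds_by_fragment(ir_text):
    """Map (variable id, fragment) -> set of intrinsic kinds describing it.

    fragment is (offset_in_bits, size_in_bits), or None for the whole variable.
    """
    kinds = defaultdict(set)
    for kind, var, expr in DBG_CALL.findall(ir_text):
        match = FRAGMENT.search(expr)
        fragment = (int(match.group(1)), int(match.group(2))) if match else None
        kinds[(var, fragment)].add(kind)
    return dict(kinds)


def kinds_by_variable(ir_text):
    """Merge the per-fragment view into one set of kinds per variable."""
    merged = defaultdict(set)
    for (var, _fragment), kinds in kinds_by_fragment(ir_text).items():
        merged[var] |= kinds
    return dict(merged)


# The SROA output discussed in this thread.
SROA_SNIPPET = """
call void @llvm.dbg.declare(metadata i64* %arr.sroa.0, metadata !15, metadata !DIExpression(DW_OP_LLVM_fragment, 0, 64)), !dbg !23
store i64 0, i64* %arr.sroa.0, align 16, !dbg !23
call void @llvm.dbg.value(metadata i64 0, metadata !15, metadata !DIExpression(DW_OP_LLVM_fragment, 64, 64)), !dbg !23
"""

for key, kinds in kinds_by_fragment(SROA_SNIPPET).items():
    print(key, sorted(kinds))
print({var: sorted(kinds) for var, kinds in kinds_by_variable(SROA_SNIPPET).items()})
```

Here fragment (0, 64) of !15 is described only by a declare and fragment (64, 64) only by a value, so whether the IR counts as "mixed" depends on whether the rule is stated per variable or per fragment.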

Sourabh_Singh_Tomar · April 6, 2020, 9:52am

This post continues the check of the validity of the debug-info IR emitted by different frontends.

At -O0 (no optimization), the flang frontend sometimes emits a DILocalVariable without a name, for example:

[…]
!15 = !DILocalVariable(scope: !5, file: !3, type: !8, flags: DIFlagArtificial)

This is valid IR, since the name field is optional during parsing and validation.

It eventually ends up in DWARF as:
0x0000006d: DW_TAG_variable
DW_AT_location (DW_OP_fbreg -12)
DW_AT_type (0x00000093 "integer")
DW_AT_artificial (0x01)

Is this information useful for a debugger? The variable doesn't have any name.

[…]

While looking for an answer I did some digging into clang/llvm, and initially it seems like it doesn't:
Snippet from DebugInfoMetadata.cpp
[…]
DILocalVariable *DILocalVariable::getImpl(LLVMContext &Context, Metadata *Scope,
…
assert(isCanonical(Name) && "Expected canonical MDString");

[…]

Is this functionality desired/needed, i.e. having the verifier validate anonymous DILocalVariable nodes?

Do this anonymous DILocalVariable and the resulting anonymous DW_TAG_variable in DWARF have some other use that I'm missing at this point?

Can anybody provide some clarity on this?

Thank You!
Sourabh.

adrian.prantl · April 6, 2020, 4:14pm

> This post continues the check of the validity of the debug-info IR emitted by different frontends.
>
> At -O0 (no optimization), the flang frontend sometimes emits a DILocalVariable without a name, for example:
>
> […]
> !15 = !DILocalVariable(scope: !5, file: !3, type: !8, flags: DIFlagArtificial)
>
> This is valid IR, since the name field is optional during parsing and validation.
>
> It eventually ends up in DWARF as:
> 0x0000006d: DW_TAG_variable
> DW_AT_location (DW_OP_fbreg -12)
> DW_AT_type (0x00000093 "integer")
> DW_AT_artificial (0x01)
>
> Is this information useful for a debugger? The variable doesn't have any name.
>
> […]

Since the variable is also marked as artificial, I am guessing that it might be a helper variable, for example, one that is used to specify dynamic array bounds in an array type. Clang does something similar for variable length arrays, though Clang gives these variables some generic name.

> While looking for an answer I did some digging into clang/llvm, and initially it seems like it doesn't:
> Snippet from DebugInfoMetadata.cpp
> […]
> DILocalVariable *DILocalVariable::getImpl(LLVMContext &Context, Metadata *Scope,
> …
> assert(isCanonical(Name) && "Expected canonical MDString");
>
> […]
>
> Is this functionality desired/needed, i.e. having the verifier validate anonymous DILocalVariable nodes?
>
> Do this anonymous DILocalVariable and the resulting anonymous DW_TAG_variable in DWARF have some other use that I'm missing at this point?
>
> Can anybody provide some clarity on this?

If this is indeed a helper variable referenced from a type, I think this behavior should be allowed.

– adrian
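
For anyone who wants to see how often a frontend emits such nodes, a quick way is to scan the textual IR for DILocalVariable nodes that lack a name: field. The Python below is a hedged sketch, not an LLVM tool: `unnamed_local_variables` is a hypothetical helper, it assumes one metadata node per line as printed in this thread, and the named node !9 in the sample is made up for contrast.

```python
import re

# One metadata node per line, e.g.
#   !15 = !DILocalVariable(scope: !5, file: !3, type: !8, flags: DIFlagArtificial)
LOCAL_VAR = re.compile(r"^(!\d+) = !DILocalVariable\((.*)\)\s*$", re.MULTILINE)


def unnamed_local_variables(ir_text):
    """Return (node id, is_artificial) for each DILocalVariable without a name: field."""
    found = []
    for node_id, fields in LOCAL_VAR.findall(ir_text):
        if re.search(r"\bname:", fields) is None:
            found.append((node_id, "DIFlagArtificial" in fields))
    return found


# !15 is the node quoted in this thread; !9 is a made-up named variable for contrast.
SAMPLE = """
!9 = !DILocalVariable(name: "foo", scope: !5, file: !3, line: 2, type: !8)
!15 = !DILocalVariable(scope: !5, file: !3, type: !8, flags: DIFlagArtificial)
"""

print(unnamed_local_variables(SAMPLE))  # [('!15', True)]
```

Reporting the artificial flag next to each hit makes it easy to separate likely helper variables from unnamed variables that are not marked artificial.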
