Question about debug information for global variables

Hi,

I'm trying to achieve the following:

- I have a global variable BaseAddress that holds the base address of
a contiguous dynamically allocated memory block.

- I have a number of logical variables of different types that are
mapped on certain address ranges inside the memory block pointed to by
BaseAddress. The offset and the size of each such logical variable
inside the memory block are known constants.

- I'd like to make it possible to access these logical variables
inside the debugger as if they were normal global variables.

My idea was to create the debug information for each of these logical
variables with DIBuilder::createGlobalVariableExpression (call the
result GVE) and to provide a DIExpression (call it DIE) that takes
the value of the global variable that GVE is attached to, i.e. the
value of BaseAddress, and adds the constant offset of the logical
variable. The result should be the address of the logical variable.

So, the DIExpression DIE would look something like:
DW_OP_deref, DW_OP_constu, offset, DW_OP_plus

But this does not work. I tried the variants with and without
DW_OP_deref, but I always get the same wrong result when I test with
LLDB. The offset is always added to the address of BaseAddress and not
to its value.
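To make the intended semantics concrete, here is a minimal sketch of a toy
stack evaluator for the three operators used above. It is not LLDB's or
GDB's implementation. The names evaluateLocation and Memory are
hypothetical, and the sketch assumes that the expression is evaluated with
the address of the global already on the stack (as if a DW_OP_addr had
been emitted in front of it).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <vector>

// Opcode values as defined by the DWARF standard.
constexpr uint64_t DW_OP_deref = 0x06;
constexpr uint64_t DW_OP_constu = 0x10;
constexpr uint64_t DW_OP_plus = 0x22;

// Toy memory (hypothetical): maps an address to the pointer-sized
// value stored at that address.
using Memory = std::map<uint64_t, uint64_t>;

// Evaluates a DIExpression-style operand list. The stack starts out
// holding the address of the global the expression is attached to
// (assumption: this models a leading DW_OP_addr).
uint64_t evaluateLocation(uint64_t globalAddr,
                          const std::vector<uint64_t> &ops,
                          const Memory &mem) {
  std::vector<uint64_t> stack{globalAddr};
  for (std::size_t i = 0; i < ops.size(); ++i) {
    switch (ops[i]) {
    case DW_OP_deref:
      // Replace the address on top of the stack with the value stored there.
      stack.back() = mem.at(stack.back());
      break;
    case DW_OP_constu:
      // Push the operand that follows the opcode.
      stack.push_back(ops.at(++i));
      break;
    case DW_OP_plus: {
      if (stack.size() < 2)
        throw std::invalid_argument("DW_OP_plus needs two stack entries");
      uint64_t rhs = stack.back();
      stack.pop_back();
      stack.back() += rhs;
      break;
    }
    default:
      throw std::invalid_argument("operator not handled by this sketch");
    }
  }
  return stack.back();
}
```

With a global at a made-up address 0x2eb0 that holds the made-up pointer
value 0x100000, evaluating {DW_OP_deref, DW_OP_constu, 2000, DW_OP_plus}
yields 0x100000 + 2000, while the variant without DW_OP_deref yields
0x2eb0 + 2000. The two variants should therefore only agree if the global
happens to point to itself.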

The code for creating logical variables looks roughly like:

    llvm::SmallVector<uint64_t, 4> ops;
    size_t offset = getOffset(logicalVariable);
    // Get the value of the global variable that contains a pointer to
    // the memory block.
    // NOTE: Even if DW_OP_deref is omitted, the results under LLDB
    // are the same.
    ops.push_back(llvm::dwarf::DW_OP_deref);
    // Add a constant offset to the value of the global variable.
    ops.push_back(llvm::dwarf::DW_OP_constu);
    ops.push_back(offset);
    ops.push_back(llvm::dwarf::DW_OP_plus);
    llvm::DIExpression *DIexpr{nullptr};
    auto *DIE = DIBuilder_->createExpression(ops);
    auto *GVE = DIBuilder_->createGlobalVariableExpression(
        cu, name, "", file, 0, type,
        /* isLocalToUnit */ false, DIE);
    // Add GVE as debug info to BaseAddress.
    baseAddress->addDebugInfo(GVE);

I guess I'm doing something wrong, but I cannot see what it is.

Interestingly enough, when I tried a similar approach, where
BaseAddress would be a function argument and logical variables would
be created by means of insertDeclare, everything seems to work just
fine under LLDB. I use the following code for this approach:

   llvm::SmallVector<uint64_t, 4> ops;
   llvm::DIExpression::appendOffset(ops, getOffset(logicalVariable));
   auto *DIE = DIBuilder_->createExpression(ops);
   builder.insertDeclare(baseAddress, var, DIE,
                                llvm::DebugLoc::get(lineNo, 0, currentScope),
                                insertionPoint);

Can anyone offer some insight into how to achieve the desired results
using global variables, and into what I'm doing wrong? Or maybe there
are simpler ways to achieve the same effect?

Thanks,
  Roman

Can you share the final expression as printed by llvm-dwarfdump?

Have you considered that this might be a bug in LLDB? You really shouldn't get the same result with and without the DW_OP_deref (unless you are pointing to a self-referential pointer). :-)

-- adrian

Adrian,

Thanks for a quick reply!

This is what llvm-dwarfdump prints when invoked with the command line:
llvm-dwarfdump -verify-debug-info -debug-dump=all myobjectfile.o

The DW_AT_location values are different:

With deref:
0x00000032: DW_TAG_variable [2]
                DW_AT_name [DW_FORM_strp] ( .debug_str[0x0000001d] = "var1")
                DW_AT_type [DW_FORM_ref4] (cu + 0x0049 => {0x00000049})
                DW_AT_external [DW_FORM_flag_present] (true)
                DW_AT_location [DW_FORM_exprloc] (<0xd> 03 b0 2e 00 00 00 00 00 00 06 10 00 22 )

Without deref:
0x00000032: DW_TAG_variable [2]
                DW_AT_name [DW_FORM_strp] ( .debug_str[0x0000001d] = "var1")
                DW_AT_type [DW_FORM_ref4] (cu + 0x0048 => {0x00000048})
                DW_AT_external [DW_FORM_flag_present] (true)
                DW_AT_location [DW_FORM_exprloc] (<0xc> 03 a0 2e 00 00 00 00 00 00 10 00 22 )

Unfortunately, the llvm-dwarfdump output is not very descriptive when
it comes to expressions.
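The raw bytes can still be decoded by hand. Below is a minimal sketch of a
decoder that handles only the four opcodes appearing in these dumps. The
function name decodeExpr is hypothetical, and the sketch assumes 8-byte
little-endian addresses, which matches the nine bytes that follow the
leading 03.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Decodes a DW_AT_location byte string into a readable list of operators.
// Only DW_OP_addr, DW_OP_deref, DW_OP_constu and DW_OP_plus are handled;
// anything else is printed as <unknown 0x..>.
std::string decodeExpr(const std::vector<uint8_t> &bytes) {
  std::ostringstream out;
  out << std::hex; // every number printed below is hexadecimal
  std::size_t i = 0;
  while (i < bytes.size()) {
    if (i != 0)
      out << ", ";
    uint8_t op = bytes[i++];
    switch (op) {
    case 0x03: { // DW_OP_addr: 8-byte little-endian address (assumption)
      uint64_t addr = 0;
      for (unsigned b = 0; b < 8; ++b)
        addr |= static_cast<uint64_t>(bytes.at(i + b)) << (8 * b);
      i += 8;
      out << "DW_OP_addr 0x" << addr;
      break;
    }
    case 0x06:
      out << "DW_OP_deref";
      break;
    case 0x10: { // DW_OP_constu: operand is encoded as ULEB128
      uint64_t value = 0;
      unsigned shift = 0;
      uint8_t byte = 0;
      do {
        byte = bytes.at(i++);
        if (shift < 64)
          value |= static_cast<uint64_t>(byte & 0x7f) << shift;
        shift += 7;
      } while (byte & 0x80);
      out << "DW_OP_constu 0x" << value;
      break;
    }
    case 0x22:
      out << "DW_OP_plus";
      break;
    default:
      out << "<unknown 0x" << static_cast<unsigned>(op) << ">";
      break;
    }
  }
  return out.str();
}
```

Feeding it the "with deref" bytes 03 b0 2e 00 00 00 00 00 00 06 10 00 22
gives DW_OP_addr 0x2eb0, DW_OP_deref, DW_OP_constu 0x0, DW_OP_plus. The
"without deref" bytes decode the same way, minus the DW_OP_deref and with
address 0x2ea0.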

In LLVM IR, it is more understandable:

With deref:
!28 = !DIExpression(DW_OP_deref, DW_OP_constu, 2000, DW_OP_plus)

Without deref:
!28 = !DIExpression(DW_OP_constu, 2000, DW_OP_plus)

Yes, I was wondering if it could be an LLDB bug, because I'd also
expect different results.

That being said, does the overall approach I outlined seem correct?
I.e. should it be possible in principle to express the logical global
variables using the approach I outlined?

-Roman

You might want to use an llvm-dwarfdump built from current llvm.org trunk; it can now decode expressions. Actually, would you mind doing that? I'd be curious as to what the first couple of bytes decode to.

The general approach seems fine to me. It should be fairly straightforward to debug this in LLDB's DWARF expression handler.

-- adrian

With deref:
0x00000032: DW_TAG_variable
                DW_AT_name ("var1")
                DW_AT_type (cu + 0x0049 "int")
                DW_AT_external (true)
                DW_AT_location (DW_OP_addr 0x2eb0, DW_OP_deref, DW_OP_constu 0x0, DW_OP_plus)

Without deref:
0x00000032: DW_TAG_variable
                DW_AT_name ("var1")
                DW_AT_type (cu + 0x0048 "int")
                DW_AT_external (true)
                DW_AT_location (DW_OP_addr 0x2ea0, DW_OP_constu 0x0, DW_OP_plus)

I just tried with GDB, and GDB works fine with the DW_OP_deref
version. So it is definitely a bug in LLDB.

-Roman

Can you report the bug with a test case? Possibly reduced?

Thanks,

Sure. 36871 – Wrong handling of DW_OP_deref with global variables

-Roman