[debug-info] Stack pointer based variable locations

Hello llvm-dev,

I’ve noticed some behaviour I found surprising with the way that we emit stack pointer relative
variable locations. It seems that locations defined by DBG_VALUEs that are written in terms of RSP
(for x86) are terminated by any stack manipulation operations, e.g. pushing arguments before a
call. Since we know the stack offset at each adjustment it seems like we could maintain the variable
location by generating location list entries with adjusted RSP offsets.

Here’s a source reproducer, using clang built at 71597d40e878 (recent) and targeting
x86_64-unknown-linux-gnu.

$ cat test.cpp
void ext(int, int, int, int, int, int, int, int, int, int);
void escape(int*);
int example() {
  int local = 0;
  escape(&local);
  ext(0, 1, 2, 3, 4, 5, 6, 7, 8, 9);
  local += 2;
  return local;
}

$ clang -O2 -g -c test.cpp -o test.o
$ llvm-objdump -d test.o

test.o: file format elf64-x86-64

Disassembly of section .text:

0000000000000000 <_Z7examplev>:
       0: 50                            pushq   %rax
       1: c7 44 24 04 00 00 00 00       movl    $0, 4(%rsp)
       9: 48 8d 7c 24 04                leaq    4(%rsp), %rdi
       e: e8 00 00 00 00                callq   0x13 <_Z7examplev+0x13>
      13: 31 ff                         xorl    %edi, %edi
      15: be 01 00 00 00                movl    $1, %esi
      1a: ba 02 00 00 00                movl    $2, %edx
      1f: b9 03 00 00 00                movl    $3, %ecx
      24: 41 b8 04 00 00 00             movl    $4, %r8d
      2a: 41 b9 05 00 00 00             movl    $5, %r9d
      30: 6a 09                         pushq   $9
      32: 6a 08                         pushq   $8
      34: 6a 07                         pushq   $7
      36: 6a 06                         pushq   $6
      38: e8 00 00 00 00                callq   0x3d <_Z7examplev+0x3d>
      3d: 48 83 c4 20                   addq    $32, %rsp
      41: 8b 44 24 04                   movl    4(%rsp), %eax
      45: 83 c0 02                      addl    $2, %eax
      48: 59                            popq    %rcx
      49: c3                            retq

$ llvm-dwarfdump test.o --name local
test.o: file format elf64-x86-64

0x00000047: DW_TAG_variable
              DW_AT_location (0x00000000:
                 [0x0000000000000001, 0x0000000000000009): DW_OP_consts +0, DW_OP_stack_value
                 [0x0000000000000009, 0x0000000000000032): DW_OP_breg7 RSP+4
                 [0x0000000000000045, 0x000000000000004a): DW_OP_reg0 RAX)
              DW_AT_name ("local")
              DW_AT_decl_file ("/home/och/dev/bugs/scratch/test.cpp")
              DW_AT_decl_line (4)
              DW_AT_type (0x000000ad "int")

The variable ‘local’ is not given a location over the interval [0x32, 0x45) even though we
know where it is (RSP+12, RSP+20, …, back to RSP+4 after the stack adjustment following the
call). It seems unfortunate to lose variable locations in this way, especially around call sites. Is
this a deliberate omission, perhaps made in order to save space? Jeremy mentioned that we do
something similar in prologues/epilogues to avoid generating large location lists.
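
As a rough illustration of the "adjusted RSP offsets" idea, here is a minimal sketch of how one RSP-relative range could be split at each stack adjustment. The SpAdjust and LocRange structs and the rebase and example_ranges functions are hypothetical names invented for this example; they are not LLVM APIs, and the addresses are simply read off the disassembly above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model, not an LLVM data structure: a point where RSP changes.
struct SpAdjust {
  uint64_t addr;  // first address at which the new RSP value is in effect
  int64_t delta;  // change applied to RSP in bytes (negative for a push)
};

// One location list entry: the variable is at RSP + offset over [begin, end).
struct LocRange {
  uint64_t begin;
  uint64_t end;
  int64_t offset;
};

// Splits [begin, end) at each stack adjustment and rebases the offset so that
// RSP + offset keeps naming the same stack slot. `adjusts` must be sorted by
// address.
std::vector<LocRange> rebase(uint64_t begin, uint64_t end, int64_t offset,
                             const std::vector<SpAdjust> &adjusts) {
  assert(begin < end && "range must be non-empty");
  std::vector<LocRange> out;
  uint64_t cur = begin;
  for (const SpAdjust &a : adjusts) {
    if (a.addr <= cur || a.addr >= end)
      continue;  // adjustment is outside the remaining range
    out.push_back({cur, a.addr, offset});
    offset -= a.delta;  // RSP moved by delta, so the slot moves the other way
    cur = a.addr;
  }
  out.push_back({cur, end, offset});
  return out;
}

// The 'local' slot from the disassembly above: RSP+4 from 0x9 until the value
// moves to RAX at 0x45.
std::vector<LocRange> example_ranges() {
  const std::vector<SpAdjust> adjusts = {
      {0x32, -8}, {0x34, -8}, {0x36, -8}, {0x38, -8},  // four pushq
      {0x41, +32},                                     // addq $32, %rsp
  };
  return rebase(0x9, 0x45, 4, adjusts);
}
```

Under these assumptions example_ranges() yields six entries: RSP+4 over [0x9, 0x32), then RSP+12, RSP+20, RSP+28, RSP+36 over [0x38, 0x41), and RSP+4 again over [0x41, 0x45). The entry starting at 0x38 covers the return address 0x3d of the call to ext.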

Many thanks,
Orlando

In functions without a frame pointer, emitting a one-instruction location range at every push/pop (for a potentially large set of stack-homed local variables) does seem like it would take up a lot of space for not much real-world benefit.

On the other hand, having a location range that covers the actual call instruction (or I suppose, more precisely, its return address) would make the locals available to the user when the debugger is stopped in the callee, and that seems very valuable.

Wondering what other people think.

--paulr

Re: prologue: Locations aren't valid in the prologue, so for instance
we can give a simple location description of "stack offset 4" for a
parameter at -O0, despite the fact that the parameter isn't at stack
offset 4 until after we run the prologue that takes the ABI register
and stores it into that stack offset. That's handy - because otherwise
every parameter would have to use location lists at -O0, which would
be a lot of space to spend.

Looks like GCC manages to use DW_OP_fbreg for the example given, which
looks like it works/is correct despite the pushes/pops (because
it's rbp, I guess - base pointer rather than stack pointer). Perhaps
we could do something like that too, in cases like this?

Hello,
If that's really so, then, if I could, I'd vote for making full debug info the default and perhaps leaving the optimized path as an option, if that matters to anyone. It would help in tracking locals while debugging.

Best regards,
Pawel Kunio

On Thu, 6 May 2021 at 19:51, via llvm-dev <llvm-dev@lists.llvm.org> wrote:

Hi,

David said:

Re: prologue: Locations aren't valid in the prologue, so for instance
we can give a simple location description of "stack offset 4" for a
parameter at -O0, despite the fact that the parameter isn't at stack
offset 4 until after we run the prologue that takes the ABI register
and stores it into that stack offset. That's handy - because otherwise
every parameter would have to use location lists at -O0, which would
be a lot of space to spend.

That makes sense, thanks for the info.

David said:

Looks like GCC manages to use DW_OP_fbreg for the example given, which
looks like it works/is correct despite the pushes/pops (because
it's rbp, I guess - base pointer rather than stack pointer). Perhaps
we could do something like that too, in cases like this?

RBP isn't used as a frame pointer in 'example' when building with gcc (7.5.0) or clang
(71597d40e878). It looks like gcc is able to use DW_OP_fbreg because it sets DW_AT_frame_base in the
parent DIE to DW_OP_call_frame_cfa. Is there anything stopping us from doing the same?
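
To make the frame-base idea concrete, here is a small sketch that checks, for the clang disassembly earlier in the thread, that the slot holding 'local' sits at a fixed offset from the CFA even while its RSP-relative offset changes. The FramePoint struct and the functions below are hypothetical and exist only for this illustration. The sketch assumes the usual x86-64 convention that the CFA is the value RSP had just before the call into the function, so RSP equals CFA - 8 on entry.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of one program point in 'example'.
struct FramePoint {
  uint64_t addr;           // instruction address
  int64_t rsp_minus_cfa;   // RSP - CFA when execution reaches addr
  int64_t local_from_rsp;  // offset of 'local' from RSP at addr
};

// Offset of 'local' from the CFA, i.e. the operand a DW_OP_fbreg would need
// if DW_AT_frame_base were DW_OP_call_frame_cfa.
int64_t local_from_cfa(const FramePoint &p) {
  return p.rsp_minus_cfa + p.local_from_rsp;
}

// Values read off the disassembly (assumption: RSP == CFA - 8 on entry).
std::vector<FramePoint> example_points() {
  const std::vector<FramePoint> points = {
      {0x09, -16, 4},   // after pushq %rax
      {0x32, -24, 12},  // after pushq $9
      {0x34, -32, 20},  // after pushq $8
      {0x36, -40, 28},  // after pushq $7
      {0x38, -48, 36},  // after pushq $6
      {0x41, -16, 4},   // after addq $32, %rsp
  };
  return points;
}

// Returns the single CFA-relative offset shared by every point, asserting
// that the points really do agree.
int64_t fbreg_operand(const std::vector<FramePoint> &points) {
  assert(!points.empty());
  const int64_t first = local_from_cfa(points.front());
  for (const FramePoint &p : points)
    assert(local_from_cfa(p) == first && "slot must be fixed relative to CFA");
  return first;
}
```

With these inputs fbreg_operand(example_points()) returns -12 at every point, so a single CFA-relative description could cover the whole interval that currently has no location. Whether the emitted DWARF would use exactly that operand is not something this sketch establishes.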

Many thanks,
Orlando

David said:

Orlando said:

David said:
> Looks like GCC manages to use DW_OP_fbreg for the example given, which
> looks like it works/is correct despite the pushes/pops (because
> it's rbp, I guess - base pointer rather than stack pointer). Perhaps
> we could do something like that too, in cases like this?

RBP isn't used as a frame pointer in 'example' when building with gcc (7.5.0) or clang
(71597d40e878). It looks like gcc is able to use DW_OP_fbreg because it sets DW_AT_frame_base in the
parent DIE to DW_OP_call_frame_cfa. Is there anything stopping us from doing the same?

If there is debug info, there is a .debug_frame/.eh_frame, so we should
be able to use that. The frame section already accounts for stack pointer
adjustments so it should Just Work. Give it a go!

Pawel said:

If that's really so, then, if I could, I'd vote for making full debug info
the default and perhaps leaving the optimized path as an option, if that
matters to anyone. It would help in tracking locals while debugging.

Improving debug info for optimized code is an explicit goal, at least
for Sony, and I believe for others. For any code with real-time
requirements (which includes video games, embedded software, and
others), it is frequently impractical to compile unoptimized and still
meet those requirements. Obviously, once you have hit a breakpoint you
have left the real-time world, but it's commonly the case that you
have to meet real-time requirements until you get to that breakpoint.

--paulr

That sounds like the right solution. The variable location of a stack object should be based on something that does not change throughout the function.

LLVM’s behavior was probably less of an issue before we started doing call frame optimization for x86 (X86CallFrameOptimization), meaning converting stores to pushes. Before that, LLVM would mostly only emit SP adjustments when a frame pointer was in use, so the variable location would be based on RBP.

Thanks all. I've written up a ticket for this at https://llvm.org/PR50285 and plan on looking into it later in the week.

-Orlando