Line info for call expressions

Currently the following example:

1 {
2   foo(
3     bar(),
4     baz());
5 }

will get codegen’d into something like (in pseudo-IR):

%arg0 = call @bar(), !dbg(line: 3)
%arg1 = call @baz(), !dbg(line: 4)
call @foo(%arg0, %arg1), !dbg(line: 2)

which leads to a weird debugging experience. When developers set a breakpoint at line 2, they don’t generally expect lines 3 and 4 to have already executed. One solution would be to associate call expressions with the location of the closing parenthesis instead, so that a breakpoint on line 2 would fall through to line 3 and single-stepping would behave as expected.

The downside would be that this might break some existing debuggers’ stepping expectations.

What do you think?

-- adrian

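This generalizes over any line table entry - currently we associate
the instructions for any operation with the preferred source location
of that operation. This means something like:

x()
+
y()

goes 1, 3, 2 - similar oddity for the user, but it's about the most
accurate thing we can do, if a bit surprising.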

I really like the improvements in the line tables and think they are really great. And from a debugger engineer perspective they make total sense to me.

I am more worried about what issues users will run into as they debug and wonder why things are how they are.

The bug that we ran into that spawned this conversation was code like this:

22: printf("var1 = %i, var2 = %i",
23: var1,
24: var2);

If you set a breakpoint on line 22 it will stop _after_ var1 and var2 have been loaded into registers for the function call to printf and if you say:

(lldb) expr var1 = 123
(lldb) next

And step over the printf, you will get the old "var1" value. This is partly because the const C string doesn't end up believing it has any code associated with it (the PC relative load of the C string into a register) and it gets attributed to the previous source line -- which seems like a bug we would love to see fixed BTW -- so the line table entries look like:

0x09f0: main.c:15 // code for the loading of "var1 = %i, var2 = %i" is incorrectly associated with the previous source line
0x1000: main.c:23 // var1 into register
0x1010: main.c:24 // var2 into register
0x1020: main.c:22 // call to printf

> Currently the following example:
>
> 1 {
> 2   foo(
> 3     bar(),
> 4     baz());
> 5 }
>
> will get codegen’d into something like (in pseudo-IR):
>
> %arg0 = call @bar(), !dbg(line: 3)
> %arg1 = call @baz(), !dbg(line: 4)
> call @foo(%arg0, %arg1), !dbg(line: 2)
>
> which leads to a weird debugging experience. When developers set a
> breakpoint at line 2, they don’t generally expect lines 3 and 4 to already
> have executed. It sounds like a solution would be to associate call
> expressions with the location of the closing parenthesis instead, so the
> breakpoint could fall through to line 3 and single-stepping behaves as
> expected.
>
> The downside would be that this might break some existing debuggers’
> stepping expectations.
>
> What do you think?
>
> This generalizes over any line table entry - currently we associate
> the instructions for any operation with the preferred source location
> of that operation. This means something like:
>
> x()
> +
> y()
>
> goes 1, 3, 2 - similar oddity for the user, but it's about the most
> accurate thing we can do, if a bit surprising.
>
> Using the end is probably /fairly/ reliable, but also confusing:
>
> x
> +
> y
> +
> z
>
> if + is overloaded and you're in a backtrace which points to line 3 or 5
> - you have to figure out the precedence to know which '+' operator call you
> were actually in. By using the preferred location we get line 2 or 4 and
> it's obvious which plus the program is executing.
>
> Chandler & I talked about this (offline, unfortunately) when I was
> making major improvements to locations a few months ago & settled on
> preferred location being the least ambiguous thing for users. Maybe there
> are other perspectives we hadn't considered, though.

> I really like the improvements in the line tables and think they are
> really great. And from a debugger engineer perspective they make total
> sense to me.
>
> I am more worried about what issues users will run into as they debug and
> wonder why things are how they are.
>
> The bug that we ran into that spawned this conversation was code like this:
>
> 22: printf("var1 = %i, var2 = %i",
> 23: var1,
> 24: var2);
>
> If you set a breakpoint on line 22 it will stop _after_ var1 and var2 have
> been loaded into registers for the function call to printf and if you say:
>
> (lldb) expr var1 = 123
> (lldb) next
>
> And step over the printf, you will get the old "var1" value.

Yep - pretty much all options have oddities that're going to confuse users.
Though it's useful to have the examples, like this, in all directions.

> This is partly because the const C string doesn't end up believing it has
> any code associated with it (the PC relative load of the C string into a
> register) and it gets attributed to the previous source line -- which seems
> like a bug we would love to see fixed BTW

I assume that's the constant stuff we have, which is annoying &
problematic, yes - essentially loads of constants get coalesced at the
start of the basic block they're referenced in, instead of attributed to
(at least) the first place they're used.

> -- so the line table entries look like:
>
> 0x09f0: main.c:15 // code for the loading of "var1 = %i, var2 = %i" is incorrectly associated with the previous source line
> 0x1000: main.c:23 // var1 into register
> 0x1010: main.c:24 // var2 into register
> 0x1020: main.c:22 // call to printf

*nod*