-gcolumn-info and PR 14106

For -Rpass, and other related uses, I am looking at enabling column info by default. David pointed me at PR 14106, which seems to be the original motivation for introducing -gcolumn-info. However, I am finding no differences when using it on this test. I’ve tried building with/without -gcolumn-info and found almost no difference in compile time (+0.4%):

$ /usr/bin/time clang -w -fno-builtin -O2 -g -gcolumn-info test-tgmath2.i
474.38user 2.10system 7:58.00elapsed 99%CPU

$ /usr/bin/time clang -w -fno-builtin -O2 -g test-tgmath2.i
472.63user 2.02system 7:56.11elapsed 99%CPU

I’m running clang from trunk @211693.

The total size of all debug sections (according to readelf) is:

  • with -g -gcolumn-info: 836,177 bytes
  • with -g: 826,552 bytes

That’s a growth of about 1% in debug info size.
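
For reference, the two percentages quoted above can be re-derived from the raw numbers. This is only a small sketch; the helper name `growth_pct` is made up for illustration.

```python
def growth_pct(baseline: float, new: float) -> float:
    """Relative growth of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0

# User CPU time in seconds, taken from the two /usr/bin/time runs above.
compile_time = growth_pct(472.63, 474.38)

# Total size of the debug sections in bytes, as listed above.
debug_size = growth_pct(826_552, 836_177)

print(f"compile time:    {compile_time:+.1f}%")
print(f"debug info size: {debug_size:+.1f}%")
```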

These numbers are in line with a comparative build I did of our internal codebase. The build included a massive number of C and C++ files. For C files, total file size grows by 1% on average. For C++ files the average growth is around 0.2%. Build times are unchanged as well.

Does anyone remember any other edge case I may want to try? It seems to me that these differences are not really worth the effort of having a flag controlling column information.

Thanks. Diego.

I remember it being larger last I'd looked, but I know I didn't have
quite the range that you just tested. I'm all for turning it on -
especially if it doesn't really cost us anything.

Adrian: any issues on your end before I turn it on by default and
remove the option? This will also obviate a few of the patches that
you had to distinguish call sites as well I think.

-eric

The main motivation for turning it off is that no known consumer (debugger) took advantage of it.

Turning it on does more than slightly increase the object file size: it can cause the same source line to be listed multiple times in the .debug_line table (with different column numbers). This can be confusing to debuggers that ignore column info, or to users who expect “set breakpoint on line 12” to set one breakpoint when it actually sets 4. “How come nothing happens when I hit Continue?”

Or it can affect single-stepping, which might stop at the next “is-statement” entry, even if the line hasn’t changed. “How come I have to hit Step 4 times to get it to go to the next line?”
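
To make the concern concrete, here is a toy model of a line table, not real DWARF parsing. The rows, addresses and columns are invented, and `without_columns` rests on the simplifying assumption that a table built without column info only gets a new row when the line number changes. The "naive" consumer ignores columns and places a breakpoint on every is-statement row for the requested line.

```python
from typing import NamedTuple

class Row(NamedTuple):
    address: int
    line: int
    column: int
    is_stmt: bool

# Hypothetical rows for one statement on line 12 that contains several
# sub-expressions, followed by the start of line 13.
WITH_COLUMNS = [
    Row(0x10, 12, 7, True),
    Row(0x18, 12, 14, True),
    Row(0x20, 12, 21, True),
    Row(0x28, 12, 5, True),
    Row(0x30, 13, 3, True),
]

def without_columns(rows):
    """Model a table without column info: keep a row only when the line changes."""
    out = []
    for row in rows:
        if not out or out[-1].line != row.line:
            out.append(row._replace(column=0))
    return out

def naive_breakpoints(rows, line):
    """A consumer that ignores columns and breaks on every is_stmt row for `line`."""
    return [r.address for r in rows if r.line == line and r.is_stmt]

print(len(naive_breakpoints(WITH_COLUMNS, 12)))                   # 4
print(len(naive_breakpoints(without_columns(WITH_COLUMNS), 12)))  # 1
```

A consumer that wants one breakpoint per line could, for example, keep only the lowest address among consecutive rows that share a line.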

For -Rpass and related uses, it might be useful to distinguish between tracking column numbers and emitting column numbers. IIUC -Rpass wants column info tracked during compilation so it can show the things it wants to show with maximum relevance. Whether those column numbers actually make it into the .debug_line section is a different story.

–paulr

I remember it being larger last I'd looked, but I know I didn't have
quite the range that you just tested. I'm all for turning it on -
especially if it doesn't really cost us anything.

Adrian: any issues on your end before I turn it on by default and
remove the option? This will also obviate a few of the patches that
you had to distinguish call sites as well I think.

No, please do! I’ve been wanting to enable it by default for a long time. We can also get rid of all uses of forceColumnInfo this way.

-- adrian

The main motivation for turning it off is that no known consumer (debugger)
took advantage of it.

There are lots of non-debugger consumers of debug info, and now we've
got cases where we actively want it for those.

Turning it on does more than slightly increase the object file size, it can
cause the same source line to be listed multiple times in the .debug_line
table (with different column numbers). This can be confusing to debuggers
that ignore column info, or possibly to the users who expect “set breakpoint
on line 12” to set one breakpoint and it actually sets 4. “How come nothing
happens when I hit Continue?”

Honestly IMO this is a "please fix your debugger, it has bugs" issue.

-eric

The main motivation for turning it off is that no known consumer (debugger)
took advantage of it.

Turning it on does more than slightly increase the object file size, it can
cause the same source line to be listed multiple times in the .debug_line
table (with different column numbers). This can be confusing to debuggers
that ignore column info, or possibly to the users who expect “set breakpoint
on line 12” to set one breakpoint and it actually sets 4. “How come nothing
happens when I hit Continue?”

Yes. David is testing the gdb testsuite to see whether that's a real
problem. If that's a problem, it should show there (since GCC does not
emit column info in dwarf, AFAIR).

In any case, that would be a debugger issue. Not compiler.

For -Rpass and related uses, it might be useful to distinguish between
_tracking_ column numbers and _emitting_ column numbers. IIUC -Rpass wants
column info tracked during compilation so it can show the things it wants to
show with maximum relevance. Whether those column numbers actually make it
into the .debug_line section is a different story.

That's already done. -Rpass now enables a special loc tracking mode
that causes no dwarf generation. The issue is the combination of
-Rpass -g.

With -Rpass alone, turning on column info is fine (since no debug
output will be generated). However, -Rpass -g would be penalized since
no column info would be shown in that case.

Diego.

> Turning it on does more than slightly increase the object file size, it can
> cause the same source line to be listed multiple times in the .debug_line
> table (with different column numbers). This can be confusing to debuggers
> that ignore column info, or possibly to the users who expect “set breakpoint
> on line 12” to set one breakpoint and it actually sets 4. “How come nothing
> happens when I hit Continue?”

Honestly IMO this is a "please fix your debugger, it has bugs" issue.

-eric

What the committee calls "a quality of implementation issue" :-)
No real argument from me. I've had that problem at previous jobs tho.
We currently assume there's no column info but I expect we can learn
to cope. People do get twitchy about size, but if it's really in the
1% range that's okay.
--paulr

What the committee calls "a quality of implementation issue" :-)

:-)

No real argument from me. I've had that problem at previous jobs tho.
We currently assume there's no column info but I expect we can learn
to cope. People do get twitchy about size, but if it's really in the
1% range that's okay.

Yeah. I was most worried about compile time and size, but it seems to
not be a problem.

-eric

The main motivation for turning it off is that no known consumer (debugger)
took advantage of it.

Turning it on does more than slightly increase the object file size, it can
cause the same source line to be listed multiple times in the .debug_line
table (with different column numbers). This can be confusing to debuggers
that ignore column info, or possibly to the users who expect “set breakpoint
on line 12” to set one breakpoint and it actually sets 4. “How come nothing
happens when I hit Continue?”

Yes. David is testing the gdb testsuite to see whether that's a real
problem. If that's a problem, it should show there (since GCC does not
emit column info in dwarf, AFAIR).

FWIW, a basic run seems to show a handful (so not a pervasive issue)
of new failures:

FAIL: gdb.base/skip.exp: step after disabling 3 (3)
FAIL: gdb.reverse/step-precsave.exp: reverse step out of called fn
FAIL: gdb.reverse/step-precsave.exp: reverse next over call
FAIL: gdb.reverse/step-precsave.exp: reverse step test 1
FAIL: gdb.reverse/step-precsave.exp: reverse next test 1
FAIL: gdb.reverse/step-precsave.exp: reverse next test 2
FAIL: gdb.reverse/step-reverse.exp: reverse step out of called fn
FAIL: gdb.reverse/step-reverse.exp: reverse next over call
FAIL: gdb.reverse/step-reverse.exp: reverse step test 1
FAIL: gdb.reverse/step-reverse.exp: reverse next test 1
FAIL: gdb.reverse/step-reverse.exp: reverse next test 2

I haven't looked at why they're failing (if you'd like to reproduce
them & look into it, I can give you some pointers), though FWIW the
reverse debugging scenarios are usually a bit incompatible with clang
due to where clang likes to put the trailing breakpoint in a function
(return statement versus close brace).

What about the difference between -gmlt and -gmlt + -gcolumn-info?

Debug info growth is larger in proportion (3%). Total binary size is
roughly the same (~1%). Compile times are identical (7m47s for both).

- with -gmlt -gcolumn-info: 319,792 bytes
- with -gmlt: 310,243 bytes
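
For anyone wanting to repeat the size comparison, here is a rough sketch (not necessarily how the totals above were gathered) that sums the `.debug_*` section sizes from the text printed by `readelf -S -W`. The function name and the sample rows are made up for illustration; the only thing assumed about readelf is the column order of its wide section listing.

```python
import re

# One row of `readelf -S -W` output has the columns
#   [Nr] Name  Type  Address  Off  Size  ES Flg Lk Inf Al
# with Address, Off and Size printed in hexadecimal.
SECTION_ROW = re.compile(
    r"^\s*\[\s*\d+\]\s+(?P<name>\S+)\s+\S+\s+"
    r"[0-9a-fA-F]+\s+[0-9a-fA-F]+\s+(?P<size>[0-9a-fA-F]+)"
)

def debug_section_bytes(readelf_output: str) -> int:
    """Sum the sizes of all sections whose name starts with .debug_."""
    total = 0
    for line in readelf_output.splitlines():
        m = SECTION_ROW.match(line)
        if m and m.group("name").startswith(".debug_"):
            total += int(m.group("size"), 16)
    return total

# Invented sample rows, only to exercise the parser.
SAMPLE = """\
  [ 1] .text             PROGBITS        0000000000000000 000040 000120 00  AX  0   0 16
  [ 2] .debug_info       PROGBITS        0000000000000000 000160 000100 00      0   0  1
  [ 3] .debug_line       PROGBITS        0000000000000000 000260 000080 00      0   0  1
"""

print(debug_section_bytes(SAMPLE))  # 0x100 + 0x80 = 384
```

Running it on the output for each build and comparing the two totals gives the kind of growth figure quoted above.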

Diego.

Looks good. I think we can live with that increase. I'm certainly not opposed
to enabling -gcolumn-info by default. For example, adding it will automatically
give us column numbers in stack traces in sanitizer error reports.

Interesting. I am not seeing these new failures in gdb trunk as of
last Fri. With or without -gcolumn-info, I get the same set of passes
and failures in gdb.reverse:

PASS: gdb.reverse/step-precsave.exp: Turn on process record
PASS: gdb.reverse/step-precsave.exp: BP at end of main
PASS: gdb.reverse/step-precsave.exp: run to end of main
PASS: gdb.reverse/step-precsave.exp: save process recfile
PASS: gdb.reverse/step-precsave.exp: Kill process, prepare to debug log file
PASS: gdb.reverse/step-precsave.exp: reload core file
PASS: gdb.reverse/step-precsave.exp: next test 1
PASS: gdb.reverse/step-precsave.exp: step test 1
PASS: gdb.reverse/step-precsave.exp: next test 2
PASS: gdb.reverse/step-precsave.exp: step test 2
PASS: gdb.reverse/step-precsave.exp: step up to call
PASS: gdb.reverse/step-precsave.exp: next over call
PASS: gdb.reverse/step-precsave.exp: step into call
PASS: gdb.reverse/step-precsave.exp: finish out of fn call
PASS: gdb.reverse/step-precsave.exp: simple stepi
PASS: gdb.reverse/step-precsave.exp: stepi into function call
PASS: gdb.reverse/step-precsave.exp: stepi back from function call
PASS: gdb.reverse/step-precsave.exp: set reverse execution
PASS: gdb.reverse/step-precsave.exp: reverse stepi thru function return
FAIL: gdb.reverse/step-precsave.exp: reverse stepi from a function call (start statement)
FAIL: gdb.reverse/step-precsave.exp: simple reverse stepi
PASS: gdb.reverse/step-precsave.exp: reverse step into fn call
FAIL: gdb.reverse/step-precsave.exp: reverse step out of called fn
FAIL: gdb.reverse/step-precsave.exp: reverse next over call
FAIL: gdb.reverse/step-precsave.exp: reverse step test 1
FAIL: gdb.reverse/step-precsave.exp: reverse next test 1
FAIL: gdb.reverse/step-precsave.exp: reverse step test 2
FAIL: gdb.reverse/step-precsave.exp: reverse next test 2

Similarly, with gdb.base. Both variants pass and fail the same tests.
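
A comparison like this can be automated by diffing the PASS/FAIL lines of the two runs. The sketch below assumes summary lines of the form shown above (`PASS: <test>` or `FAIL: <test>`); the function names are hypothetical and the two inputs are short excerpts standing in for the full result files.

```python
def parse_results(log: str) -> dict[str, str]:
    """Map each test name to PASS or FAIL; any other line is ignored."""
    results = {}
    for line in log.splitlines():
        status, sep, name = line.partition(": ")
        if sep and status in ("PASS", "FAIL"):
            results[name.strip()] = status
    return results

def new_failures(baseline: str, candidate: str) -> list[str]:
    """Tests that fail in `candidate` but did not fail in `baseline`."""
    base = parse_results(baseline)
    cand = parse_results(candidate)
    return sorted(
        name for name, status in cand.items()
        if status == "FAIL" and base.get(name) != "FAIL"
    )

# Excerpts standing in for the runs without and with -gcolumn-info.
without_col = """\
PASS: gdb.reverse/step-precsave.exp: next test 1
FAIL: gdb.reverse/step-precsave.exp: simple reverse stepi
"""
with_col = """\
PASS: gdb.reverse/step-precsave.exp: next test 1
FAIL: gdb.reverse/step-precsave.exp: simple reverse stepi
"""

print(new_failures(without_col, with_col))  # [] means no new failures
```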

Now, there is a boatload of gdb failures with Clang when compared with
GCC. But I suppose those are part of our baseline? (attached)

Should I prepare a patch to make -gcolumn-info the default, or do you
prefer to take over?

Thanks. Diego.

00clang-gdb-failures.txt (43.4 KB)