Debug info interacting with optimization and code generation

In theory, the compiler should generate bit-identical code with and without debug info. That is:

clang -c -O2 -g a.cc -o a.g.o

clang -c -O2 -g0 a.cc -o a.g0.o

strip a.g.o a.g0.o

diff a.g.o a.g0.o

The diff should find the two binaries identical. For brevity, in the rest of this mail I’ll refer to this requirement as “codegen consistency” (any better name?).
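A minimal sketch of the same check wrapped in a shell function, in case it is useful for scripting. The function name and the CC and STRIP override variables are assumptions for illustration, not something from the mail or from LLVM; cmp stands in for diff because only "identical or not" matters here.

```shell
# Sketch: check one source file for codegen consistency.
# CC and STRIP are hypothetical override variables; they default to the
# clang and strip commands used in the mail.
check_consistency() {
    src=$1
    cc=${CC:-clang}
    strip_tool=${STRIP:-strip}
    tmp=$(mktemp -d) || return 2
    status=0

    # Same flags for both builds; the only difference is -g versus -g0.
    "$cc" -c -O2 -g  "$src" -o "$tmp/a.g.o"  || status=2
    "$cc" -c -O2 -g0 "$src" -o "$tmp/a.g0.o" || status=2

    if [ "$status" -eq 0 ]; then
        # Remove symbols and debug sections before comparing.
        "$strip_tool" "$tmp/a.g.o" "$tmp/a.g0.o" || status=2
    fi
    # Byte-for-byte comparison of the two stripped objects.
    if [ "$status" -eq 0 ] && ! cmp -s "$tmp/a.g.o" "$tmp/a.g0.o"; then
        status=1
    fi

    rm -rf "$tmp"
    return "$status"
}
```

Calling `check_consistency a.cc` returns 0 when the stripped objects are identical, 1 when they differ, and 2 when one of the tools failed.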

Unfortunately, LLVM does not guarantee codegen consistency. Recently, I’ve spent quite some time trying to fix related issues (e.g. https://reviews.llvm.org/D25286 and https://reviews.llvm.org/D25098). The most recent issue I’m looking at is that during isel, the IROrder is used by both debug info and the actual codegen, which is relatively harder to fix.

I initially thought that it was just a couple of careless bugs to fix, but it looks like there are many more issues than I expected. So I’m calling on the community for help:

  • Is there anyone else who also cares about codegen consistency?
  • Any volunteers to help fix codegen consistency issues? (It is easy to find issues: just build speccpu with -g and -g0, then compare the “objdump -d” output.)
  • How do we set up a regression test to ensure future changes do not break codegen consistency?
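The “objdump -d” comparison from the second bullet could be sketched roughly as follows. The function names and the OBJDUMP override variable are assumptions for illustration. The one non-obvious step is dropping objdump's "file format" header line, which contains the object file name and would otherwise make every pair look different.

```shell
# Sketch: compare the disassembly of two object files.
# OBJDUMP is a hypothetical override variable; it defaults to objdump.
disasm() {
    # The header line names the input file, so delete it before comparing.
    "${OBJDUMP:-objdump}" -d "$1" | sed '/file format/d'
}

compare_disasm() {
    out_a=$(mktemp) && out_b=$(mktemp) || return 2
    disasm "$1" > "$out_a"
    disasm "$2" > "$out_b"
    # diff prints the differing instructions and exits non-zero on mismatch.
    diff -u "$out_a" "$out_b"
    status=$?
    rm -f "$out_a" "$out_b"
    return "$status"
}
```

For example, `compare_disasm a.g.o a.g0.o` prints nothing and returns 0 when the two disassemblies match.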

Any comments?

Thanks,
Dehao

Specific test cases would be checked in as usual. Beyond that, probably a self-host build that checks for consistency (much as a 3-stage bootstrap checks that stages 2 and 3 are identical). Other workloads could be added if a self-host didn’t offer enough certainty for common cases.
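A rough sketch of what such a tree-wide check might look like, assuming two build directories with the same layout (one built with -g, one with -g0) whose object files have already been stripped. The function name and directory layout are hypothetical; this is not an existing LLVM bot script.

```shell
# Sketch: compare every object file in one build tree against the file
# with the same relative path in a second build tree.
compare_trees() {
    dir_a=$1
    dir_b=$2
    mismatches=0
    list=$(mktemp) || return 2

    # Relative paths of all object files in the first tree.
    (cd "$dir_a" && find . -name '*.o' -type f | sort) > "$list"

    while IFS= read -r rel; do
        # A missing counterpart also counts as a mismatch.
        if ! cmp -s "$dir_a/$rel" "$dir_b/$rel"; then
            echo "MISMATCH: $rel"
            mismatches=$((mismatches + 1))
        fi
    done < "$list"

    rm -f "$list"
    [ "$mismatches" -eq 0 ]
}
```

A bot could run something like `compare_trees build-g build-g0` and fail the build on a non-zero exit status.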

It’s an abstract good/intended goal, for sure - but it’s not been a priority for anyone (as you’ve seen), so just hasn’t been pushed very hard/far.

- Dave

I agree with you that this is a good/intended goal, but it is not an
'abstract' good goal :)

David

I wasn't aware of this problem as of late, but with our own compiler (for Hexagon) we've made efforts in the past to make sure that -g did not affect codegen. Some issues must have crept back in. This is definitely something that needs to be fixed.

-Krzysztof

Abstract in the sense that no one’s had concrete needs for/problems with it so far - or insufficiently so that it wasn’t worth working on.

(Resend with llvm-dev added back)

At Sony we have an internal test run that compares generated code with/without -g, in our suite of regression tests. See our lightning talk slides from EuroLLVM 2015. I believe we list some PRs in there for things we have found and fixed in the past.

http://llvm.org/devmtg/2015-04/slides/Verifying_code_gen_dash_g_final.pdf

At the moment we have a backlog of about a half-dozen differences worth investigating. I have to admit we have not yet looked at whether some of your recent work has fixed any of them; it is not our top priority, although obviously it is something we do look at and keep track of.

There are some very minor differences in instruction order that we see, and I think in most cases that is because -g emits .cfi directives which act as scheduling barriers. It might be the case that if we enabled exceptions, we would not see these as -g differences; we have not experimented with that.
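One way to check whether .cfi directives are present in only one of the two builds is to count them in the assembly output. This is only a sketch: the function names are hypothetical, and it assumes the listings were produced separately, for example with "clang -S -O2 -g" and "clang -S -O2 -g0".

```shell
# Sketch: count CFI directives in an assembly listing.
count_cfi() {
    # grep -c exits non-zero when the count is 0, so ignore its status.
    grep -c '^[[:space:]]*\.cfi_' "$1" || true
}

# Print the counts for a -g listing and a -g0 listing side by side.
report_cfi() {
    n_g=$(count_cfi "$1")
    n_g0=$(count_cfi "$2")
    printf '%s\n' "-g: $n_g .cfi directives, -g0: $n_g0 .cfi directives"
}
```

If `report_cfi a.g.s a.g0.s` shows a non-zero count only for the -g listing, that is consistent with the scheduling-barrier explanation, though it does not prove it.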

--paulr

* Is there anyone else who also cares about codegen consistency?

We have in the past always treated situations where the presence of debug info caused different code to be emitted as pretty serious bugs. Typically these bugs came from code that didn't properly skip over debug intrinsics when doing peephole-style transformations.

* Any volunteers to help fix codegen consistency issues? (It is easy to find issues, just build speccpu with -g and -g0, then compare the "objdump -d" output)

I certainly don't mind getting CC'ed on any PRs that we find :)

-- adrian

A good start is to file upstream bugs for the issues found in SPEC and clang self-builds. Once those bugs are fixed, we need to set up bots that do a 3-stage bootstrap of clang to ensure no regressions are introduced.

David

Completely agree. Any code generation changes due to debug info are a bug and should be handled accordingly :)

-eric

FWIW, the fix that Rob has just added a patch for (https://reviews.llvm.org/D26554) fixes a case of debug info affecting optimization, found using the utils/check_cfc tool from Russ's presentation below on a large game codebase.