[RFC] Should we bump the bitcode version in LLVM 6.0?

Hi,

TL;DR
r317488 changed the way fast-math flags are laid out in the bitcode. Anyone compiling pre-llvm-6.0 bitcode with llvm-6.0 will lose all the optimizations guarded by isFast, and a pre-llvm-6.0 compiler compiling llvm-6.0 bitcode will potentially generate incorrect code w.r.t. fast-math expectations.

Should we bump the bitcode version because of that and have the autoupgrader properly rewrite the fast-math flags to preserve their semantics?
(I believe we should!)

* Context *

With https://reviews.llvm.org/D39304 / r317488 we got rid of the umbrella UnsafeMath flag and introduced 3 more flags that better represent the different things that happen under fast-math.

From a bitcode perspective, this change looks like this:
Before r317488 we had 6 bits that respectively represented:

UnsafeMath
nnan
ninf
nsz
arcp
contract
unset (bit not used)

(The order may not exactly match what is in the bitcode.)

After r317488 we had 7 bits that respectively represented:

reassoc (UnsafeMath is gone)
nnan
ninf
nsz
arcp
contract
afn (new bit)

Before r317488, fast-math was true if UnsafeMath was true (this should also imply all the other flags are set). After r317488, fast-math is true only if all the bits are set, including the new afn bit.
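To make the layout change concrete, here is a small sketch of the two encodings as described above. The constants and names are illustrative only (and, as noted, the order may not match the actual definitions in llvm/Bitcode/LLVMBitCodes.h):

```cpp
// Illustrative bit layouts only, mirroring the lists above; these are not the
// authoritative enums from the bitcode headers.
namespace pre_r317488 {
enum FastMathBits : unsigned {
  UnsafeMath      = 1u << 0, // umbrella flag; setting it meant "fast"
  NoNaNs          = 1u << 1,
  NoInfs          = 1u << 2,
  NoSignedZeros   = 1u << 3,
  AllowReciprocal = 1u << 4,
  AllowContract   = 1u << 5,
  // bit 6 unused
};
} // namespace pre_r317488

namespace post_r317488 {
enum FastMathBits : unsigned {
  AllowReassoc    = 1u << 0, // repurposes the old UnsafeMath position
  NoNaNs          = 1u << 1,
  NoInfs          = 1u << 2,
  NoSignedZeros   = 1u << 3,
  AllowReciprocal = 1u << 4,
  AllowContract   = 1u << 5,
  ApproxFunc      = 1u << 6, // the new 'afn' bit
};
} // namespace post_r317488
```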

* Problem *

Given that we currently have no way to check whether a bitcode file was generated pre-r317488 or post-r317488, that means:

  1. a post-r317488 compiler is going to skip any optimization guarded by isFast for any pre-r317488 bitcode file (remember, the afn bit is never set there)
  2. a pre-r317488 compiler is going to run any optimization guarded by unsafeAlgebra for any post-r317488 bitcode file that has the reassoc bit set (remember, we repurposed the UnsafeMath bit)

Scenario #2 might be unlikely, but we’re potentially breaking the semantics of the program. It is particularly dangerous because nothing is going to tell us that we are in this “downgrade” situation.
Scenario #1 means that any code that uses UnsafeMath is going to take a performance hit.

In other words, one scenario implies generating wrong code and the other, runtime performance regressions.
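To spell out both scenarios, here is a tiny self-contained sketch using made-up bit constants that mirror the layouts above (not real LLVM code):

```cpp
#include <cassert>

// Hypothetical bit positions matching the lists in the Context section.
constexpr unsigned kBit0   = 1u << 0; // UnsafeMath before r317488, reassoc after
constexpr unsigned kOthers = (1u << 1) | (1u << 2) | (1u << 3) |
                             (1u << 4) | (1u << 5); // nnan, ninf, nsz, arcp, contract
constexpr unsigned kAfn    = 1u << 6; // only exists post-r317488
constexpr unsigned kAllNew = kBit0 | kOthers | kAfn;

int main() {
  // Scenario #1: "fast" written by a pre-r317488 producer; afn can never be set.
  unsigned OldFast = kBit0 | kOthers;
  bool NewReaderIsFast = (OldFast & kAllNew) == kAllNew;
  assert(!NewReaderIsFast); // post-r317488 isFast() is false -> optimizations skipped

  // Scenario #2: only 'reassoc' written by a post-r317488 producer.
  unsigned NewReassocOnly = kBit0;
  bool OldReaderUnsafe = (NewReassocOnly & kBit0) != 0;
  assert(OldReaderUnsafe); // pre-r317488 unsafeAlgebra() reads full fast-math -> wrong code
  return 0;
}
```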

* Feedback Needed *

I believe this change is big enough that it would be worth bumping the bitcode version so that the upgrader can do the right thing, before we release this to the public with LLVM 6.0.

That being said, I don’t know what the implications of such a bump are, and if people really don’t care about the performance problem, that might be okay. The silent downgrade path is, however, concerning.

Should we bump the bitcode version because of that change and have the autoupgrader properly rewrite the fast-math flags to preserve the semantics and make sure there is no silent downgrade?

Thanks,
-Quentin

Hi Quentin,

I agree.

Since I was involved in the discussions that motivated the change of r317488, I feel compelled to chime in.

The danger you described, of a pre-llvm-6.0 compiler compiling llvm-6.0 bitcode that has the reassoc bit set (but not the entire set of fast-math flags) and incorrectly behaving as though all of fast-math is enabled, is pretty compelling.

As an aside, at my employer (Sony), this isn't a critical issue for our product. But that's just because out of an abundance of caution, we only permit LTO using bitcode files built using the same llvm version, irrespective of the bitcode version of the IR. That is, when doing LTO, even if the bitcode version isn't bumped, our llvm-5.0 based compiler will reject llvm-6.0 bitcode, and our llvm-6.0 based compiler will reject llvm-5.0 bitcode. So if there is a strong feeling by others that the possibility of users compiling new bitcode with an older compiler is insignificant, then I won't argue that we must bump the IR version markers.

But given your points, it seems to me bumping the bitcode version is the “right thing to do”. And from the discussion at https://reviews.llvm.org/D39304, apparently Michael already has the patch ready, so this can be addressed quickly.

Thanks,
-Warren

Just wanted to point out that part of the reason this even becomes a problem is the use of isFast().
There should be warnings against using isFast(), and the existing code should be changed to query specific flags instead…

- Matthias

Does the language reference need to be updated?

http://llvm.org/docs/LangRef.html#fast-math-flags

It still mentions “fast”.

I agree about isFast(), but determining the minimal subset of flags needed to replace those calls requires some thought.

I did start looking at replacing isFast() in the reassociation pass. In theory, we should be able to do those transforms with just ‘reassoc’ now, but I hit complications with intersecting FMF of different instructions.

Conservatively, we could just ask if everything but the new ‘afn’ is set for folds that do not include mathlib or intrinsic math functions. That could be an alternative to versioning the IR…unless the perf problems are actually caused by not folding mathlib calls now?
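For illustration, here is a rough sketch of that conservative query against the post-r317488 FastMathFlags accessors; `allFlagsExceptApproxFunc` and `canApplyNonLibmFastFold` are made-up helper names, not existing LLVM APIs:

```cpp
#include "llvm/IR/Instruction.h"
#include "llvm/IR/Operator.h"

using namespace llvm;

// Hypothetical helper: "everything except 'afn' is set" -- a conservative
// stand-in for isFast() in folds that don't touch libm or math intrinsics.
static bool allFlagsExceptApproxFunc(FastMathFlags FMF) {
  return FMF.allowReassoc() && FMF.noNaNs() && FMF.noInfs() &&
         FMF.noSignedZeros() && FMF.allowReciprocal() && FMF.allowContract();
}

static bool canApplyNonLibmFastFold(const Instruction &I) {
  if (const auto *FPOp = dyn_cast<FPMathOperator>(&I))
    return allFlagsExceptApproxFunc(FPOp->getFastMathFlags());
  return false;
}
```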

As I said in D39304, I have no objection to versioning the IR, but I also don’t know if there’s a cost/downside to that.

No, I think that’s ok as-is. We still allow ‘fast’ in IR text as a shortcut meaning all of the other bits are set. It’s just encoded differently in binary form.

Agree, but that wouldn’t solve the downgrade problem, plus it is probably too late to fix all the isFast uses for LLVM 6.0.

Regarding the point that the use of `isFast()` is itself part of the problem (and that it is probably too late to fix that for LLVM 6.0):

Just to be explicit, the point that the use of `isFast()` is a problem isn't new. I agree it's too late to fix it for LLVM 6.0, but that problem exists in earlier releases.

Prior to r317488 the problem already existed, except it was the use of `unsafeAlgebra()` that was the problem (that is, many of those really should have been changed to query the specific flags). But prior to r317488 the situation was worse, because the umbrella aspect of the `UnsafeAlgebra` flag essentially made it impossible to "subtract" a precise portion of the FastMathFlags.

In short, before and after r317488, `-ffast-math` on its own worked well, and still works well. But `-ffast-math <minus_some_portion>` (e.g., `-ffast-math -fno-reciprocal-math`) didn't work well prior to r317488, and was more or less impossible to make work well before the rework of r317488. FTR, `-ffast-math <minus_some_portion>` has never worked well; only the way in which it doesn't work well has changed over time. (In particular, going further back, pre-r297837 it didn't work either, just in a different way from the way it doesn't work now. r297837 and r317488 are both steps toward opening the door to changes that will make it work.)
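As a sketch of what "subtracting a portion" looks like after r317488 (using the FastMathFlags setters; the helper name is made up), `-ffast-math -fno-reciprocal-math` can now be modeled as "everything except arcp", which the old umbrella bit could not express:

```cpp
#include "llvm/IR/Operator.h"

using namespace llvm;

// "-ffast-math -fno-reciprocal-math": every flag except 'arcp'.
static FastMathFlags fastMathMinusReciprocal() {
  FastMathFlags FMF;
  FMF.setAllowReassoc();
  FMF.setNoNaNs();
  FMF.setNoInfs();
  FMF.setNoSignedZeros();
  FMF.setAllowContract(true);
  FMF.setApproxFunc();
  // AllowReciprocal deliberately left unset: isFast() is now false, but every
  // other relaxation still applies.
  return FMF;
}
```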

Unless I'm missing something, when using bitcode from the "consistent" compiler (that is, pre-r317488 bitcode with a pre-r317488 compiler, and post-r317488 bitcode with a post-r317488 compiler), there are no examples of test-cases that are broken by r317488.

Which brings us back to Quentin's original question:
Should we bump the bitcode version and have the upgrader do the right thing (and do so before releasing LLVM 6.0, which makes this question fairly urgent)?

My opinion is we should.

Thanks,
-Warren

Scenario #1 is unsupported AFAIK; unless I missed something, the bitcode is not forward compatible: loading newer bitcode with an older LLVM has never been supported as far as I can remember.

Scenario #2 is very much like other performance regressions we see when we drop old metadata (i.e. bitcode upgrade isn't performance-proof in general but only "best effort"; there have been multiple instances of this in the past).

Usually, IIRC, we try not to version the bitcode this way (i.e. the bitcode does not have a linear version number that is regularly bumped) but instead make sure the encoding itself allows an easy upgrade, i.e. the encoding of the FMF should have been such that the reader can detect the old form and upgrade it to the new IR representation.
Now it is too late for that here I guess, so bumping may be a possible trade-off.
What about any bitcode shipped after r317488 but before the version bump? Not worth taking into account because of the short period of time?
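For concreteness, here is a rough sketch (not the actual patch that was later committed) of the upgrade the thread is asking for, using illustrative bit constants that mirror the layouts described earlier. It of course presupposes the reader can tell that a record came from the old format, which is exactly what a version bump (or similar marker) would provide:

```cpp
#include <cstdint>

// Illustrative bit positions, not the real bitcode enum values.
enum : uint64_t {
  OldUnsafeAlgebra = 1u << 0,       // pre-r317488 umbrella bit
  NewAllFlags      = (1u << 7) - 1, // reassoc|nnan|ninf|nsz|arcp|contract|afn
};

// If the old umbrella bit was set, the record meant "fast", so every new bit
// (including afn) must be set; the other five bits occupy the same positions
// in both layouts and carry over unchanged.
static uint64_t upgradeOldFastMathRecord(uint64_t OldFlags) {
  if (OldFlags & OldUnsafeAlgebra)
    return NewAllFlags;
  return OldFlags;
}
```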

Best,

For all who want to participate in the discussion, there is now a review open regarding this topic:

https://reviews.llvm.org/D43253

Regards,
Michael

Hi Mehdi,

Scenario #1 is unsupported AFAIK; unless I missed something, the bitcode is not forward compatible: loading newer bitcode with an older LLVM has never been supported as far as I can remember.

I agree, but we cannot detect that this is the situation we are in and debugging would prove hard. That said, if we say we don’t care, so be it :).

Scenario #2 is very much like other performance regressions we see when we drop old metadata (i.e. bitcode upgrade isn't performance-proof in general but only "best effort"; there have been multiple instances of this in the past).

Usually, IIRC, we try not to version the bitcode this way (i.e. the bitcode does not have a linear version number that is regularly bumped) but instead make sure the encoding itself allows an easy upgrade, i.e. the encoding of the FMF should have been such that the reader can detect the old form and upgrade it to the new IR representation.
Now it is too late for that here I guess, so bumping may be a possible trade-off.

Do you think we should do it, or live with the performance drop?
I know this performance drop is not acceptable for our use cases, but I don’t want to impose our ruling on this.

What about any bitcode shipped after r317488 but before the version bump? Not worth taking into account because of the short period of time?

I was hoping we can ignore that period of time because it was not in any official LLVM release.

Best,
-Quentin

I agree, but we cannot detect that this is the situation we are in and debugging would prove hard. That said, if we say we don’t care, so be it :).

I just mean that, in general, many things can blow up when the bitcode evolves, and we don't expect or support (as far as I know) llvm-3.9 being used to read/process bitcode from llvm-4.0, for instance.
(I remember implementing a hard check and failure for this in the Apple LLVM: we don't even try.)

Do you think we should do it, or live with the performance drop?

I'm not saying this shouldn't be done; I don't have a dog in this fight. I was just providing another data point: as a matter of general policy, there have been other cases in the past where we weren't able (or rather chose not to put in the effort) to preserve all the performance when upgrading (TBAA comes to mind right now, but other kinds of metadata have also been lost). Now, every case is different, and the FMF can be considered too important to be left on the table.
For example, it would impact LTO when loading libraries built with Xcode 9 and earlier once the first post-r317488 release comes out.

I was hoping we can ignore that period of time because it was not in any official LLVM release.

Very possible, I just wanted to make sure this is a conscientious choice :)

Cheers,

Need to at least make sure everyone knows what to expect, so the cherry-pick for the LLVM 6.0 release should include an entry in the Release Notes.

Hans: this can only happen if you're still OK to cherry-pick https://reviews.llvm.org/D43253 into the LLVM 6.0 branch.

Duncan, Steven, Peter, Teresa: this change can break LTO if we read back bitcode emitted from trunk since r317488 (last November), any concerns?

Best,

For our usage of ThinLTO, we don't reuse bitcode across different revisions, so it wouldn't cause an issue. The LTO caching in lib/LTO/LTO.cpp also encodes the LLVM revision, so it shouldn't be an issue there either.

Teresa

Thanks to all, a better fix landed on both trunk and LLVM 6.0!