The state of ARMConstantIslandPass in 4.0.[01]

Hi,

We at Rust are using LLVM for our codegen. Since Rust version 1.19.0,
which should ship in July 21st, we are using a slightly patched
version of LLVM 4.0.

The transition to LLVM 4.0 had been fairly pain-free on x86 (at least
excluding the standard suite of assertion failures using MSVC SEH, but
that's fairly under control), but we have encountered several scary
bugs in ARM, most specifically in ARMConstantIslandPass.cpp:

- https://github.com/rust-lang/rust/issues/41672
- https://github.com/rust-lang/rust/issues/42248

The former of these is a fairly benign assertion failure, but the
latter is a scary "in the wild" wrong codegen bug, which can really
mess up with programmers.

Both of the bugs have fixes in trunk:
r295964 - Fix assertion failure in ARMConstantIslandPass.
r297871 - ARM: avoid clobbering register in v6 jump-table expansion.

I can easily backport the fixes for these 2 bugs into our LLVM branch.
However, while looking at the diff between 4.0 and trunk I've seen
several more fixes to "scary-looking" bugs:

r294949 - [Thumb-1] TBB generation: spot redefinitions of index register
r295816 - [ARM] Fix constant islands pass.
r299634 - [ARM] Remove a dead ADD during the creation of TBBs
    (that one looks more like an optimization than a wrong-codegen
thing, but including for completeness)
r300870 - [Thumb-1] Fix corner cases for compressed jump tables

After seeing the previous ones, I'm worried these other fixes might
also fix some "in the wild" wrong codegen bugs. However, I'm not
comfortable backporting so many patches by myself. We are open to
moving to being based on LLVM 4.0.1 when it is released, but it does
not seem to include *any* of the above fixes.

Are these fixes important? Should we backport them? Do we need to keep
anything in mind?

Thanks in advance,
    - Ariel

Hi,

We at Rust are using LLVM for our codegen. Since Rust version 1.19.0,
which should ship in July 21st, we are using a slightly patched
version of LLVM 4.0.

The transition to LLVM 4.0 had been fairly pain-free on x86 (at least
excluding the standard suite of assertion failures using MSVC SEH, but
that's fairly under control), but we have encountered several scary
bugs in ARM, most specifically in ARMConstantIslandPass.cpp:

- https://github.com/rust-lang/rust/issues/41672
- https://github.com/rust-lang/rust/issues/42248

The former of these is a fairly benign assertion failure, but the
latter is a scary "in the wild" wrong codegen bug, which can really
mess up with programmers.

Both of the bugs have fixes in trunk:
r295964 - Fix assertion failure in ARMConstantIslandPass.
r297871 - ARM: avoid clobbering register in v6 jump-table expansion.

I can easily backport the fixes for these 2 bugs into our LLVM branch.
However, while looking at the diff between 4.0 and trunk I've seen
several more fixes to "scary-looking" bugs:

r294949 - [Thumb-1] TBB generation: spot redefinitions of index register
r295816 - [ARM] Fix constant islands pass.
r299634 - [ARM] Remove a dead ADD during the creation of TBBs
    (that one looks more like an optimization than a wrong-codegen
thing, but including for completeness)
r300870 - [Thumb-1] Fix corner cases for compressed jump tables

After seeing the previous ones, I'm worried these other fixes might
also fix some "in the wild" wrong codegen bugs. However, I'm not
comfortable backporting so many patches by myself. We are open to
moving to being based on LLVM 4.0.1 when it is released, but it does
not seem to include *any* of the above fixes.

Are these fixes important? Should we backport them? Do we need to keep
anything in mind?

It's probably too late for these to make it into the 4.0.1 release,
since the merge deadline is today, but please submit merge requests
for these anyway (preferred way is to use the utils/release/merge-request.sh
script), because it will help us determine if we need a 4.0.2 release.

Thanks,
Tom

Opened merge requests:
https://bugs.llvm.org/show_bug.cgi?id=33213
https://bugs.llvm.org/show_bug.cgi?id=33214
https://bugs.llvm.org/show_bug.cgi?id=33215
https://bugs.llvm.org/show_bug.cgi?id=33216
https://bugs.llvm.org/show_bug.cgi?id=33217
https://bugs.llvm.org/show_bug.cgi?id=33218

Hi Ariel,

Those fixes are important, but as Tom said, only for 4.0.2 if we do one.

However, they'll need to be looked at with care, as they fix some odd
bugs in the compiler and I'm not sure how stable they really are.

They may behave well in trunk because of other patches that were
noticed and fixed, but back-porting just those patches may introduce
instabilities that weren't fixed in 4.0.

Given that we release a new version every 6 months, and that we
(Still) don't have (community) support for older versions, you may be
better off using 5.0 instead.

Thanks for creating the merge requests, though. Let's look at them
when/if 4.0.2 comes along.

cheers,
--renato

Are these fixes important? Should we backport them? Do we need to keep
anything in mind?

Hi Ariel,

Those fixes are important, but as Tom said, only for 4.0.2 if we do one.

However, they'll need to be looked at with care, as they fix some odd
bugs in the compiler and I'm not sure how stable they really are.

They may behave well in trunk because of other patches that were
noticed and fixed, but back-porting just those patches may introduce
instabilities that weren't fixed in 4.0.

Given that we release a new version every 6 months, and that we
(Still) don't have (community) support for older versions, you may be
better off using 5.0 instead.

Thanks for creating the merge requests, though. Let's look at them
when/if 4.0.2 comes along.

cheers,
--renato

They may behave well in trunk because of other patches that were

noticed and fixed, but back-porting just those patches may introduce
instabilities that weren't fixed in 4.0.

That's what's worrying me. Because we don't want to ship a version of
Rust on which developing on ARM is a minefield, we have to apply the
first 2 patches. However, it looks that we might also need to apply
the other patches, and I'm worried about whether and how can they be
safely backported. Could the patch authors pitch in?

Given that we release a new version every 6 months, and that we (Still) don't have (community) support for older versions, you may be better off using 5.0 instead.

In our experience, every new version of LLVM both requires some
porting work and introduces new issues in non-P1 targets that need to
be fixed. We'll upgrade to LLVM 5.0 as soon as it gets released and we
port and apply the fixes. In the meanwhile we'll like a version in
which ARM is less of a minefield.

- Ariel

They may behave well in trunk because of other patches that were

noticed and fixed, but back-porting just those patches may introduce
instabilities that weren't fixed in 4.0.

That's what's worrying me. Because we don't want to ship a version of
Rust on which developing on ARM is a minefield, we have to apply the
first 2 patches. However, it looks that we might also need to apply
the other patches, and I'm worried about whether and how can they be
safely backported.

This is unfortunate, but it happens from time to time.

If the problem was found in 4.0.0, we'd probably merge all the fixes,
because the tree was still fresh. But now months have passed, and we
need to be extra careful.

Could the patch authors pitch in?

Of course they can. But the decision to merge or not will always falls
back on the stable release manager, with inputs from the code owners
and the authors.

The primary concern of the stable releases is not to have a stable
product as in "works robustly and is free of egregious bugs in all
primary architectures", but as in "has the same behaviour as the
previous stable release with bug fixes".

If the bug fixes introduce instability, they are not back ported. We
want bugs fixed with minimised risk of introducing other unknown bugs
on other parts of the compiler.

In essence, better to have a known bugs than unknown ones.

In our experience, every new version of LLVM both requires some
porting work and introduces new issues in non-P1 targets that need to
be fixed. We'll upgrade to LLVM 5.0 as soon as it gets released and we
port and apply the fixes. In the meanwhile we'll like a version in
which ARM is less of a minefield.

Every new version of LLVM has issues on all targets. Like x86, the ARM
targets are primary, but they may have a bit less "user validation".

The constant island pass is special because it was improved not long
ago, and the original implementation was very... simple. The
improvement was done with some testing, but as usual, testing on ARM
is a never ending story.

A toolchain that is trying to target all ARM targets, from ARMv4 to
ARMv8, from bare metal to enterprise, on Linux, Mac and Windows,
*will* have gaps in the testing. x86, OTOH, has a lot less dimensions
and a lot more people building on it daily for very different
purposes.

Unfortunately, the policy around major releases and stable releases is
one that has been beaten repeatedly and every time the consensus is
the same: Every interested party validates their own trunk, for their
own purposes, and the upstream releases are just "a point in time that
had some more attention" not a "guarantee of stability and quality".

With that in mind, it's not unlikely that big problems will arise and
we'll encourage you to wait for the next release. Worse case scenario,
you can pick 3.9 instead, or take current trunk and validate locally.

cheers,
--renato