[MC] Removing aligned bundling support

Portable Native Client has been deprecated for a long time. It hs an assembler extension:
https://www.chromium.org/nativeclient/pnacl/aligned-bundling-support-in-llvm/

Recently, @aengelke and I have made improvements to the performance of the LLVM integrated assembler.
During this refactoring, we encountered challenges with maintaining functionality related to aligned bundling.
The dozen Assembler.isBundlingEnabled() instances make certain code transform difficult.

The most prominent issue is that mergeFragment(...); delete DF in MCELFStreamer::emitInstToData makes it tricky to change the fragment list representation to an array.

Shall we remove support for it? See also State of NaCl in monorepo? @dschuff

There are two extra alignment-related features within LLVM.
They have their own requirements and cannot utilize aligned bundling. Their implementations are under target-specific llvm/lib/Target and do not interfere with the generic implementation that much.

  • x86 --x86-align-branch-boundary
  • llvm/lib/Target/Hexagon/MCTargetDesc/HexagonAsmBackend.cpp

Believe it or not, we are actually still using the aligned bundling support in upstream LLVM: Chrome still supports running NaCl modules on ChromeOS, and we use this support for building a NaCl-sandboxed component that ships with the browser. We keep this toolchain up-to-date with LLVM because it compiles some of the same Chromium code that the host compiler does. If bundling support were to be removed from upstream, we’d have to figure out a way to keep it working out-of-tree, which needless to say I would rather not do. But unlike in previous discussions, it sounds like there may actually be some sort of cost to non-NaCl users in keeping this support in-tree, so I’d like to understand this better. Maybe we can come up with a factoring that works for all the use cases. For example for the particular issue you mentioned, the code is inside a check for the relax-all flag, which I’m not sure we actually use; I’ll have to check. Hopefully we can work something out.

Thanks for the comment! It seems that we need aligned bundling support for extended period of time.

Clang -O0 used to enable -mrelax-all by default. After
[Driver] Don't default to -mrelax-all for non-RISCV -O0 by MaskRay · Pull Request #90013 · llvm/llvm-project · GitHub -mno-relax-all has been the default.
I personally believe -mrelax-all is not a useful option.

Perhaps the majority of the complexity of Assembler.isBundlingEnabled() stems from Assembler.getRelaxAll(). If aligned bundling requires -mno-relax-all, we can still drop a lot of complexity and probably not constrain the design of an array based fragment list.

If aligned bunding with -mrelax-all is no longer supported, the llvm-mc -mc-relax-all tests in llvm/test/MC/X86/AlignedBundling need to be removed.


Aligned bundling support with -mrelax-all was contributed by @petrhosek in
⚙ D8072 [MC] Write padding into fragments when -mc-relax-all flag is used

The original motivation was

We want to ensure that the memory usage of pnacl-llc (the PNaCl translator) is reasonable.

Some archaeology: The original motivation of ⚙ D8072 [MC] Write padding into fragments when -mc-relax-all flag is used was memory usage, which has been (at least partially) achieved from different angles with recent assembler improvement.

Created [MC] Aligned bundling: remove special handling for RelaxAll by MaskRay · Pull Request #95188 · llvm/llvm-project · GitHub to remove the special support. RelaxAll should still work, though the behavior might be slightly different.

⚙ D5915 [MC] Attach labels to existing fragments instead of using a separate fragment introduced flushPendingLabels for aligned bundling:

This was not only less efficient (e.g. jumping to the nops instead of past them) but also led to miscalculation of the address of the GOT (since MC uses a label difference rather than emitting a “.” symbol).

The code was later modified by

flushPendingLabels is called by many MCObjectStreamer functions, noticeably once for each new fragment.
I have analyzed that if we remove addPendingLabels/flushPendingLabels, we would have 2% more fragments, but avoiding flushPendingLabels might yield a performance advantage and would certainly simplify the assembler…

In addition, since symbols are no longer associated with a MCDummyFragment, MCExpr.cpp:AttemptToFoldSymbolOffsetDifference can be simplified.

I have removed flushPendingLabels and addPendingLabel in [MC] Remove pending labels · llvm/llvm-project@7500646 · GitHub .
Aligned bundling’s labeloffset.s is kept working using a different, much simpler approach.

There is some -O0 compile time decrease (larger than expected but nice!)
https://llvm-compile-time-tracker.com/compare.php?from=a091bfe71fdde4358dbfc73926f875cb05121c00&to=75006466296ed4b0f845cbbec4bf77c21de43b40&stat=instructions%3Au

This unlocks cleanup opportunities like [MC] AttemptToFoldSymbolOffsetDifference: remove MCDummyFragment chec… · llvm/llvm-project@8fa4fe1 · GitHub

Thanks for your patience here. We are also testing our build to make sure that it would be ok to remove support for aligned bundling withrelax-all and assuming it works out, I will remove that from upstream, which will maybe also allow some simplification.

Edit: I missed upthread that you already removed the special handling for relax-all, so I guess there’s nothing left to do here.