ARM64 -> AArch64 merge status

Hi all,

It’s been two weeks since I sent the last merge progress email, so here is an update.

TL;DR: Almost done!

Tim is considering proposing that we make the final switchover sometime next week. This would be the final push, where AArch64 gets deleted and ARM64 gets renamed to AArch64, and would signal the end of the merge process. If any of you know of any reason why these two loving backends cannot be merged, speak now or forever hold your peace! :)

Times are incredibly approximate and are in man-days/weeks.

· Requirement: No regressions
  o Correctness
    § [1w] Regression tests
      · All Clang regression tests ported.
      · Almost all LLVM regression tests ported; the only thing left is the MC-level diagnostics tests. These are in progress (Bradley), currently 25% of the way through the diagnostics test file.
    § [?] QuIC internal tests
      · No further information available, but no public bugs raised.
    § [DONE] ARM internal tests
      · All test suites pass.
    § [0d] Apple internal tests
      · Tim says we’re “looking reasonable” here :)
      · This only blocks a “go/no-go”, and there are no actual actions here at the moment (according to Tim).
    § [DONE] LLVM test suite
    § [DONE] MC Hammer
    § [DONE] Emperor
      · This is a random test suite, so it has the possibility to uncover more problems. Our acceptance criterion is 3 days of runtime without finding any bugs, which we have now hit.
  o Performance
    § No precise fixed performance baseline
    § [DONE] Investigate significant performance regressions and justify fix/not fix.
      · No performance blockers reported.
· Requirement: Feature parity
  o [DONE] Big endian
    § Big endian support is now complete and all known bugs are fixed upstream. This includes NEON instruction selection.
    § I’m still running testing to validate, but this can be thought of as complete.
  o [DONE] Support for no fpu/no neon/no crc
  o [DONE] A53 scheduler
  o [DONE] Inline assembly
  o [DONE] Predefines
  o [DONE] Conditionalise cyclone/Darwin
    § The only thing to really conditionalise is the “LDR q” → “LDP d, d” splitting pass, though only benchmarks will really show the difference.
  o [?] ADRP CSE
    § This optimization, being worked on by Jiangning, has been half ported to ARM64, but it hasn’t been committed to AArch64 yet, so it can’t be considered a merge blocker.
    § Jiangning and Quentin are working together on testing and benchmarking this patch.
  o [2d] fastcc & guaranteed tail opt
    § Fastcc support (proper tail call optimization) is in progress (Jiangning).
  o [2d?] Post-increment NEON ld/st
    § Post-indexed NEON loads and stores are in progress (Hao).



This is fantastic news! Thank you James and thanks to everybody who’s been working on it.

Wow. I don’t know how else to describe it. Just “wow.” This continues to be an amazing process. Thank you to everyone for your work in making this happen.


You guys are making amazing progress!


Hi James,

Thanks for the update report.

Things are looking good on our side. Thanks to Tim, who has been quick to fix issues and review patches.


[ARM64] fatal error: error in backend: Cannot select: 0x7d713c0: f64 = ConstantFP<-0.000000e+00>

[ARM64] Assertion 'hiBitsSet <= numBits && "Too many bits to set!"'

[ARM64] Unable to emit UBFX in some cases


ARM64: make sure FastISel uses a GPR64 source in 64-bit extensions.

Pending review:

[ARM64] Miscompile possibly due to incorrect fcsel

[ARM64] CSINC is not generated when there is ZEXT between SETCC and AND

[PATCH] Fix use_iterator in ARM64AddressTypePromotion

Others (unresolved, but they are clang-related or test issues):