[RFC] Assembly Super Optimiser

hanno-becker · June 15, 2023, 4:21am

Hey. As one of the authors of [1], I’m obviously excited to see this. As a TL;DR: we trialled this approach for various DSP and crypto algorithms on multiple in-order and out-of-order Arm cores (M55, M85, A55, A72) and saw substantial improvements, even compared to prior handwritten assembly. Our tool, slothy (slothy-optimizer/slothy on GitHub: "Assembly super optimization via constraint solving"), uses CP-SAT instead of Z3, and ideally one would keep the choice of solver flexible.

As I see it, this would be independent of, and perhaps even combinable with, souper (google/souper on GitHub: "A superoptimizer for LLVM IR"). The latter does peephole optimization on LLVM IR, while the above is a (largely) instruction-preserving optimization at the assembly level: scheduling, register allocation, and software pipelining.

An essential input here is of course a reasonably detailed uArch model, so it would also be interesting to explore if/how this RFC could work with @reidtatge’s "[RFC] MDL: A Micro-Architecture Description Language for LLVM". [1] encodes things in a very ad-hoc way so far (e.g. slothy/targets/aarch64/cortex_a55.py on the main branch of slothy-optimizer/slothy for A55).
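To make "instruction-preserving scheduling driven by a uArch model" a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `LATENCY` table is a made-up stand-in for a uArch model (not real A55 numbers and not slothy's format), the program is a toy, and exhaustive enumeration stands in for a real constraint solver such as CP-SAT. It only models read-after-write dependencies on a single-issue in-order core.

```python
from itertools import permutations

# Hypothetical toy uArch model: result latency in cycles per mnemonic.
# These numbers are invented for illustration, not taken from any real core.
LATENCY = {"ldr": 3, "mul": 2, "add": 1}

# Toy program as (mnemonic, destination, sources).
# Each register is written once, so only read-after-write hazards exist.
PROGRAM = [
    ("ldr", "r0", []),
    ("mul", "r1", ["r0"]),
    ("ldr", "r2", []),
    ("add", "r3", ["r2"]),
    ("add", "r4", ["r1", "r3"]),
]


def raw_deps(program):
    """For each instruction, the indices of earlier instructions it reads from."""
    writer = {}
    deps = []
    for idx, (_, dst, srcs) in enumerate(program):
        deps.append({writer[s] for s in srcs if s in writer})
        writer[dst] = idx
    return deps


def cycles(program, order, deps):
    """Cycle count of `order` on a single-issue in-order core, or None if invalid."""
    issue = {}
    clock = 0
    for idx in order:
        if not all(d in issue for d in deps[idx]):
            return None  # a producer would run after its consumer
        ready = max((issue[d] + LATENCY[program[d][0]] for d in deps[idx]), default=0)
        issue[idx] = max(clock, ready)  # stall until all inputs are available
        clock = issue[idx] + 1
    return max(issue[i] + LATENCY[program[i][0]] for i in order)


def best_schedule(program):
    """Try every reordering of the same instructions and keep the cheapest valid one."""
    deps = raw_deps(program)
    candidates = ((cycles(program, o, deps), o) for o in permutations(range(len(program))))
    return min((c, o) for c, o in candidates if c is not None)


if __name__ == "__main__":
    deps = raw_deps(PROGRAM)
    print("original:", cycles(PROGRAM, range(len(PROGRAM)), deps), "cycles")
    cost, order = best_schedule(PROGRAM)
    print("best:", cost, "cycles, order", [PROGRAM[i][0] + " " + PROGRAM[i][1] for i in order])
```

The set of instructions never changes, only their order, which is what separates this from a peephole optimizer that rewrites instructions. Brute force is factorial in the block size, so anything beyond a handful of instructions needs the constraints handed to a solver instead, and the quality of the result depends entirely on how faithful the latency (and, in a real model, throughput and execution-unit) data is.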

Looking forward to seeing where this goes.
