[LLVM] Addressing Rust optimization failures in LLVM

I think you’re already on the right track here – this transform is pretty much independent of other load transforms, so putting it anywhere inside visitLoadInst() would work. The only caveat is that the function has this early return: https://github.com/llvm/llvm-project/blob/35276f16e5a2cae0dfb49c0fbf874d4d2f177acc/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp#L1060-L1062 This excludes volatile loads and ordered atomic loads. The transform in question is indeed illegal for volatile loads, but it is legal for atomic loads, so one could add it in front of that check. This also suggests some additional tests that could be added (using volatile/atomic).
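To make the placement issue concrete, here is a toy model (not LLVM code) of the two conditions involved. `ToyLoad`, `passesEarlyReturn` and `foldIsLegal` are hypothetical names made up for this sketch, and the early-return condition is only my reading of the linked lines, so treat it as an assumption.

```cpp
// Toy model only: these types and functions are hypothetical, not LLVM APIs.
enum class Ordering { NotAtomic, Unordered, Monotonic, Acquire, SeqCst };

struct ToyLoad {
  bool isVolatile;
  Ordering ordering;
};

// Assumed shape of the early return: code after it only sees loads that are
// neither volatile nor ordered atomic.
bool passesEarlyReturn(const ToyLoad &load) {
  return !load.isVolatile && (load.ordering == Ordering::NotAtomic ||
                              load.ordering == Ordering::Unordered);
}

// Legality of the fold as described above: volatile blocks it, atomicity
// does not.
bool foldIsLegal(const ToyLoad &load) { return !load.isVolatile; }
```

An acquire load that is not volatile fails `passesEarlyReturn` but satisfies `foldIsLegal`, which is the case that would be missed if the transform were placed after the check.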

The second consideration here is that this transform does not create new instructions. The load will be converted into a constant value. Such transforms are (ideally) performed not in InstCombine, but in InstSimplify. The relevant code would be here: https://github.com/llvm/llvm-project/blob/8f7e7400b74b362251dcf155e45f4097fd574196/llvm/lib/Analysis/InstructionSimplify.cpp#L6572 You can see that this function already contains code to fold a load from a constant at a fixed offset. You would be extending it to also handle loads from non-constant offsets (in specific circumstances).

Architecturally, this is the right place to add the new code. However, it might turn out that this is too expensive (in terms of compile-time) to perform there, in which case it might have to go into InstCombine (or AggressiveInstCombine) after all. But simplifyLoadInst() would be the place to try first.

I would recommend handling only the all-zeroes case at first, because it is much simpler than the general one. Basically, you only need to call ConstantFoldLoadFromUniformValue() on getUnderlyingObject() of the pointer.
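As an illustration of why the uniform case is simple, here is a self-contained toy model (not LLVM code, and not how ConstantFoldLoadFromUniformValue() is implemented). It treats a constant initializer as raw bytes; `Initializer`, `isUniform` and `foldLoadAtUnknownOffset` are hypothetical names for this sketch. If every byte is identical, an in-bounds load of a given width yields the same value regardless of the offset, so the offset never needs to be known.

```cpp
#include <cstdint>
#include <vector>

// Toy stand-in for a constant global's initializer: just its raw bytes.
using Initializer = std::vector<std::uint8_t>;

// True if the initializer is non-empty and every byte equals the first one.
bool isUniform(const Initializer &init) {
  if (init.empty())
    return false;
  for (std::uint8_t byte : init)
    if (byte != init.front())
      return false;
  return true;
}

// Tries to fold a load of `width` bytes at an unknown (assumed in-bounds)
// offset. On success, writes the loaded value to `out` and returns true.
bool foldLoadAtUnknownOffset(const Initializer &init, unsigned width,
                             std::uint64_t &out) {
  if (width == 0 || width > 8 || width > init.size())
    return false;
  if (!isUniform(init))
    return false;
  // Every byte is the same, so the result is that byte repeated `width`
  // times, independent of offset and endianness.
  std::uint64_t value = 0;
  for (unsigned i = 0; i < width; ++i)
    value = (value << 8) | init.front();
  out = value;
  return true;
}
```

The all-zeroes case is the instance where the repeated byte is 0, so the folded value is simply 0 for any load width.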

Regarding the abstract question: I'm not sure there's a good answer to that. If you have no idea where to place something, I'd look for a similar transform that already works, find out which pass performs it using -print-after-all (or on Godbolt, click “Add new” and then “LLVM Opt Pipeline” for a nice presentation), and then use -debug to narrow down where in the pass implementation it happens.
