[Sinking] Does any LLVM pass currently handle load sinking for invariant loads?

danilaml · June 17, 2024, 3:16pm

I’ve tried searching for uses of invariant.load and invariant.start/end, but they seem to appear mostly in “ignore”/propagate-style helper functions.

It looks like the load is not sunk in the examples below if there is any call that’s not readonly on the path (in the simplest case), even though it seems to me that sinking should be allowed since the load is marked as invariant.

declare void @foo() nounwind
declare void @bar(i32) nounwind
declare void @baz() nounwind

declare ptr @llvm.invariant.start.p0(i64, ptr nocapture) nounwind readonly
declare void @llvm.invariant.end.p0(ptr, i64, ptr nocapture) nounwind

define void @test1(ptr %p, i1 %f) local_unnamed_addr #0 {
entry:
    %val = load i32, ptr %p
    br i1 %f, label %exitbb1, label %exitbb2
exitbb1:
    call void @foo()
    ret void
exitbb2:
    call void @bar(i32 %val)
    ret void
}

define void @test2(ptr %p, i1 %f) local_unnamed_addr #0 {
entry:
    %val = load i32, ptr %p, !invariant.load !0
    call void @baz()
    br i1 %f, label %exitbb1, label %exitbb2
exitbb1:
    call void @foo()
    ret void
exitbb2:
    call void @bar(i32 %val)
    ret void
}

define void @test3(ptr %old_p, i1 %f) local_unnamed_addr #0 {
entry:
    %p = call ptr @llvm.invariant.start.p0(i64 4, ptr %old_p)
    %val = load i32, ptr %old_p ; %p is only the descriptor consumed by invariant.end
    call void @baz()
    br i1 %f, label %exitbb1, label %exitbb2
exitbb1:
    call void @foo()
    ret void
exitbb2:
    call void @bar(i32 %val)
    call void @llvm.invariant.end.p0(ptr %p, i64 4, ptr %old_p)
    ret void
}

!0 = !{}

danilaml · June 24, 2024, 5:25pm

Ping. I know there are SinkingPass (Sink.cpp) and GVNSink, neither of which appears to be used anywhere (GVNSink is disabled by default, IIRC due to soundness issues), but they don’t optimize the examples above either.

artagnon · October 17, 2024, 9:21am

I see a test/Transforms/Sink/invariant-load.ll, which has a test case, although I’m not entirely sure how it’s different from your test cases. A quick find-references pulls up the AMDGPU backend, and the sink pass seems to be used there. Perhaps cc @nikic and @arsenm for more information?

nikic · October 17, 2024, 9:48am

For “sinking”, there are two cases to distinguish:

1. Sinking common instructions in predecessors into the successors. In the default pipeline this is done by SimplifyCFG. Outside the default pipeline GVNSink does this. This is not the case you are interested in.
2. Sinking instructions to their use in a successor block. In the default pipeline this is done by InstCombine. Outside the default pipeline Sink does this.

The short-term solution to sinking problems is to extend sinking in InstCombine – but there is a sharp limit to the amount of extension we’re going to accept.

For example, adding support for !invariant.load should be easy and not cause any additional overhead; you basically just need an extra check in llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (llvm/llvm-project at commit 9b713f5d234adec266d46c9cfc3f2607793976dc). Feel free to implement that.
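To illustrate the kind of extra check meant here, the following is a minimal, self-contained toy model, not InstCombine's actual code. `ToyInst` and `canSinkLoadPast` are hypothetical names invented for this sketch; they do not exist in LLVM, and the model ignores other legality conditions (volatile or atomic accesses, for instance).

```cpp
#include <vector>

// Toy stand-in for an IR instruction. This is NOT llvm::Instruction;
// every name here is a hypothetical used only for illustration.
struct ToyInst {
  bool isLoad = false;
  bool hasInvariantLoadMD = false; // models the !invariant.load metadata
  bool mayWriteMemory = false;     // e.g. a call that is not readonly
};

// Returns true if `load` may be moved below every instruction in `between`,
// i.e. the instructions that separate the load from its use.
bool canSinkLoadPast(const ToyInst &load, const std::vector<ToyInst> &between) {
  if (!load.isLoad)
    return false;
  // The extra check: the location read by an !invariant.load is assumed
  // never to change, so intervening writes cannot alter the loaded value.
  if (load.hasInvariantLoadMD)
    return true;
  // Without the metadata, be conservative: any possible write blocks sinking.
  for (const ToyInst &inst : between)
    if (inst.mayWriteMemory)
      return false;
  return true;
}
```

In terms of the first post, the call to @baz in test2 corresponds to an entry in `between` with `mayWriteMemory` set: the model refuses to sink a plain load past it but allows it for a load carrying the metadata.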

Longer term, we should consider enabling the Sink pass or something like it. I’m not familiar with the history here of why we ended up with sinking in InstCombine. If more sophisticated analysis (like AA) is needed to perform sinking, a separate pass is a better bet.

Regarding invariant loads, we have three mechanisms to encode these.

- !invariant.load is a hard global requirement, and very easy to support.
- invariant.start/invariant.end are architecturally very hard to support. It’s not really possible to efficiently query these. We have little to no optimizations for them – I think the only thing these are used for is an open-ended invariant.start when constructing global constants. I’d consider phasing these out (at least the invariant.end part).
- !invariant.group and related intrinsics. I believe this is the preferred mechanism for non-global invariance. But I think the design is more targeted at CSE, I’m not sure it’s really suitable for sinking purposes.

danilaml · October 17, 2024, 3:52pm

It’s true that it shouldn’t be hard to support !invariant.load; however, for my use case I needed a finer-grained solution. I need to sink values that are “mostly invariant”, i.e. they can’t be changed by IR directly, but certain calls and intrinsics can “clobber” them. Since I didn’t find a way to encode this property in LLVM IR that was also supported by existing passes (as I mentioned in the first post, load sinking appears to be rather basic, and is either disabled or doesn’t really take advantage of the invariant properties), I ended up going with a custom downstream pass for these specific purposes, plus adjustments to AA. It would be nice if this could be encoded upstream. I’d imagine something like C++ coroutines might take advantage of such info for TLS variables that behave similarly.
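As a rough illustration of the “mostly invariant” idea, here is a small self-contained toy model. It is not the downstream pass itself, whose details are not given in this thread. `ToyCall`, `canSinkMostlyInvariantLoad`, and the notion of a fixed set of clobbering callees are all assumptions made up for this sketch.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Toy stand-in for a call instruction, identified only by its callee name.
// Hypothetical type for illustration; not part of LLVM.
struct ToyCall {
  std::string callee;
};

// Returns true if a "mostly invariant" load may be sunk below all calls in
// `between`. Ordinary calls are assumed not to affect the loaded location;
// only callees listed in `clobberingCallees` (an assumed, externally
// supplied set) block the move.
bool canSinkMostlyInvariantLoad(
    const std::vector<ToyCall> &between,
    const std::unordered_set<std::string> &clobberingCallees) {
  for (const ToyCall &call : between)
    if (clobberingCallees.count(call.callee) != 0)
      return false; // a known clobber sits between the load and its use
  return true;
}
```

The difference from plain !invariant.load is that the answer depends on which calls lie on the path, which is why this needs alias-analysis-style reasoning rather than a single metadata check.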
