Credit note: several people from the SPIR-V backend meeting provided input for this RFC, including @michalpaszkowski and @MrSidims.
Motivation
While the ptradd migration simplifies pointer arithmetic and enables optimizations by explicitly materializing offset calculations, it discards the high-level, type-based path information inherent to the GEP instruction. For targets like SPIR-V, this causes two challenges:
- SPIR-V requires a structured path to generate instructions like OpAccessChain; this information is critical and is lost with ptradd.
- The memory layout of some objects is unknown at compile time, meaning that using ptradd or a byte offset is not viable.
Without the context of how an address is derived through nested structures and arrays, the backend can no longer reliably reconstruct the frontend’s original intent, posing a major correctness and functionality problem.
For a more detailed description of the issue, see this comment. The root of the problem is that some linearized access expressions cannot be lifted back to a structured access, because multiple valid but mutually incompatible solutions exist.
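
To make the ambiguity concrete, here is a minimal sketch of one simple way it can arise (not necessarily the exact case discussed in the linked comment). It uses a toy layout model, which is an assumption made for illustration: 4-byte i32, no padding, and hypothetical helper names `size_of` and `paths_at`. Given only a byte offset, it enumerates every structured index path that starts at that offset:

```python
# Toy layout model (an assumption for illustration, not LLVM's DataLayout):
# a type is "i32", ("struct", [field types]) or ("array", element type, count).

def size_of(ty):
    """Size in bytes, assuming a 4-byte i32 and no padding."""
    if ty == "i32":
        return 4
    if ty[0] == "struct":
        return sum(size_of(f) for f in ty[1])
    if ty[0] == "array":
        return size_of(ty[1]) * ty[2]
    raise ValueError(f"unknown type: {ty!r}")


def paths_at(ty, offset, prefix=()):
    """Return every in-bounds index path whose sub-object starts at `offset`."""
    found = [prefix] if offset == 0 else []
    if ty == "i32":
        return found
    if ty[0] == "struct":
        start = 0
        for i, field in enumerate(ty[1]):
            size = size_of(field)
            if start <= offset < start + size:
                found += paths_at(field, offset - start, prefix + (i,))
            start += size
    elif ty[0] == "array":
        elem_size = size_of(ty[1])
        index, rest = divmod(offset, elem_size)
        if 0 <= index < ty[2]:
            found += paths_at(ty[1], rest, prefix + (index,))
    return found


# struct Inner { i32 x; i32 y; };  struct Outer { Inner inner; i32 z[2]; };
INNER = ("struct", ["i32", "i32"])
OUTER = ("struct", [INNER, ("array", "i32", 2)])

print(paths_at(OUTER, 0))  # [(), (0,), (0, 0)]
print(paths_at(OUTER, 8))  # [(1,), (1, 0)]
```

Offset 0 alone is compatible with a pointer to the whole object, to its first field, or to that field's first member, and each choice yields a different pointee type. Once the offset also contains dynamic terms, the backend has even less to go on.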
The only robust way to solve this problem is to get additional information from the frontend. At a high level, there are two main ways we could add this information:
- replacing the ptradd/gep instruction with an instruction carrying structure information.
- adding sideband information (metadata, interleaved instructions, …)
Proposed solution
We propose that Clang emit BPF’s llvm.preserve.*.access.index intrinsics instead of ptradd/GEP when targeting a backend requiring structured access (such as SPIR-V). As of today, only the HLSL frontend would require this change.
In order to represent array accesses with dynamic indices, we would need to relax the llvm.preserve.array.access.index intrinsic so that it accepts non-constant indices.
With those two changes, all accesses into nested types would be structured, and we would be able to generate valid OpAccessChain instructions from LLVM IR.
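
As a rough sketch of why a chain of structured accesses is enough, the toy model below collapses nested access steps into a root pointer plus an ordered index list, which is the shape of operands OpAccessChain takes (a base followed by indexes). The `Access` class and `to_access_chain` function are hypothetical names for illustration only; they are not part of LLVM or of the SPIR-V backend.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Access:
    """One structured access step: select `index` inside the object at `base`.

    Hypothetical stand-in for a preserve-access-index style call; `index` is
    an int for a constant or a str naming a dynamic SSA value.
    """
    base: Union["Access", str]
    index: Union[int, str]


def to_access_chain(ptr):
    """Collapse nested Access steps into (root pointer, list of indices)."""
    indices = []
    while isinstance(ptr, Access):
        indices.append(ptr.index)
        ptr = ptr.base
    indices.reverse()  # steps were collected last-to-first
    return ptr, indices


# Models an access like base.field1[i].field0
chain = Access(Access(Access("%base", 1), "%i"), 0)
print(to_access_chain(chain))  # ('%base', [1, '%i', 0])
```

Because every step names its own index, no layout information or offset arithmetic is needed to recover the path, and a dynamic index such as `%i` is carried through unchanged.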
The main drawback is the loss of optimizations: since GEPs are not used, we won’t benefit from existing GEP optimizations. This is an issue not only for performance but also for legalization, for example when removing local copies of non-tangible types.
Most passes, such as inlining, DCE, or sinking, should not be affected by these intrinsics, but we would need to add support for them to at least the SROA pass.
Alternatives
Add new target-specific intrinsics interleaved with GEP instructions
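%ptr = getelementptr i8, ptr %base_ptr, i32 %offset
; llvm.spv.type.access is hypothetical: the name and signature are illustrative only.
%tmp = call ptr @llvm.spv.type.access(ptr %ptr, ptr %base_ptr, %TheType poison, i32 %index)
%val = load i32, ptr %tmp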
This would be similar to the proposed solution, except we’d use a target-specific intrinsic.
By not reusing the existing instructions, we’ll have to extensively modify Clang’s codegen to emit those intrinsics in place of GEP instructions when targeting SPIR-V. I see no benefit, except if the community wants us to stay away from the existing instructions.
Annotate GEP instructions with metadata.
- This is a no-go.
- passes are allowed to strip metadata they don’t know about.
- passes rewriting GEPs would have to know about this metadata to merge or update it properly, and existing passes do not.
Add a new target-specific intrinsic, interleaved but without return value.
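%ptr = getelementptr i8, ptr %base_ptr, i32 %offset
; Hypothetical annotation-only intrinsic: it returns nothing, and the load uses %ptr directly.
call void @llvm.spv.type.access(ptr %ptr, ptr %base_ptr, %TheType poison, i32 %index)
%val = load i32, ptr %ptr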
I think this has the same issues as the metadata: passes will have no knowledge of the semantics carried by this intrinsic, and could replace the GEP used by the load without updating the annotation.