Compiling load/store IR instructions to named register operations

Say I am compiling a function in which a certain pointer argument p is known to point to a contiguous region of memory, say 512 bytes wide. Moreover, my (RISC-like) hardware is guaranteed to have those bytes loaded into a special set of 8-byte registers, call them r1, …, r64. In the IR, p is treated as a pointer and dereferenced using GEP and load/store instructions. However, suppose that the author of the function is required to only ever reference offsets from p that are known at compile time, so that, for every IR load/store through some pointer derived from p, we can use the preceding GEP instructions to compute the exact form of the pointer as p + C for some known constant C. The goal is to transform every load/store on this memory region into a register operation on named registers, hence the restriction that p can never be dereferenced using an offset known only at run time. Assuming C is 8-byte aligned, a load from p + C should compile to a register move from rX into some temporary destination register rd, where X = C/8 + 1 (so that offset 0 maps to r1). (Perhaps the use of temporary registers could be eliminated, and r1-r64 used directly.)
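To make the offset arithmetic concrete, here is a minimal C++ sketch of the mapping from a constant byte offset C to a register, assuming the 1-based numbering r1..r64 described above (so offset 0 lands in r1). The names registerIndexForOffset and registerName are hypothetical and only illustrate the calculation; they are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Assumed layout from the post: 512 bytes backed by 64 registers of 8 bytes.
constexpr std::uint64_t kRegionBytes = 512;
constexpr std::uint64_t kRegBytes = 8;

// Map a compile-time byte offset C (relative to p) to a 1-based register
// index. Returns std::nullopt if C is misaligned or outside the region.
constexpr std::optional<unsigned> registerIndexForOffset(std::uint64_t offset) {
  if (offset % kRegBytes != 0) return std::nullopt;  // not 8-byte aligned
  if (offset >= kRegionBytes) return std::nullopt;   // past the 512 bytes
  return static_cast<unsigned>(offset / kRegBytes) + 1;
}

// Hypothetical register naming: "r1" .. "r64".
inline std::string registerName(unsigned index) {
  assert(index >= 1 && index <= kRegionBytes / kRegBytes);
  return "r" + std::to_string(index);
}
```

Under these assumptions, offset 0 maps to r1 and offset 504 maps to r64, while a misaligned offset such as 4 or an out-of-range offset such as 512 is rejected.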

My first impression is that this design is not in line with LLVM’s memory model, but I was hoping someone could provide advice on how to approach it using LLVM’s tools, or otherwise. I saw the warning at llvm.org/docs/ExtendingLLVM.html about checking in the forums before seriously considering adding even just a new intrinsic function, and so decided to make this post.

My first thought is to define a new intrinsic function: First, replace each static load/store to p with an intrinsic informing the backend which exact offset from p to use. Then, delete the unused GEP instructions preceding the deleted load/store.
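The two steps above can be sketched on a toy model, which may help show where the approach breaks down. This is plain C++, not the LLVM API: ToyPtr, RegionAccess, constantOffsetFromP and lowerAccess are all hypothetical stand-ins, with ToyPtr playing the role of a GEP result and RegionAccess playing the role of a call to the proposed intrinsic. Offsets are assumed to be in bytes, and the 512-byte region and 8-byte alignment are taken from the description above.

```cpp
#include <cstdint>
#include <optional>

// Toy stand-in for a pointer value in the IR (NOT the LLVM API).
struct ToyPtr {
  const ToyPtr* base = nullptr;            // nullptr: this is the argument p
  std::optional<std::int64_t> byteOffset;  // std::nullopt: run-time index
};

// Hypothetical record standing in for a call to the proposed intrinsic.
struct RegionAccess {
  bool isStore;
  std::uint64_t offset;  // the constant C in p + C
};

// Walk the GEP chain back to p, summing the constant byte offsets.
// Returns std::nullopt if any step is not known at compile time.
inline std::optional<std::int64_t> constantOffsetFromP(const ToyPtr& ptr) {
  std::int64_t total = 0;
  for (const ToyPtr* cur = &ptr; cur->base != nullptr; cur = cur->base) {
    if (!cur->byteOffset) return std::nullopt;
    total += *cur->byteOffset;
  }
  return total;
}

// Replace one load/store with the intrinsic-like record, or reject it when
// the offset is non-constant, misaligned, or outside the 512-byte region.
inline std::optional<RegionAccess> lowerAccess(const ToyPtr& ptr, bool isStore) {
  const auto c = constantOffsetFromP(ptr);
  if (!c || *c < 0 || *c >= 512 || *c % 8 != 0) return std::nullopt;
  return RegionAccess{isStore, static_cast<std::uint64_t>(*c)};
}
```

In this model a chain such as p + 16 followed by + 8 folds to the single constant 24, whereas any chain containing a run-time index cannot be lowered and would have to be reported as an error.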

Any advice or feedback would be greatly appreciated, as I am fresh out of a compilers course in which we only used a simplified LLVM IR and wrote all of our own basic infrastructure ourselves, never using LLVM’s.

Despite the warning, intrinsics are a fact of life for dealing with target-specific constructs that can’t be represented some other way. That seems fine. (See also the documentation for the “immarg” parameter attribute, which requires an intrinsic argument to be an immediate constant.)

Generating GEPs with a constant offset, then trying to remove them later, is dubious; LLVM optimizations are allowed to turn a constant GEP into a non-constant GEP. (Usually, it won’t because it isn’t profitable, but occasionally it will, and untangling it would be complicated.)

Thanks. I was hoping to use an existing frontend (such as clang++), but what you are suggesting is to modify the frontend so as not to generate GEPs into p in the first place?

Can clang turn a constant GEP into a non-constant one when optimizations are turned off? If not, then perhaps I could generate IR with a vanilla clang frontend at -O0, do the intrinsic substitution and GEP deletion under the assumption that any non-constant GEP is an illegal operation written by the function author and not the product of an LLVM optimization, and then safely optimize the resulting IR at -O3. No optimized GEP untangling necessary.

Edit: one downside to starting with -O0 is that the later optimization passes would have to work around a bunch of intrinsics inserted without any optimization. Perhaps there are specific optimizations which can be turned off to preserve constant GEPs, without having to go to -O0 entirely?

what you are suggesting is to modify the frontend so as not to generate GEPs into p in the first place?

If you were proposing to submit this to LLVM, I’d insist on that approach. (Making the frontend do it means the IR semantics are clearer, and the frontend can generate better diagnostics if the user writes code that’s impossible to lower.)

For a research project, a more hacky approach might be sufficient, of course.

Can clang turn a constant GEP into a non-constant when optimizations are turned off?

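clang depends on LLVM IR passes for that sort of optimization, so IR generated with “-Xclang -disable-llvm-optzns” should have the expected GEPs. (This is a clang frontend option, so it is passed with -Xclang rather than -mllvm; newer clang versions spell it -disable-llvm-passes.)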
