Finding which registers the operand of a load maps to

Appreciate all of the quick responses to my ridiculous questions so far. Hoping this one attracts similarly good discussion!

Let’s say I have the following series of instructions:

%a = load i32, i32* %ptr1
%b = load i32, i32* %ptr2
%c = add i32 %a, %b
store i32 %c, i32* %ptr3

This gets compiled (roughly) to

mov eax, dword ptr [rsp - 4]
add eax, dword ptr [rsp - 8]
mov dword ptr [rsp - 12], eax

In an opt pass, I would like to replace this series of four instructions with a single intrinsic, @llvm.cache.add, which will represent an add performed in the cache:

call void @llvm.cache.add(i32* %ptr3, i32* %ptr1, i32* %ptr2)
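
To make the rewrite concrete, here is a minimal, self-contained C++ sketch of the matching step I have in mind. It uses a toy instruction representation rather than LLVM's actual classes: `Inst`, `Op`, `CacheAdd`, `countUses`, `findDef`, and `matchCacheAdd` are all hypothetical names made up for illustration. It also assumes everything lives in one basic block and ignores aliasing, which a real pass would have to check (nothing between the loads and the store may write to the loaded addresses).

```cpp
#include <optional>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction; this is NOT llvm::Instruction.
enum class Op { Load, Add, Store };

struct Inst {
    Op op;
    std::string result;  // SSA name defined by the instruction ("" for a store)
    std::string lhs;     // Load: pointer; Add: first operand; Store: stored value
    std::string rhs;     // Load: unused (""); Add: second operand; Store: destination pointer
};

// The three pointers a fused cache add would operate on.
struct CacheAdd {
    std::string dst, src1, src2;
};

// Count how many times `name` appears as an operand in the block.
static int countUses(const std::vector<Inst>& block, const std::string& name) {
    int uses = 0;
    for (const Inst& i : block) {
        if (i.lhs == name) ++uses;
        if (i.rhs == name) ++uses;
    }
    return uses;
}

// Find the instruction that defines `name`, or nullptr if there is none.
static const Inst* findDef(const std::vector<Inst>& block, const std::string& name) {
    for (const Inst& i : block)
        if (!i.result.empty() && i.result == name) return &i;
    return nullptr;
}

// Starting from a store, walk back through add -> (load, load).
// Each intermediate value must have exactly one use, otherwise deleting
// the original instructions would break some other user.
std::optional<CacheAdd> matchCacheAdd(const std::vector<Inst>& block, const Inst& store) {
    if (store.op != Op::Store) return std::nullopt;

    const Inst* add = findDef(block, store.lhs);
    if (!add || add->op != Op::Add || countUses(block, add->result) != 1)
        return std::nullopt;

    const Inst* a = findDef(block, add->lhs);
    const Inst* b = findDef(block, add->rhs);
    if (!a || !b || a->op != Op::Load || b->op != Op::Load) return std::nullopt;
    if (countUses(block, a->result) != 1 || countUses(block, b->result) != 1)
        return std::nullopt;

    return CacheAdd{store.rhs, a->lhs, b->lhs};
}
```

On the four instructions above, this would hand back %ptr3, %ptr1, and %ptr2 as the destination and the two sources. In an actual pass I assume the same shape could be matched with the `m_Store`, `m_Add`, and `m_Load` matchers from LLVM's PatternMatch.h, but I have not tried that yet.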

This intrinsic will compile (preferably) to a single instruction:

cache_add dword ptr [rsp - 12], dword ptr [rsp - 4], dword ptr [rsp - 8]

When it comes time to generate code for this intrinsic, I am faced with a predicament:

  1. I need to make sure %ptr3, %ptr1, and %ptr2 are allocated in register(s) (rsp in the example above) and,

  2. I need to know which registers they are allocated in, as this will be needed to actually generate the instruction.

Unfortunately, I am stuck on both of these. I have begun researching the code generation process, but I am a little lost in the weeds. I found a page of documentation that seemed important; however, it’s a little above my current knowledge. Thus, I have some questions:

  3. To solve problem #1, I was planning on simply inserting the call @llvm.cache.add instruction after the store instruction in the initial grouping of four instructions. My intent is to ensure that the argument of the store (%ptr3) is defined before its use by call @llvm.cache.add. Is this safe?

  4. Is the information in problem #2 readily accessible? If I have a value i32* %ptr1, is there some way to map that to the expression [rsp - 4]? Presumably, this is exactly what happens during code generation for the load and store instructions – is there an easy way to imitate what they do?

Thanks for all the help so far!

Gus, PSU