Appreciate all of the quick responses to my ridiculous questions so far. Hoping this one attracts similarly good discussion!
Let’s say I have the following series of instructions:
%a = load i32, i32* %ptr1
%b = load i32, i32* %ptr2
%c = add i32 %a, %b
store i32 %c, i32* %ptr3
This gets compiled (roughly) to
mov eax, dword ptr [rsp - 4]
add eax, dword ptr [rsp - 8]
mov dword ptr [rsp - 12], eax
In an opt pass, I would like to replace this series of four instructions with a single intrinsic, @llvm.cache.add, which will represent an add performed in the cache:
call void @llvm.cache.add(i32* %ptr3, i32* %ptr1, i32* %ptr2)
This intrinsic will compile (preferably) to a single instruction:
cache_add dword [rsp - 12], dword [rsp - 4], dword [rsp - 8]
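To make the intended rewrite concrete, here is a minimal C++ sketch of the matching step on a toy instruction list. It does not use the real LLVM API: the `Inst` struct and the `countUses` and `foldCacheAdd` helpers are hypothetical names made up for illustration. It assumes the four instructions are adjacent and that each temporary (%a, %b, %c) has exactly one use; a real pass would also have to rule out aliasing stores between the loads and the final store, and would probably want to accept the add operands in either order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an IR instruction; this is NOT llvm::Instruction.
struct Inst {
    std::string op;                 // "load", "add", "store", or "cache.add"
    std::string dst;                // result name; empty for store / cache.add
    std::vector<std::string> args;  // operand names, in IR order
};

// Count how many times `name` appears as an operand anywhere in the block.
static int countUses(const std::vector<Inst>& block, const std::string& name) {
    int n = 0;
    for (const Inst& inst : block)
        for (const std::string& arg : inst.args)
            if (arg == name) ++n;
    return n;
}

// Hypothetical helper: replace each adjacent load/load/add/store group with a
// single cache.add whose operands are (destination ptr, source ptr 1, source ptr 2).
std::vector<Inst> foldCacheAdd(const std::vector<Inst>& block) {
    std::vector<Inst> out;
    std::size_t i = 0;
    while (i < block.size()) {
        bool matched = false;
        if (i + 3 < block.size()) {
            const Inst& l1 = block[i];
            const Inst& l2 = block[i + 1];
            const Inst& add = block[i + 2];
            const Inst& st = block[i + 3];
            // Check opcodes and operand counts first so the indexing below is safe.
            bool shape = l1.op == "load" && l1.args.size() == 1 &&
                         l2.op == "load" && l2.args.size() == 1 &&
                         add.op == "add" && add.args.size() == 2 &&
                         st.op == "store" && st.args.size() == 2;
            // The add must consume both loads, the store must consume the add,
            // and none of the temporaries may be used anywhere else.
            if (shape && add.args[0] == l1.dst && add.args[1] == l2.dst &&
                st.args[0] == add.dst &&
                countUses(block, l1.dst) == 1 &&
                countUses(block, l2.dst) == 1 &&
                countUses(block, add.dst) == 1) {
                out.push_back({"cache.add", "", {st.args[1], l1.args[0], l2.args[0]}});
                i += 4;
                matched = true;
            }
        }
        if (!matched) {
            out.push_back(block[i]);  // keep unmatched instructions unchanged
            ++i;
        }
    }
    assert(out.size() <= block.size());  // folding never grows the block
    return out;
}
```

Running `foldCacheAdd` on the four-instruction example above yields a single `cache.add` entry with operands %ptr3, %ptr1, %ptr2. If %a had a second user, the group would be left untouched, since deleting the load would break that user.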
When it comes time to generate code for this intrinsic, I am faced with a predicament:
1. I need to make sure %ptr3, %ptr1, and %ptr2 are allocated in register(s) (rsp in the example above), and
2. I need to know which registers they are allocated in, as this will be needed to actually generate the instruction.
Unfortunately, I am stuck on both of these. I have started reading up on the code generation process, but I am a little lost in the weeds. I found a page of documentation that seems important (https://llvm.org/docs/CodeGenerator.html#mapping-virtual-registers-to-physical-registers), but it is a little above my current knowledge. So I have some questions:
1. To solve problem #1, I was planning on simply inserting the call @llvm.cache.add instruction after the store instruction in the initial grouping of four instructions. My intent is to ensure that the argument of the store (%ptr3) is defined before its use by call @llvm.cache.add. Is this safe?
2. Is the information in problem #2 readily accessible? If I have a value i32* %ptr1, is there some way to map that to the expression [rsp - 4]? Presumably, this is exactly what happens during code generation for the load and store instructions. Is there an easy way to imitate what they do?
Thanks for all the help so far!