"Remember" an LLVM Instruction across IR serialization?

Hello,

Suppose I have an LLVM IR module M. Through some analysis I obtained an LLVM instruction of interest, say a pointer p to an llvm::Instruction.
If I want to serialize the module to disk, what is the best way to record p so that, when I load the module again later, I can still get a pointer to that specific instruction?

Thanks!

Would you elaborate on the ultimate goal of associating an IR instruction with a C++ pointer?

I ask because a C++ pointer points to an address in memory, so it’s not intuitive to me what it would mean to associate a pointer across serializations (of an IR instruction, or of any other serializable / de-serializable object).

Thanks for your reply! Say through some analysis I located the following instruction (represented as a C++ pointer llvm::Instruction* p):

.....
%1 = add i32 %2, %3  
.....

Of course after I save and reload the IR, the pointer p is invalidated. So how can I find that instruction in the reloaded module (without running the analysis again)? That’s all I want to do.

Thanks for the example! Understood the use case now.

AFAIK, analysis results are typically represented as metadata (see the LLVM Language Reference Manual, LLVM 16.0.0git documentation) attached to IR instructions. The metadata is serialized and de-serialized as part of the IR file, and transformation passes can read it to learn more about the instruction.
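
As a rough sketch of the metadata idea: the analysis could tag the instruction with a custom metadata kind before the module is written out. The kind name !analysis.mark, the string "instruction-of-interest", and the function @f below are made up for illustration; LLVM does not define any of them.

```llvm
; Hypothetical example: a custom metadata kind marks the instruction
; that the analysis picked out. The tag is written to disk with the module.
define i32 @f(i32 %a, i32 %b) {
entry:
  %sum = add i32 %a, %b, !analysis.mark !0
  ret i32 %sum
}

; The metadata node the tag points to; its contents are arbitrary.
!0 = !{!"instruction-of-interest"}
```

From C++, a tag like this can be attached with Instruction::setMetadata and looked up after reloading with Instruction::getMetadata, walking the module's instructions until the tag is found. That is still one scan, but a cheap one compared with rerunning the analysis. Keep in mind that passes are allowed to drop metadata they don't understand, so this is most dependable when the module is saved right after tagging.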

Metadata doesn’t solve the use case of instantly getting a handle (C++ pointer) to the instruction without another full scan. If getting the handle without a full scan is needed, I suppose some hooks (to expose the in-memory IR) would be needed at IR parsing time, but I don’t know if such hooks exist. Even if they do, in a real compilation pipeline the IR might be transformed in ways that eliminate instructions, which invalidates the C++ pointer with no good way of signaling that the elimination was intended. Still, I can imagine there are valid use cases where saving one full scan is desirable.

P.S. My knowledge of LLVM infrastructure is limited, and others may know better ways.

Thanks for the reply! That makes sense.