Is there a way to get an instruction’s offset at compile time with LLVM for ARM?
I am trying to create a map between instructions at compile time and run-time information. Since the PC is a relative value, I am trying to use each instruction’s offset as a constant property of the instruction to build this map. I think the offset information must be available in order to create the executable; if so, where can I find it?
Thank you for your help,
I don’t know if my question was too easy or just not clear. I still need an answer and would appreciate any help. When LLVM creates assembly code, it must somehow compute each instruction’s program counter (offset), right? How can I get this information?
Debug metadata is generally the only way to link a byte of machine code back to the original source location, so that tools which understand the source language can help you see how your program runs. But even that is inaccurate.
There is no other metadata linking assembly instructions back to any Value* in the intermediate representation. The goal of every optimisation is to make the whole program faster, not to keep track of how we got here.
There is often a huge difference between how you imagine something works, and how it actually works.
If you want a better answer, try to explain what you want to do and why. We aren’t mind readers, and we can’t piece together what you want from a description of how you imagine something works.
Your question doesn't really make clear what kind of offset you need, or exactly what you're planning to do with it.
In assembly, if you have two labels in the same section, you can write the offset between them using subtraction, e.g. ".L3 - .L1". The assembler will resolve this to an actual number.
LLVM creates something like assembly language (or even real assembly language) and doesn’t know the exact offset of an instruction within the function (or section).
The assembler might create different-sized instructions, or even multiple instructions, from something that LLVM just thinks of as “an instruction”. This can happen because different instruction encodings, or even whole instruction sequences, are needed depending on the size of immediate values or branch offsets (which are often not even known until link time). Even something such as whether a REX prefix is needed on AMD64 depends on whether a register mentioned in the instruction is one of the high 8 registers. LLVM could pay attention to that, but I suspect it doesn’t.
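To make the point about encoding sizes concrete, here is a toy sketch in Python. It is not how LLVM or any assembler actually works; the `layout` helper and the instruction lists are hypothetical. It only assumes the well-known fact that Thumb-2 instructions come in 2-byte and 4-byte encodings, and shows that widening one branch shifts the offset of everything after it.

```python
def layout(instructions):
    """Assign a byte offset to each (name, size_in_bytes) pair, in order."""
    offset = 0
    placed = []
    for name, size in instructions:
        placed.append((offset, name))
        offset += size  # the next instruction starts right after this one
    return placed


# Hypothetical Thumb-2 sequences: identical except for the branch encoding.
narrow = [("cmp", 2), ("b.n target", 2), ("mov", 2)]  # short-range branch
wide = [("cmp", 2), ("b.w target", 4), ("mov", 2)]    # long-range branch

for label, seq in (("narrow", narrow), ("wide", wide)):
    print(label, layout(seq))
```

The offset of `mov` differs between the two layouts even though the instruction list looks the same at the level of "one branch, one move", which is why the offset cannot be fixed before the encodings are chosen.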
On some platforms, instructions within a function can even be added or deleted by the linker.
Your best bet may be to take the final linked program and run objdump on it.
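As a rough sketch of what you could do with that output, the Python below turns a disassembly listing into (function, offset-within-function, instruction) rows. It assumes the tab-separated layout that GNU `objdump -d` typically prints; other tools or versions may format lines differently. The `SAMPLE` listing and the `instruction_offsets` helper are made up for illustration and are not taken from any real program.

```python
import re

# Illustrative listing in the style of `objdump -d` for ARM (not real output).
SAMPLE = (
    "00010400 <main>:\n"
    "   10400:\te92d4800 \tpush\t{fp, lr}\n"
    "   10404:\te28db004 \tadd\tfp, sp, #4\n"
    "   10408:\te3a00000 \tmov\tr0, #0\n"
    "   1040c:\te8bd8800 \tpop\t{fp, pc}\n"
)

# Matches symbol header lines such as "00010400 <main>:".
SYMBOL_RE = re.compile(r"^([0-9a-f]+) <(.+)>:$")


def instruction_offsets(listing):
    """Return (function, offset_in_function, text) for each instruction line."""
    rows = []
    func, base = None, 0
    for line in listing.splitlines():
        m = SYMBOL_RE.match(line)
        if m:
            base, func = int(m.group(1), 16), m.group(2)
            continue
        # Assumed instruction line shape: "<addr>:\t<bytes>\t<mnemonic>\t<operands>"
        parts = line.split("\t")
        if func is None or len(parts) < 3 or not parts[0].strip().endswith(":"):
            continue
        try:
            addr = int(parts[0].strip()[:-1], 16)
        except ValueError:
            continue  # not an address, skip the line
        text = " ".join(p.strip() for p in parts[2:])
        rows.append((func, addr - base, text))
    return rows


for row in instruction_offsets(SAMPLE):
    print(row)
```

For a real binary, the listing would come from running objdump with `-d` on the linked executable and passing its standard output to `instruction_offsets`.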
Objdump was exactly what I needed!
Thanks, everyone. Sorry if my question was not clear enough.