Getting basic block address offset from its parent function

Hi, all

Is there a way to get a basic block’s offset from the start of its parent function?

What I’m trying to do is get an execution count for each basic block, so I need to know the starting address of each one. Obviously we can’t get the absolute address before linking the program, but the offset relative to the parent function should be available, so I could take it, get the function’s start address from objdump, and then work out each basic block’s absolute address.

Or is there another way of doing this …

Any suggestion is very much appreciated
Patrick

At the LLVM IR level, no. At the code generator layer (MachineFunctionPass layer or the MC layer), probably yes.

One way to do it would be to instrument the program so that each basic block increments a counter every time it is executed. This would be trivial to do. To optimize it, you could analyze the CFG so that control equivalent basic blocks use a single counter (e.g., the single-entry block and the single-exit block are executed the same number of times, so they only need 1 counter).

Another option might be to use the pcmarker intrinsic. Apparently it’s used for matching up LLVM IR to machine instructions for use in processor simulators, though I have never used it myself.

The Giri project has support for recording the execution of every basic block, but it might be more heavy-weight than you need. I also don’t recall off-hand where to download it; search the llvmdev archives for emails from Swarup Sahoo to get that information.

Hope this helps,
John Criswell
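
To make the counter idea concrete, here is a minimal hand-written sketch of what an instrumented function could look like. It is not produced by any LLVM pass; the block numbering, the `g_counters` table, the `kBlockToCounter` map, and the `absValue` function are all hypothetical names chosen for illustration. The entry and exit blocks are treated as control equivalent, so they share one counter.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout: blocks 0 (entry) and 3 (exit) are control
// equivalent, so both map to counter slot 0.
constexpr std::size_t kNumBlocks = 4;
constexpr std::size_t kNumCounters = 3;
constexpr std::array<std::size_t, kNumBlocks> kBlockToCounter = {0, 1, 2, 0};

std::array<std::uint64_t, kNumCounters> g_counters{};

// Execution count of a basic block, read through the shared-counter map.
std::uint64_t blockCount(std::size_t block) {
  assert(block < kNumBlocks && "unknown basic block id");
  return g_counters[kBlockToCounter[block]];
}

// Hand-written stand-in for what an instrumented function could look like.
int absValue(int x) {
  ++g_counters[0];        // block 0: entry (also accounts for block 3)
  int r;
  if (x < 0) {
    ++g_counters[1];      // block 1: negative branch
    r = -x;
  } else {
    ++g_counters[2];      // block 2: non-negative branch
    r = x;
  }
  return r;               // block 3: exit, no counter of its own
}
```

Calling `absValue` on two negative and three non-negative inputs leaves `blockCount(0)` and `blockCount(3)` both at 5, with 2 and 3 for the two branches. Note that this approach yields per-block counts directly, without needing block addresses at all.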

Hi John

Thanks for your suggestions, they all sound reasonable to me. The way I’m thinking right now is to write a MachineFunctionPass that iterates through each MachineBasicBlock and, for each MBB, adds up the instruction counts of the previous MBBs; that number multiplied by 4 should be the offset of that MBB from the start of its MachineFunction. In order to count the instructions correctly, this pass should be inserted after the last transform pass …

Does this sound reasonable?

Thanks,
Patrick
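
The arithmetic in this proposal can be sketched outside LLVM as a running sum. This is only a model, assuming every instruction is exactly 4 bytes and that nothing else (padding, alignment, inline assembly, constant pools) sits between blocks; `BlockInfo` and `blockOffsets` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a MachineBasicBlock: only its instruction count.
struct BlockInfo {
  std::size_t numInstrs;
};

// Byte offset of each block from the start of its function, assuming every
// instruction occupies exactly instrBytes bytes and there is no padding.
std::vector<std::uint64_t> blockOffsets(const std::vector<BlockInfo> &blocks,
                                        std::uint64_t instrBytes = 4) {
  assert(instrBytes > 0 && "instruction size must be positive");
  std::vector<std::uint64_t> offsets;
  offsets.reserve(blocks.size());
  std::uint64_t running = 0;  // bytes occupied by all earlier blocks
  for (const BlockInfo &b : blocks) {
    offsets.push_back(running);
    running += b.numInstrs * instrBytes;
  }
  return offsets;
}
```

For blocks of 3, 5 and 2 instructions this gives offsets 0, 12 and 32. The result is only as good as the fixed-size assumption.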

I could be wrong, but I'm not sure that's possible - my understanding was
that the particular length of a sequence could depend on assembler-level
choices of instruction encoding & the like. I believe the right/only way to
do this is with label differences that the assembler will resolve/compute
for you. But I could quite well be wrong - it's certainly not my area of
expertise.

- David
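
As a rough illustration of the label-difference idea, the sketch below builds GNU-as style directives that ask the assembler to compute `label - function` for each block. It only generates text; it does not touch LLVM. The helper name `offsetTable` and the label names passed to it are hypothetical, and the sketch assumes the listed labels are actually defined in the same assembly file, in the same section as the function symbol.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Emits one ".long <label> - <function>" directive per basic-block label.
// The assembler, which knows the final encodings, resolves each difference.
// Assumption: the output is appended to the end of the assembly file that
// defines both the function symbol and every label in blockLabels.
std::string offsetTable(const std::string &funcName,
                        const std::vector<std::string> &blockLabels) {
  assert(!funcName.empty() && "function symbol name required");
  std::ostringstream out;
  out << "\t.data\n";  // keep the table out of the code section
  for (const std::string &label : blockLabels)
    out << "\t.long\t" << label << " - " << funcName << '\n';
  return out.str();
}
```

For example, `offsetTable("foo", {".LBB0_1", ".LBB0_2"})` produces a `.data` directive followed by two `.long` lines, one per label. The pass itself never has to guess instruction sizes.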

It depends on the specific target. What’s being described is basically what the ARM constant islands pass does. With the rather large caveat that there will always be some conservative assumptions baked in (inline assembly is “awesome”).

-Jim

Ah, fair enough - I was vaguely wondering how the constant island pass
dealt with it. I guess there's less encoding length ambiguity in a RISC
architecture like ARM which makes this a bit more viable.

Yep, exactly. Though the remaining small amounts of ambiguity are sufficient to make that pass more than a little tricky (understatement of the year). When at all possible, it’s far better to leave such things to the MC layer to figure out. It’s only when there really need to be significant optimizations done which depend on the layout that it’s worth the hassle.

-Jim