How to insert instructions before each function call?

I’m trying to insert some instructions before each function call (before the arguments are pushed):

lea %EAX, label ----- new instructions
mov [ESP+stacksize], %EAX ----- new instructions
push arg1
push arg2
...
push argn
call callee_name

I am a newbie to LLVM. I tried to use BuildMI() to insert the instructions in the LowerCall() function, but I couldn’t put them in the right position. Is there a way to locate the position by using a MachineBasicBlock iterator?

Any suggestions are appreciated.

Thanks!

Shucai

Take a look at IRBuilder and SetInsertPoint().

So one way might look like this:

IRBuilder<> Builder(&*BB); // BB is a Function::iterator; alternatively: IRBuilder<> Builder(CallInst->getParent());
Builder.SetInsertPoint(CallInst); // new instructions are inserted before CallInst
InstructionClass *YourNewInstruction = Builder.CreateInstructionClass(…); // InstructionClass = type of instruction you are inserting

-Ryan

I’m not sure how the IRBuilder would work at the MI level, as Shucai was asking.

Can you describe more precisely what you are trying to achieve?
I.e., what are these instructions, and why do you want to insert them? It may lead to a different answer.

Mehdi,

Sorry, I misread his original post.

So something like:

XXXInstrInfo *XII; // target instruction info for your target (XXX)
MachineBasicBlock::iterator MI = MachineBasicBlock::iterator(YourCallInst); // iterator pointing at your call inst
MachineBasicBlock *MBB = YourCallInst->getParent(); // basic block containing your call inst
BuildMI(*MBB, MI, DebugLoc(), XII->get(XXX::INSTRUCTION)…); // the new instruction is inserted before MI

The BuildMI params are going to depend on what you want to do with the instruction being inserted.
http://llvm.org/docs/doxygen/html/MachineInstrBuilder_8h.html

-Ryan

Hi Ryan,

I need to add two instructions for each function call. Do you mean I
should add this snippet in the LowerCall function, or should I add a new
pass?

Thanks!

Mehdi, to answer your question: I'm trying to implement something similar
to the segmented stack mechanism using LLVM. Instead of inserting comparison
code in the function prologue, I would like to do the probe before the
arguments are pushed. A guard page is appended to each stack segment. If the
probe instruction touches this guard page, the addmorestack function is
called. Otherwise, the probe only stores the return address at the bottom of
the callee's stack frame.

To achieve this, two instructions need to be inserted for each function
call:
     LEA %EAX, callee_return_label
     MOV [ESP- callee_stack_frame_size - arguments_size], %EAX
     PUSH argn
     ...
     PUSH arg1
     JMP callee_name
callee_return_label:
     ...

So I need to insert two instructions (LEA and MOV) before each function
call. I don't know when or how to insert these two instructions.

Thanks!
Shucai

Personally, I would add a new pass that iterates over the instructions, looks for the calls you want, and then inserts the new instructions before them.
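As a rough illustration of that idea, here is a minimal, self-contained C++ model. It deliberately does not use the LLVM API: `Op`, `Inst`, `Block`, and `instrumentCalls` are hypothetical stand-ins for machine instructions, a basic block, and the pass body. It also assumes each call sequence is opened by a marker instruction (`Op::FrameSetup` here), so the scan can walk back from the call and insert ahead of the argument pushes. Whether your target exposes such a marker at the point where your pass runs is something to verify.

```cpp
#include <list>
#include <string>

// Hypothetical stand-ins for machine instructions; these are not LLVM types.
enum class Op { FrameSetup, Push, Call, Lea, Mov, Other };

struct Inst {
  Op op;
  std::string text;
};

using Block = std::list<Inst>; // models a basic block as an instruction list

// Inserts a LEA/MOV pair ahead of the argument setup of every call in the
// block and returns the number of calls that were instrumented.
int instrumentCalls(Block &block) {
  int count = 0;
  for (auto it = block.begin(); it != block.end(); ++it) {
    if (it->op != Op::Call)
      continue;

    // Default insertion point: directly before the call itself.
    auto pos = it;
    // Walk backwards looking for the marker that opens this call sequence.
    for (auto scan = it; scan != block.begin();) {
      --scan;
      if (scan->op == Op::Call)
        break; // reached the previous call; this call has no marker
      if (scan->op == Op::FrameSetup) {
        pos = scan; // insert before the whole call sequence
        break;
      }
    }

    // std::list::insert places the new element before 'pos' and keeps 'it'
    // valid, so the loop continues after the call we just handled.
    block.insert(pos, Inst{Op::Lea, "lea %EAX, label"});
    block.insert(pos, Inst{Op::Mov, "mov [ESP+stacksize], %EAX"});
    ++count;
  }
  return count;
}
```

The point of the model is only the iterator handling: inserting before an iterator places the new instructions ahead of it, so the position you pass to the builder determines where the code lands.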

We do something very similar here for XRay, and I would think the approach would be similar. What XRay does is the following:

- Find the machine instructions in a MachineFunctionPass that look interesting from the XRay perspective. These turn out to be: the beginning of the function (not really an instruction but a location), tail calls, and returns. I suspect you can very simply find the call instructions for the platform you're interested in and insert/wrap them in a pseudo instruction.
- When lowering, emit the actual assembly sequence that you want.

For your use-case though I think you may need to hook into function call lowering so you can insert your instruction sequence before stack adjustments are performed (if you want to insert your intercepts before any stack operations as opposed to just before actually calling the function).

Hope this helps.

-- Dean

Hi Dean,

Thank you very much!

For the function call lowering, do you mean the LowerCall function? I did
insert the instructions before the stack adjustments, but the inserted code
appears in the prologue of the function rather than before the function
call. Maybe I did something wrong with the iterator.

So you mean I should insert a pseudo instruction in a MachineFunctionPass,
then replace it during function call lowering? (Like the segmented stack
implementation?)

Thanks!
Shucai

Yes, inserting pseudo instructions in the MachineFunctionPass -- you might want to have a look at PATCHABLE_RET and how we handle this in XRay. Essentially the idea (which I saw Sanjoy Das do first) is to wrap the actual instruction (in this case, CALL or LEA, or something specific in the platform you're targeting) in a pseudo instruction that just lowers to the correct sequence. This gives you complete control of the actual assembly of the instructions that you're replacing.
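To make the two stages concrete, here is a small self-contained C++ model of the wrap-then-lower idea. It is only a sketch and does not use the LLVM or XRay code: `Opcode`, `MInst`, `wrapCalls`, `lower`, and the `ProbedCall` pseudo are hypothetical names, and the emitted strings simply reuse the LEA/MOV sequence from the original question. Note that lowering at the call site like this places the sequence directly before the call, not before the argument pushes.

```cpp
#include <string>
#include <vector>

// Hypothetical opcodes; ProbedCall plays the role of the pseudo instruction.
enum class Opcode { Call, ProbedCall, Other };

struct MInst {
  Opcode op;
  std::string operand; // callee name for calls, raw text otherwise
};

// Stage 1 (models the MachineFunctionPass): wrap every real call in the pseudo.
void wrapCalls(std::vector<MInst> &insts) {
  for (MInst &mi : insts)
    if (mi.op == Opcode::Call)
      mi.op = Opcode::ProbedCall;
}

// Stage 2 (models lowering): expand the pseudo into the final sequence.
std::vector<std::string> lower(const std::vector<MInst> &insts) {
  std::vector<std::string> out;
  for (const MInst &mi : insts) {
    switch (mi.op) {
    case Opcode::ProbedCall:
      // The pseudo owns the whole sequence, so its layout is fully controlled.
      out.push_back("lea %EAX, label");
      out.push_back("mov [ESP+stacksize], %EAX");
      out.push_back("call " + mi.operand);
      break;
    case Opcode::Call:
      out.push_back("call " + mi.operand);
      break;
    case Opcode::Other:
      out.push_back(mi.operand);
      break;
    }
  }
  return out;
}
```

Keeping the pseudo opaque until lowering means earlier passes see a single instruction and cannot reorder or separate the pieces of the sequence.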

-- Dean

On Monday, September 5, 2016, Dean Michael Berris <dean.berris@gmail.com> wrote:

Yes, inserting pseudo instructions in the MachineFunctionPass – you might want to have a look at PATCHABLE_RET and how we handle this in XRay. Essentially the idea (which I saw Sanjoy Das do first) is to wrap the actual instruction (in this case, CALL or LEA, or something specific in the platform you’re targeting) in a pseudo instruction that just lowers to the correct sequence. This gives you complete control of the actual assembly of the instructions that you’re replacing.

Hi Dean,

Do you have an example of this? I would like to have a look at how you handle this in XRay, if possible.

Thank you very much!
Shucai

Yes, this is all upstream -- if you look in lib/CodeGen/XRayInstrument.cpp and the associated lowering code for X86 in lib/Target/X86/X86MCInstLower.cpp and/or search for PATCHABLE_RET in include/... and lib/... then that should give you a better idea of how this works. :slight_smile:

Cheers

-- Dean

Hi Dean,

Thank you very much. I solved my problem by using a similar structure.

BTW, I saw there is a hacky way to do the relative jump in the LowerPATCHABLE_FUNCTION_ENTER function. I think a directional local symbol could implement it too:

  1. t = getDirectionalLocalSymbol first;
  2. jmp t
  3. then emit nops
  4. createDirectionalLocalSymbol(t)
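For readers unfamiliar with directional local labels, the sketch below models the four steps as plain assembly text, using the GNU assembler's numeric local label syntax (`1:` defines the label, `1f` refers to the next definition of it). It does not call the MC layer at all, and `emitSled` is a hypothetical helper name used only for illustration.

```cpp
#include <string>
#include <vector>

// Emits "jmp Nf", then nopCount nops, then the "N:" label that the jump
// targets. N is a numeric local label in GNU assembler syntax.
std::vector<std::string> emitSled(unsigned label, unsigned nopCount) {
  std::vector<std::string> out;
  const std::string name = std::to_string(label);
  out.push_back("jmp " + name + "f"); // forward reference to the next "N:"
  for (unsigned i = 0; i < nopCount; ++i)
    out.push_back("nop");
  out.push_back(name + ":"); // defining the label resolves the jump above
  return out;
}
```

Because the jump names the label rather than a byte offset, the assembler computes the displacement and the jump stays correct if the number of nops changes.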

Hope it can help.

Thank you!
Shucai

Awesome -- happy to help!

Interesting. Thanks, I'll try that the next time I'm in that part of the woods. :slight_smile:

Cheers

-- Dean