mixing static/dynamic code

Hi,
I have the following code. The lines preceded by > were added at runtime (the snippet was also printed at runtime):

define i32 @myfunc(i32 %pi) {
entry:
%pi_addr = alloca i32 ; <i32*> [#uses=3]
%retval = alloca i32 ; <i32*> [#uses=2]
%tmp = alloca i32 ; <i32*> [#uses=2]

> %ptr32 = alloca i32 ; <i32*> [#uses=2]
%"alloca point" = bitcast i32 0 to i32 ; <i32> [#uses=0]

> store i32 %pi, i32* %ptr32
> %ptr8 = bitcast i32* %ptr32 to i8* ; <i8*> [#uses=1]
> call void @roc( i8* %ptr8 )

store i32 %pi, i32* %pi_addr
%pi_addr1 = bitcast i32* %pi_addr to i8* ; <i8*> [#uses=1]
call void @roc( i8* %pi_addr1 )

void roc(void*) is a function that only prints its argument as an integer.

I was expecting the program to print the same thing twice (the address of the myfunc argument), because the code added at runtime looks identical to the code that was compiled (3 lines each).
However, that was not the case; what I got was:

147156320
147117168

What am I missing?
Thanks,
Paul

Paul Martin wrote:

Hi,
I have the following code, the lines preceded by `>` being added at runtime (the snippet was also printed at runtime)

define i32 @myfunc(i32 %pi) {
entry:
        %pi_addr = alloca i32 ; <i32*> [#uses=3]
        %retval = alloca i32 ; <i32*> [#uses=2]
        %tmp = alloca i32 ; <i32*> [#uses=2]
> %ptr32 = alloca i32 ; <i32*> [#uses=2]
        %"alloca point" = bitcast i32 0 to i32 ; <i32> [#uses=0]

> store i32 %pi, i32* %ptr32
> %ptr8 = bitcast i32* %ptr32 to i8* ; <i8*> [#uses=1]
> call void @roc( i8* %ptr8 )

        store i32 %pi, i32* %pi_addr
        %pi_addr1 = bitcast i32* %pi_addr to i8* ; <i8*> [#uses=1]
        call void @roc( i8* %pi_addr1 )
...

void roc(void*) is a function that only prints its argument as an integer.

So it's printing the pointer.

I was expecting the program to print the same thing twice (the address of the myfunc argument), because the code added at runtime looks identical to the code that was compiled (3 lines each).
However, that was not the case; what I got was:

147156320
147117168

What am I missing?

Make roc() dereference the pointers it's given and print that instead.

Nick

Nick,
Thanks for the quick answer.
Dereferencing the pointer does yield the same result in both cases, but that's not what I want to do. I want to instrument the program dynamically and keep track of certain memory locations, which is a problem if the same variable has different addresses in the static and dynamic code. As far as I can see, this is what is happening, but I have no clue why.

Paul

Paul Martin wrote:

Nick,
Thanks for the quick answer.
Dereferencing the pointer does yield the same result in both cases, but that's not what I want to do. I want to instrument the program dynamically and keep track of certain memory locations, which is a problem if the same variable has different addresses in the static and dynamic code. As far as I can see, this is what is happening, but I have no clue why.

If you look at the LLVM IR you can see two distinct alloca instructions, the one your static code had named "%pi_addr" and the one you created named "%ptr32".
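As a rough C++ analogy (plain C++, not LLVM API code, and not taken from Paul's program): each alloca behaves like its own local variable. In the sketch below, the names `ptr32` and `pi_addr` are borrowed from the IR above, while `SlotReport`, `copy_into_two_slots` and `print_slot_report` are made up for illustration. Storing the same argument into both locals gives equal contents but two different addresses, which is what the two printed numbers reflect.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper type: records what we learn about the two stack slots.
struct SlotReport {
    bool distinct_addresses; // do the two slots live at different addresses?
    bool same_contents;      // do both slots hold the same value?
};

// Copies the argument into two separate locals, mirroring
// "store i32 %pi, i32* %ptr32" and "store i32 %pi, i32* %pi_addr".
SlotReport copy_into_two_slots(int pi) {
    int ptr32 = pi;   // plays the role of the alloca added at runtime
    int pi_addr = pi; // plays the role of the alloca emitted by the compiler
    const int *a = &ptr32;
    const int *b = &pi_addr;
    SlotReport report;
    report.distinct_addresses = (a != b); // two live objects never share an address
    report.same_contents = (*a == *b);    // both are copies of the same argument
    return report;
}

// Usage sketch: prints the comparison for one argument value.
void print_slot_report(int pi) {
    SlotReport r = copy_into_two_slots(pi);
    assert(r.same_contents);
    std::printf("distinct addresses: %d, same contents: %d\n",
                r.distinct_addresses ? 1 : 0, r.same_contents ? 1 : 0);
}
```

Neither address is "the address of the argument"; each is the address of a copy.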

It sounds like what you want to do is instead of creating your own stack spot to store %pi into, you should look at the first user of %pi which should be a store of %pi into its stack space, as generated by the static compilation, and pull the stack slot out of there. Something like this:

   // Given Argument *A to look for:
   Value *StackSlot = 0;
   // Guard against an argument with no uses before dereferencing use_begin().
   if (!A->use_empty())
     if (StoreInst *SI = dyn_cast<StoreInst>(*A->use_begin()))
       StackSlot = SI->getPointerOperand();

though I haven't actually tried to compile or run that.

Nick

Nick,
Your solution works well if there is a store instruction in the function but in the case where there is none (i.e. the argument is passed on directly to another function), creating a store does not help to get the memory address of the variable which takes me back to my initial question at http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-May/022590.html - isn’t there a way to obtain the address of an argument? Because doing a store seems to copy the variable and expose the address of the copy - http://www.llvm.org/docs/LangRef.html#i_store is not very clear on what’s going on.

You can imagine that if you try to do something like taint analysis you want to track each access to a memory location - therefore something like the above would be needed.
Any ideas?

Thanks,
Paul

Paul Martin wrote:

Nick,
Your solution works well if there is a store instruction in the function but in the case where there is none (i.e. the argument is passed on directly to another function), creating a store does not help to get the memory address of the variable which takes me back to my initial question at http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-May/022590.html - isn't there a way to obtain the address of an argument?

No. Arguments are registers and therefore have no addresses.

It may so happen that, due to ABI requirements, the arguments really will have addresses on the target machine, but for LLVM purposes we always model arguments as if they were going through registers. It's one of the benefits of having an infinite register machine.
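The same thing can be seen at the source level. Here is a small C++ sketch (again plain C++ rather than LLVM API code; `callee_copy_is_distinct` and `demo_argument_copy` are hypothetical names chosen for this example). A by-value parameter is a fresh object in the callee, so taking its address yields the address of a copy, never the address of the caller's variable.

```cpp
#include <cassert>
#include <cstdio>

// `pi` is passed by value, so inside the callee it is a new object.
// `callers_slot` is passed only so the two addresses can be compared.
bool callee_copy_is_distinct(int pi, const int *callers_slot) {
    const int *copy_slot = &pi; // address of the callee's own copy
    bool different_address = (copy_slot != callers_slot);
    bool same_value = (pi == *callers_slot);
    return different_address && same_value;
}

// Usage sketch: the caller's variable and the callee's copy hold the
// same value but are different objects.
bool demo_argument_copy() {
    int x = 5;
    bool ok = callee_copy_is_distinct(x, &x);
    std::printf("copy is distinct from caller's slot: %d\n", ok ? 1 : 0);
    return ok;
}
```

This is why storing an argument to a stack slot exposes the address of that slot and nothing more.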

Because doing a store seems to copy the variable and expose the address of the copy - http://www.llvm.org/docs/LangRef.html#i_store is not very clear on what's going on.

You can imagine that if you try to do something like taint analysis you want to track each access to a memory location - therefore something like the above would be needed.
Any ideas?

Design your taint tracking to not require the addresses of arguments. What would you do if arguments were passed through registers like EAX on the target system anyhow? Surely you must already have a way to handle this?

Alternatively, you could write an LLVM pass to rewrite the whole program such that all arguments are passed by pointer, and provide some trusted surface area to fix things up before calling external libraries which expect non-pointer arguments.
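To make the by-pointer idea concrete, here is a minimal source-level sketch of what such a rewrite would amount to. It is an illustration in plain C++, not an LLVM pass, and every name in it (`track`, `myfunc_by_pointer`, `external_by_value`, `call_external`, `callee_tracks_callers_slot`) is hypothetical. Once the argument travels as a pointer, instrumentation in the callee sees the caller's slot, and a small shim loads the value back before calling code that was not rewritten.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for roc(): remembers the last address it was given.
static std::uintptr_t last_tracked = 0;
void track(const void *p) {
    last_tracked = reinterpret_cast<std::uintptr_t>(p);
}

// Rewritten form of a function that used to take `int pi` by value.
// The argument now arrives by pointer, so tracking sees the caller's slot.
int myfunc_by_pointer(const int *pi) {
    track(pi);
    return *pi;
}

// Hypothetical external library routine that still expects a plain int.
int external_by_value(int x) { return x + 1; }

// "Trusted surface": loads the value before crossing into unrewritten code.
int call_external(const int *pi) { return external_by_value(*pi); }

// Returns true when the address tracked inside the callee is the address
// of the caller's own variable.
bool callee_tracks_callers_slot(int value) {
    int slot = value;
    int result = myfunc_by_pointer(&slot);
    assert(result == value);
    return last_tracked == reinterpret_cast<std::uintptr_t>(&slot);
}
```

The cost is an extra load at each use and a shim at every boundary with external code, which is the trade-off the suggestion above implies.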

Nick

Hi Paul,

I have the following code, the lines preceded by `>` being added at runtime (the snippet was also printed at runtime)

define i32 @myfunc(i32 %pi) {
entry:
        %pi_addr = alloca i32 ; <i32*> [#uses=3]
        %retval = alloca i32 ; <i32*> [#uses=2]
        %tmp = alloca i32 ; <i32*> [#uses=2]
> %ptr32 = alloca i32 ; <i32*> [#uses=2]
        %"alloca point" = bitcast i32 0 to i32 ; <i32> [#uses=0]

> store i32 %pi, i32* %ptr32
> %ptr8 = bitcast i32* %ptr32 to i8* ; <i8*> [#uses=1]
> call void @roc( i8* %ptr8 )

        store i32 %pi, i32* %pi_addr
        %pi_addr1 = bitcast i32* %pi_addr to i8* ; <i8*> [#uses=1]
        call void @roc( i8* %pi_addr1 )
...

void roc(void*) is a function that only prints its argument as an integer.

I was expecting the program to print the same thing twice (the address of the myfunc argument), because the code added at runtime looks identical to the code that was compiled (3 lines each).
However, that was not the case; what I got was:

%pi_addr and %ptr32 are two different automatic variables. You are
printing their addresses and these are different. Maybe you meant
to print the contents of the variables?

147156320
147117168

Ciao,

Duncan.