(no subject)

Hi all,
I have the following C code:

#include<stdio.h>

int main(int argc, char **argv) {

    printf("%s\n", argv[0]);
    return argc;
}

that generates the following IR for the main function:

; Function Attrs: noinline nounwind optnone uwtable
define i32 @main(i32, i8**) #0 {
%3 = alloca i32, align 4
%4 = alloca i32, align 4
%5 = alloca i8**, align 8
store i32 0, i32* %3, align 4
store i32 %0, i32* %4, align 4
store i8** %1, i8*** %5, align 8
%6 = load i8**, i8*** %5, align 8
%7 = getelementptr inbounds i8*, i8** %6, i64 0
%8 = load i8*, i8** %7, align 8
%9 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i32 0, i32 0), i8* %8)
%10 = load i32, i32* %4, align 4
ret i32 %10
}

I think %8 will contain the address of argv[0], so %7 was storing a pointer to pointer to argv? I’m not sure of that.

Considering that I can access the GenericValue set by visitGetElementPtrInst, how can I obtain the value of argv[0] at runtime using the Interpreter class?

Thanks

You should really run -mem2reg (without the optnone though). Also
-instcombine is afterwards often helpful to clean up clang generated
code, or maybe just run it with -O1.

> I think %8 will contain the address of argv[0], so %7 was storing a pointer
> to pointer to argv? I'm not sure of that.

Let's see: argc is %0 and argv == &argv[0] is %1. The latter is stored
in %5, which is loaded as %6. The GEP (%7) is a no-op, thus %6 == %7.
Finally, %8 is the value argv[0] of type char*/i8*.
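
To make the chain above concrete, here is a rough C-level sketch of what each IR value holds. The function name `first_arg` and the local variable names are hypothetical, chosen only to mirror the numbered values in the unoptimized IR; this is an illustration, not what clang emits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper that mirrors the unoptimized IR step by step. */
const char *first_arg(int argc, char **argv) {
    assert(argc > 0 && argv != NULL);
    char **argv_slot = argv;        /* %5: stack slot that receives argv (%1) */
    char **loaded = argv_slot;      /* %6: argv loaded back from its slot */
    char **elem_addr = &loaded[0];  /* %7: GEP with index 0, same address as %6 */
    char *value = *elem_addr;       /* %8: argv[0], a char* (i8*) */
    return value;
}
```

Calling `first_arg(argc, argv)` from `main` should return the same pointer that `printf` receives as `%8`.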

> Considering that I can access the GenericValue set by
> visitGetElementPtrInst, how can I obtain the value of argv[0] at runtime
> using the Interpreter class?

Sorry can't help with that. I don't know the Interpreter class and I
guess "GenericValue" and "visitGetElementPtrInst" are somehow related to
it.

Hi,
thanks for your answer

> You should really run -mem2reg (without the optnone though). Also
> -instcombine is afterwards often helpful to clean up clang generated
> code, or maybe just run it with -O1.

Thanks for the suggestion. I understand that the code is not clean, but I’d like to understand how it works in general for now :slight_smile:

> > I think %8 will contain the address of argv[0], so %7 was storing a pointer
> > to pointer to argv? I’m not sure of that.
>
> Let’s see: argc is %0 and argv == &argv[0] is %1. The latter is stored
> in %5, which is loaded as %6. The GEP (%7) is a no-op, thus %6 == %7.
> Finally, %8 is the value argv[0] of type char*/i8*.

Why do you consider the GEP to be a no-op? %6 is an i8**, whereas %7 is an i8*. Thanks

> > Considering that I can access the GenericValue set by
> > visitGetElementPtrInst, how can I obtain the value of argv[0] at runtime
> > using the Interpreter class?
>
> Sorry can’t help with that. I don’t know the Interpreter class and I
> guess “GenericValue” and “visitGetElementPtrInst” are somehow related to
> it.

No problem, visitGetElementPtrInst is just the function that is called when a GEP instruction is found. I tried to get the value by calling getOperandValue and basically reproducing the usual LLVM behavior, but I always get 0 as the value.

Maybe a stupid question, but when I see the PointerVal of a pointer (e.g. 0x2281a30), if I read the bytes at that specific address, would I find the value, or does LLVM not really store any value there and the address is just some sort of index within an “emulated” memory?

Thanks


Johannes Doerfert
Researcher

Argonne National Laboratory
Lemont, IL 60439, USA

jdoerfert@anl.gov

Thanks

> Thanks for the suggestion. I understand that the code is not clean, but I'd
> like to understand how it works in general for now :slight_smile:

Understanding the concept can be done either way, one is in my opinion
just unnecessarily complicated.

> > > I think %8 will contain the address of argv[0], so %7 was storing a
> > > pointer to pointer to argv? I'm not sure of that.
> >
> > Let's see: argc is %0 and argv == &argv[0] is %1. The latter is stored
> > in %5, which is loaded as %6. The GEP (%7) is a no-op, thus %6 == %7.
> > Finally, %8 is the value argv[0] of type char*/i8*.
>
> Why do you consider the GEP to be a no-op? %6 is an i8**, whereas %7 is an
> i8*. Thanks

This GEP computes the address of the 0th element of the "array" %6. The
result has the same address as the array and it also has the same type
so I'd say they are indistinguishable.
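
As a sketch of why an index of 0 changes nothing: a GEP over an array of `i8*` behaves like C pointer indexing, `&base[idx]`, which is `base` advanced by `idx * sizeof(char *)` bytes without loading anything. The function names `gep_i8ptr` and `gep_byte_offset` below are hypothetical, made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C analogue of:
   getelementptr inbounds i8*, i8** %base, i64 %idx */
char **gep_i8ptr(char **base, int64_t idx) {
    return &base[idx];  /* computes an address only; nothing is loaded */
}

/* Byte distance between the GEP result and its base pointer. */
ptrdiff_t gep_byte_offset(char **base, int64_t idx) {
    assert(base != NULL);
    return (char *)gep_i8ptr(base, idx) - (char *)base;
}
```

With `idx` equal to 0 the result compares equal to `base` and has the same type (`char **`), which matches the "no-op" reading of `%7`.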

> > > Considering that I can access the GenericValue set by
> > > visitGetElementPtrInst, how can I obtain the value of argv[0] at runtime
> > > using the Interpreter class?
> >
> > Sorry can't help with that. I don't know the Interpreter class and I
> > guess "GenericValue" and "visitGetElementPtrInst" are somehow related to
> > it.
>
> No problem, visitGetElementPtrInst is just the function that is called
> when a GEP instruction is found.

It seems the function was appropriately named :wink:

> I tried to get the value by calling getOperandValue and basically
> reproducing the usual LLVM behavior, but I always get 0 as the value.

What do you think "the usual LLVM behavior" is? Also consider that GEPs
can have multiple operands. What exactly do you want to achieve anyway?

> Maybe a stupid question, but when I see the PointerVal of a pointer
> (e.g. 0x2281a30), if I read the bytes at that specific address, would
> I find the value, or does LLVM not really store any value there and
> the address is just some sort of index within an "emulated" memory?

I'm not sure I understand what you are asking but I'm pretty sure you
will not.

Hi,
All good points and jokes in your email :slight_smile:

My final goal is to get the value of %8 at runtime. So, considering the program is called argv, I would like to execute something like ./argv 22 and get the value of argv[0]. Is that possible? So far I have the address where the parameter at position 0 is stored, but I’d like to read the value and print it.

I hope it is clear now

Thanks again

I now get your intent but can’t really help you. You need to ask interpreter folks.

I’d suggest you send another email, state clearly what you want, and put a summary in the subject line.

You can also try the IRC, though during the week you might have more luck.

Hi,
No problem at all. Yes I think a good subject would help.

I didn’t know about the IRC channel :slight_smile:

Thanks again, really appreciated

Hi,
I hope this can help someone else. I manually modified the IR in this way:


%8 = load i8*, i8** %7, align 8
%9 = load i8, i8* %8, align 1
%10 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([4 x i8], [4 x i8]* @.str, i32 0, i32 0), i8* %8)
%11 = load i32, i32* %4, align 4
ret i32 %11

At this point it is enough to cast %9: const char *target = (const char *)v.PointerVal;

So it should be enough just to read from that memory address as expected :slight_smile:
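
For anyone who wants to try the cast outside LLVM, here is a minimal sketch in plain C. It uses a hypothetical stand-in struct, `FakeGenericValue`, which is not the real LLVM `GenericValue`, and it assumes, as found above, that the pointer value holds a real host address of a NUL-terminated string. The helper name `read_c_string` is also made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the interpreter's value type; only the
   pointer member is modeled. This is NOT the LLVM definition. */
typedef struct {
    void *PointerVal;
} FakeGenericValue;

/* Copy the NUL-terminated string at v.PointerVal into out (at most
   cap - 1 characters) and return the number of characters copied. */
size_t read_c_string(FakeGenericValue v, char *out, size_t cap) {
    assert(v.PointerVal != NULL && out != NULL && cap > 0);
    const char *target = (const char *)v.PointerVal;  /* same cast as above */
    size_t n = strlen(target);
    if (n >= cap) {
        n = cap - 1;  /* truncate so the terminator still fits */
    }
    memcpy(out, target, n);
    out[n] = '\0';
    return n;
}
```

Copying into a caller-owned buffer is just one design choice; printing `target` directly would work as well, as long as the address really points at host memory.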

Thanks anyway