Proper way of using memcpy to the instrumented malloc space (.ll IR code provided)

Hello, LLVM developers; I have this straightforward C code:

#include <stdio.h>

int test_global = 5;

#define BUF_LEN      8
void foo(char *b) {
    int test = 0;
    *(b+2) = '1';
    while (test_global--) {
        printf("%c", *(b+test));
        test++;
    }
    printf(" world!\n");
}

char a[BUF_LEN] = {'h','e','l','l','o'};
int main()
{
    printf("%c%c%c%c%c world!\n", a[0],a[1],a[2],a[3],a[4]);
    foo(a);
    return 0;
}

This generates the following .ll file (for brevity, I have omitted the IR that is not relevant):

define dso_local void @foo(i8* %0) #0 {
  %2 = alloca i8*, align 8
  store i8* %0, i8** %2, align 8
  %4 = load i8*, i8** %2, align 8
  %5 = getelementptr inbounds i8, i8* %4, i64 2
  store i8 49, i8* %5, align 1
  br label %6

My goal is to use LLVM instrumentation to copy the contents of the static array to a heap location on entry to the foo function.

I was able to write a pass that outputs the following .ll file:

define dso_local void @foo(i8* %0) #0 {
  %2 = alloca i8*, align 8
  store i8* %0, i8** %2, align 8
  %3 = load i8*, i8** %2, align 8
  %4 = alloca i8*, align 8
  %5 = tail call i8* @malloc(i8 mul (i8 ptrtoint (i8** getelementptr (i8*, i8** null, i32 1) to i8), i8 8))
  store i8* %5, i8** %4, align 8
  %7 = getelementptr i8, i8* %3, i8 0
  %8 = bitcast i8** %4 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %8, i8* %7, i64 8, i1 false)
  %9 = load i8*, i8** %4, align 8
  %10 = getelementptr inbounds i8, i8* %9, i64 2
  store i8 49, i8* %10, align 1
  br label %11

The way I instrumented it makes sense to me:

  1. I store the array pointer in %2.
  2. I call malloc with the proper size to hold the array.
  3. I call memcpy from the start of the array to the malloc’ed heap location.
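
In C terms, the effect I am after is roughly the following. This is only a sketch of the intent, not what the pass emits: foo_copy_on_entry is a made-up name, and returning the heap pointer is just there so the result can be inspected.

```c
#include <stdlib.h>
#include <string.h>

#define BUF_LEN 8

/* Hypothetical C equivalent of the instrumented foo(): copy the caller's
 * buffer to the heap on entry, then do every later access through the copy.
 * Returns the heap copy (or NULL if malloc fails); the caller frees it. */
char *foo_copy_on_entry(const char *b) {
    char *heap = malloc(BUF_LEN);   /* allocate BUF_LEN bytes on the heap */
    if (heap == NULL)
        return NULL;
    memcpy(heap, b, BUF_LEN);       /* copy the array contents */
    heap[2] = '1';                  /* the original *(b+2) = '1', redirected */
    return heap;
}
```

With this, the global array a should stay untouched while only the heap copy is modified.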

However, I get a segmentation fault at this line: call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 8 %8, i8* %7, i64 8, i1 false).

I am not sure whether I am using the memcpy intrinsic incorrectly or missing some other instrumentation that causes this error.

Could anyone let me know what I am missing?

Or am I doing this completely wrong? Are there any alternative ideas to achieve this?

Thank you very much in advance,
Kind regards,

So I think I might have found an issue, which is the following line of code:

%9 = load i8*, i8** %4, align 8

I think the pointer being loaded from should be %8. However, if I use %8, the instruction becomes

%9 = load i8*, i8* %8, align 8

which will result in an error:

error: explicit pointee type doesn't match operand's pointee type (i8* vs i8)
%9 = load i8*, i8* %8, align 8     

Does this mean I have to create a BitCast instruction which is of the type i8**? I’m not too sure where to go from here.

I would appreciate any suggestions.
Sincerely,

Hello again, so I did some more searching and happened to find this link: Tagebuch eines Interplanetaren Botschafters: Can memcpy be implemented in LLVM IR? (nhaehnle.blogspot.com)

It shows how memcpy is expected to be used, for example:

define i32 @sample(i32** %pp) {
  %tmp = alloca i32*
  %pp.8 = bitcast i32** %pp to i8*
  %tmp.8 = bitcast i32** %tmp to i8*
  call void @memcpy(i8* %tmp.8, i8* %pp.8, i64 8)
  %p = load i32*, i32** %tmp
  %x = load i32, i32* %p
  ret i32 %x
}
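
Read as C, that sample does roughly the following. This is my own translation, not taken from the blog post. Note that the thing being copied is a pointer value, which is why the destination there is the address of the alloca.

```c
#include <string.h>

/* Rough C equivalent of @sample: memcpy copies the bytes of the pointer
 * stored at *pp into the local variable tmp, so the destination is &tmp. */
int sample(int **pp) {
    int *tmp;
    /* The IR hard-codes 8, the pointer size on 64-bit targets. */
    memcpy(&tmp, pp, sizeof tmp);
    return *tmp;   /* %p = load of tmp, %x = load through %p */
}
```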

I tried to mimic that for my pass and obtained something along the lines of:

define dso_local void @foo(i8* %0) #0 {
  %2 = alloca i8*, align 8
  %4 = bitcast i8* %0 to [8 x i8]*
  %5 = tail call i8* @malloc(i8 mul (i8 ptrtoint (i8** getelementptr (i8*, i8** null, i32 1) to i8), i8 8))
  store i8* %5, i8** %2, align 8
  %6 = bitcast i8** %2 to i8*
  %7 = bitcast [8 x i8]* %4 to i8*
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %6, i8* %7, i64 8, i1 false)
  %8 = load i8*, i8** %2, align 8
  %9 = getelementptr inbounds i8, i8* %8, i64 2
  store i8 49, i8* %9, align 1

However, I am still getting a segmentation fault.

What could be the issue here? The LLVM IR doesn’t look wrong to me, and I think I understand what each line does.

I would appreciate any insights.
Kind regards,

Alright, I have finally figured it out.

The IR should look something like this:

define dso_local void @foo(i8* %0) #0 {
  %2 = alloca i8*, align 8
  store i8* %0, i8** %2, align 8
  %3 = load i8*, i8** %2, align 8
  %4 = alloca i8*, align 8
  %5 = tail call i8* @malloc(i64 8)
  store i8* %5, i8** %4, align 8
  %6 = load i8*, i8** %4, align 8
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %6, i8* %3, i64 8, i1 false)

It seems I made the following major mistakes:

  1. The malloc call was wrong: I passed the size with the wrong type (i8, when it should have been i64).
  2. I overcomplicated the memcpy. The destination has to be the heap pointer itself (%6 above, loaded from the alloca), not the address of the alloca bitcast to i8* as in my earlier attempts. I also didn’t need any GEP instructions; creating a new alloca, store, and load for the %0 argument and then passing the loaded value to memcpy works fine.
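
The second mistake is easier to see in C. The sketch below is my own illustration (the function names are made up, and it assumes a pointer is no larger than BUF_LEN bytes, as on x86-64): the first function copies into the variable that holds the heap pointer, which is what my earlier IR did with the bitcast of %4, and the second copies into the buffer that pointer refers to.

```c
#include <stdlib.h>
#include <string.h>

#define BUF_LEN 8

/* Broken: the destination is the address of the pointer variable, so the
 * pointer itself is overwritten with bytes from src. Returns 1 if the
 * variable's bytes now equal the first bytes of src, i.e. it was clobbered.
 * src must provide at least sizeof(char *) readable bytes. */
int copy_into_slot(const char *src) {
    char *heap = malloc(BUF_LEN);
    char *slot = heap;                  /* like: store i8* %5, i8** %4 */
    memcpy(&slot, src, sizeof slot);    /* wrong destination: &slot */
    int clobbered = memcmp(&slot, src, sizeof slot) == 0;
    free(heap);                         /* free through the untouched copy */
    return clobbered;
}

/* Fixed: the destination is the buffer the variable points to.
 * Returns the heap copy (or NULL on failure); the caller frees it. */
char *copy_into_heap(const char *src) {
    char *slot = malloc(BUF_LEN);
    if (slot == NULL)
        return NULL;
    memcpy(slot, src, BUF_LEN);         /* right destination: slot */
    return slot;
}
```

In the broken version, any later load of the pointer yields the bytes of "hello" interpreted as an address, so a store through it would be expected to crash.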

Anyhow, I’m quite content with how it is working now, and I hope this post is useful to anyone else who needs to emit memcpy from a pass.

Here is the updated example (from the original post) that I tested with:

#include <stdio.h>
#include <stdlib.h>

#define BUF_LEN      8
void foo(char *b) {
    int cnt = 1;
    *(b+2) = 'a';
    printf("%s ", b);
    while (cnt--){
        printf("world! Address: %p\n",b);
    }
}

char a[BUF_LEN] = {'h','e','l','l','o'};
int main()
{
    char *test = malloc(sizeof(char)*BUF_LEN);
    printf("%c%c%c%c%c world! Address: %p Heap: %p\n", a[0],a[1],a[2],a[3],a[4], a, test);
    foo(a);
    return 0;
}

and here is the output:

~ ➤ ./test_program.out                                                      
hello world! Address: 0x41206c Heap: 0x217cd2a0
healo world! Address: 0x217cd6d0