Clang emits the llvm.memcpy intrinsic to initialize local arrays.
For example, consider the C code below, which initializes a character array of 2 elements:
int main () {
    const char src[2] = "hi";
}
The following IR is generated for the C code above:
@__const.main.src = private unnamed_addr constant [2 x i8] c"hi", align 1
; Function Attrs: noinline nounwind optnone
define dso_local signext i32 @main() #0 !dbg !10 {
%1 = alloca [2 x i8], align 1
call void @llvm.dbg.declare(metadata [2 x i8]* %1, metadata !15, metadata !DIExpression()), !dbg !21
%2 = bitcast [2 x i8]* %1 to i8*, !dbg !21
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %2, i8* align 1 getelementptr inbounds ([2 x i8], [2 x i8]* @__const.main.src, i32 0, i32 0), i64 2, i1 false), !dbg !21
ret i32 0, !dbg !22
}
The IR shows that the llvm.memcpy intrinsic is used to initialize the local array:
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %2, i8* align 1 getelementptr inbounds ([2 x i8], [2 x i8]* @__const.main.src, i32 0, i32 0), i64 2, i1 false), !dbg !21
Now consider a second snippet that calls the C standard library function memcpy:
#include <stdio.h>
#include <string.h>
int main () {
    const char src[2] = "hi";
    char dest[2] = "he";
    memcpy(dest, src, strlen(src));
}
The following IR is generated for this second snippet:
@__const.main.src = private unnamed_addr constant [2 x i8] c"hi", align 1
@__const.main.dest = private unnamed_addr constant [2 x i8] c"he", align 1
; Function Attrs: noinline nounwind optnone
define dso_local signext i32 @main() #0 !dbg !10 {
%1 = alloca [2 x i8], align 1
%2 = alloca [2 x i8], align 1
call void @llvm.dbg.declare(metadata [2 x i8]* %1, metadata !15, metadata !DIExpression()), !dbg !21
%3 = bitcast [2 x i8]* %1 to i8*, !dbg !21
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %3, i8* align 1 getelementptr inbounds ([2 x i8], [2 x i8]* @__const.main.src, i32 0, i32 0), i64 2, i1 false), !dbg !21
call void @llvm.dbg.declare(metadata [2 x i8]* %2, metadata !22, metadata !DIExpression()), !dbg !24
%4 = bitcast [2 x i8]* %2 to i8*, !dbg !24
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %4, i8* align 1 getelementptr inbounds ([2 x i8], [2 x i8]* @__const.main.dest, i32 0, i32 0), i64 2, i1 false), !dbg !24
%5 = getelementptr inbounds [2 x i8], [2 x i8]* %2, i64 0, i64 0, !dbg !25
%6 = getelementptr inbounds [2 x i8], [2 x i8]* %1, i64 0, i64 0, !dbg !25
%7 = getelementptr inbounds [2 x i8], [2 x i8]* %1, i64 0, i64 0, !dbg !26
%8 = call i64 @strlen(i8* %7) #4, !dbg !27
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %5, i8* align 1 %6, i64 %8, i1 false), !dbg !25
ret i32 0, !dbg !28
}
The IR for the second example shows that the call to the C standard library memcpy function is also lowered to the llvm.memcpy intrinsic:
call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %5, i8* align 1 %6, i64 %8, i1 false), !dbg !25
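Comparing the calls in the two dumps above, the one visible difference is the source operand: the initializer copies read from a @__const.* global with a constant size, while the memcpy() call copies between two allocas with a runtime size. Below is a rough C sketch of a text-level check built on that observation. It is only an illustration, not a real pass: the function names are hypothetical, and it assumes unoptimized textual IR in which the initializer globals keep the @__const. prefix seen above. It would also misclassify a user memcpy() whose source happens to be such a global.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if the textual IR line calls the llvm.memcpy intrinsic. */
bool is_memcpy_intrinsic_call(const char *line) {
    assert(line != NULL);
    return strstr(line, "call void @llvm.memcpy.") != NULL;
}

/* Heuristic (assumption, not an LLVM guarantee): in the dumps above, the
 * copies that initialize local arrays read from a "@__const." global,
 * while the copy that came from memcpy() in the C source does not. */
bool looks_like_array_init(const char *line) {
    return is_memcpy_intrinsic_call(line) &&
           strstr(line, "@__const.") != NULL;
}
```

Applied to the two call lines quoted above, looks_like_array_init() returns true for the initializer copy and false for the copy that came from memcpy(). A real pass would inspect the operands of the intrinsic call through the LLVM API rather than match text.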
I want to write a pass that identifies the calls to the llvm.memcpy intrinsic that were generated from a call to the C standard library memcpy function, and ignores the other llvm.memcpy calls, which are used to initialize local arrays. How can I differentiate between these two kinds of calls to the llvm.memcpy intrinsic?
Kindly help.
Thanks