LLVM CSE optimization

I’m working on optimization across subprogram calls. Consider the following LLVM IR:

@ng0 = internal global [2 x i32] [i32 2, i32 0], align 4

%t7 = alloca [16 x i8], align 16
%add.ptr = getelementptr inbounds i8* %t1, i64 40
%call = call i8* @handle_value(i8* %add.ptr, i32 3) #3
%arraydecay = getelementptr inbounds [8 x i8]* %t3, i64 0, i64 0
%arraydecay1 = getelementptr inbounds [16 x i8]* %t4, i64 0, i64 0
call void @select_value(i8* %arraydecay, i32 1, i32 1, i8* %call, i8* %arraydecay1, i8* null, i32 1, i32 2, i32 1, i8* bitcast ([2 x i32]* @ng0 to i8*), i32 32, i32 1) #3
%call3 = call i8* @handle_value(i8* %add.ptr, i32 3) #3

; Function Attrs: nounwind readonly
declare i8* @handle_value(i8* nocapture, i32) #1
; Function Attrs: nounwind
declare void @select_value(i8*, i32, i32, i8* nocapture readonly, i8* nocapture readonly, i8*, i32, i32, i32, i8*, i32, i32) #2 ; select_value only writes through the first parameter; the other parameters are read-only

In the above case, the second call to handle_value should be replaced with %call by the CSE or GVN optimization.

Is there anything I can do to optimize this case?

thanks

The function 'select_value' is not marked as free of side effects. The compiler has to assume that it could write to memory or print to the screen, so it can’t merge the two handle_value calls.

Here, the function 'select_value', which is an external function, only writes through its first parameter; the other parameters are read-only, and it does not write to any global value.
As far as I know, we don’t have an attribute to indicate that a function writes only through its first actual parameter.
We should mark the first parameter as Mod and the others as Ref, and let CSE work for the read-only arguments.

Please let me know if you know of any way to mark the function so that it kills ONLY the first parameter for general optimization.

thanks
xiaoyong

xiaoyong liu wrote:

Here, the function 'select_value', which is an external function, only
writes through its first parameter; the other parameters are read-only,
and it does not write to any global value.
As far as I know, we don't have an attribute to indicate that a function
writes only through its first actual parameter.
We should mark the first parameter as Mod and the others as Ref, and let
CSE work for the read-only arguments.

Please let me know if you know of any way to mark the function so that it
kills ONLY the first parameter for general optimization.

We don't.

Part of the problem is that such an annotation wouldn't be useful very often. Often you have a function which doesn't write to any globals but may write to any memory "reachable" through any number of indirect loads starting with a pointer argument, and that reachability is complex enough to be equivalent to "all memory".

If you want to work on this, I think that a function attribute which states "only writes things reachable through arguments" is the place to start, then we can see how often we can prove that this is less than everything. A starting point for that is showing that it doesn't write to an internal global variable whose address is never taken.

Nick