Problem with __builtin_object_size when it depends on a condition

From your email, ISTM that you’re proposing that we insert runtime checks to determine the object size, which breaks our guarantee to always lower to a constant. Is this correct? If not, can you please provide an example of the new IR for the function you gave?

Either way, is there a reason that we can’t just use the second flag to objectsize here? That flag was designed with cases like this in mind:

int foo(int cond) {
  char small[10], large[30];
  void *what = cond ? small : large;
  // If X = 0 or 1, hand back 30 (max possible size); for X = 2 or 3, hand back 10 (min possible size).
  // If X = 0 or 1, this gets lowered to @llvm.objectsize(…, 0); if X = 2 or 3, it gets lowered to @llvm.objectsize(…, 1).
  return __builtin_object_size(what, X);
}
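For anyone who wants to try the four type values directly, here is a minimal sketch. The macro and the function names are made up for illustration (they are not from this thread); the macro exists only because the second argument to __builtin_object_size must be a compile-time constant. What each function actually returns depends on the compiler, its version, and the optimization level; the only thing that should always hold is that types 0 and 1 never report less than the real size and types 2 and 3 never report more.

```c
#include <stddef.h>

/* Hypothetical helper macro (not from the thread): emits one query
   function per type value, since the type must be a constant. */
#define DEFINE_SIZE_QUERY(name, type)                 \
  __attribute__((noinline)) size_t name(int cond) {   \
    char small[10], large[30];                        \
    char *what = cond ? small : large;                \
    return __builtin_object_size(what, type);         \
  }

DEFINE_SIZE_QUERY(size_type0, 0) /* upper bound, whole object    */
DEFINE_SIZE_QUERY(size_type1, 1) /* upper bound, closest object  */
DEFINE_SIZE_QUERY(size_type2, 2) /* lower bound, whole object    */
DEFINE_SIZE_QUERY(size_type3, 3) /* lower bound, closest object  */
```

Calling size_type0(0) may give 30 or (size_t)-1, and size_type2(0) may give 10 or 0, depending on how much the compiler manages to figure out.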

FWIW, if you’re looking for something more accurate than this, and you’re willing to have calculations at runtime in exchange for the accuracy, I’d recommend looking into the machinery involved with the ObjectSizeOffsetEvaluator (if you haven’t already). I’m not familiar with any of it, but you may find it interesting. :)
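To make the runtime trade-off concrete, here is a rough C-level sketch of the idea only. It is not the ObjectSizeOffsetEvaluator API and not what LLVM emits; the function name is hypothetical. The size is selected at runtime with the same condition that selects the pointer, so the answer is exact but no longer a constant.

```c
#include <stddef.h>

/* Hypothetical illustration: track the size next to the pointer and
   pick both with the same runtime condition. */
size_t foo_runtime_size(int flag) {
  char chararray[30];
  char chararray2[10];
  char *cptr = flag ? chararray2 : chararray;
  size_t size = flag ? sizeof chararray2 : sizeof chararray;
  (void)cptr; /* the pointer itself is unused in this sketch */
  return size;
}
```

Here foo_runtime_size(0) is 30 and foo_runtime_size(1) is 10, at the cost of a select that survives into the generated code.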

The optimizer doesn't know how to calculate the object size when it finds a condition that cannot be eliminated. Here is an example:

-----------------------------------------------
#include <stdio.h>
#include <stdlib.h>

#define STATIC_BUF_SIZE 10
#define LARGER_BUF_SIZE 30

size_t foo(int flag) {
  char *cptr;
  char chararray[LARGER_BUF_SIZE];
  char chararray2[STATIC_BUF_SIZE];
  if (flag)
    cptr = chararray2;
  else
    cptr = chararray;

  /* Type 2: minimum number of bytes remaining in the whole object. */
  return __builtin_object_size(cptr, 2);
}

int main() {
  size_t ret;
  ret = foo(0);
  /* Cast so the argument matches the %d conversion. */
  printf("\n%d\n", (int)ret);
  return 0;
}
----------------------------------------------
If you compile this example with clang (trunk) using the -fno-inline option, the result is -1.

I get `0` when I run this, which is correct because you're asking
for type "2". `-1` would be a fairly major bug. Was this a typo
on your part, or can you somehow reproduce `-1`?

I’m not trying to add runtime checks; I want to transform llvm.objectsize into a constant in a different place.

Here are the .ll dumps, which might better explain what I’m doing. They are for the example with inlining enabled. With
inlining disabled, the same transformations happen except in main, which then contains only a call to foo().

For function foo(), the "Combine redundant instructions" pass calculates the minimum or maximum value, depending on the
second argument, and stores it in the third argument. We must save this value in the third argument so that the inliner
still has a chance to eliminate the condition. If we replaced the call with a constant in foo(), we would get the wrong
value once foo() is inlined into main.

*** IR Dump After Simplify the CFG ***
; Function Attrs: nounwind readnone uwtable
define i64 @foo(i32 %flag) #0 {
entry:
%chararray = alloca [30 x i8], align 16
%chararray2 = alloca [10 x i8], align 1
%0 = getelementptr inbounds [30 x i8], [30 x i8]* %chararray, i64 0, i64 0
call void @llvm.lifetime.start(i64 30, i8* %0) #5
%1 = getelementptr inbounds [10 x i8], [10 x i8]* %chararray2, i64 0, i64 0
call void @llvm.lifetime.start(i64 10, i8* %1) #5
%tobool = icmp eq i32 %flag, 0
%cptr.0 = select i1 %tobool, i8* %0, i8* %1
%2 = call i64 @llvm.objectsize.i64.p0i8.i32(i8* %cptr.0, i1 true, i32 0)
call void @llvm.lifetime.end(i64 10, i8* %1) #5
call void @llvm.lifetime.end(i64 30, i8* %0) #5
ret i64 %2
}
*** IR Dump After Combine redundant instructions ***
; Function Attrs: nounwind readnone uwtable
define i64 @foo(i32 %flag) #0 {
entry:
%chararray = alloca [30 x i8], align 16
%chararray2 = alloca [10 x i8], align 1
%0 = getelementptr inbounds [30 x i8], [30 x i8]* %chararray, i64 0, i64 0
call void @llvm.lifetime.start(i64 30, i8* %0) #5
%1 = getelementptr inbounds [10 x i8], [10 x i8]* %chararray2, i64 0, i64 0
call void @llvm.lifetime.start(i64 10, i8* %1) #5
%tobool = icmp eq i32 %flag, 0
%cptr.0 = select i1 %tobool, i8* %0, i8* %1
%2 = call i64 @llvm.objectsize.i64.p0i8.i32(i8* %cptr.0, i1 true, i32 10)
call void @llvm.lifetime.end(i64 10, i8* %1) #5
call void @llvm.lifetime.end(i64 30, i8* %0) #5
ret i64 %2
}

Combining redundant instructions for main calculates the correct value (30) because foo() is inlined and the condition is
now eliminated. If we hadn’t left the llvm.objectsize call in foo(), the value would be 10.

*** IR Dump After Simplify the CFG ***
; Function Attrs: nounwind uwtable
define i32 @main() #3 {
entry:
%chararray.i = alloca [30 x i8], align 16
%0 = getelementptr inbounds [30 x i8], [30 x i8]* %chararray.i, i64 0, i64 0
call void @llvm.lifetime.start(i64 30, i8* %0) #5
%1 = call i64 @llvm.objectsize.i64.p0i8.i32(i8* %0, i1 true, i32 10) #5
call void @llvm.lifetime.end(i64 30, i8* %0) #5
%call1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0), i64 %1)
ret i32 0
}
*** IR Dump After Combine redundant instructions ***
; Function Attrs: nounwind uwtable
define i32 @main() #3 {
entry:
%call1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([5 x i8], [5 x i8]* @.str, i64 0, i64 0), i64 30)
ret i32 0
}
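The effect of inlining can also be poked at from the C side. This is a sketch under assumptions: the function names are hypothetical, always_inline is used only to force the situation described above, and whether the first caller really sees 30 depends on the compiler version and optimization level. If the size were folded inside the callee before inlining, both callers would get the conservative 10 (or 0).

```c
#include <stddef.h>

/* Same body as foo() in the original example, but forced inline so
   each caller gets its own copy of the __builtin_object_size call. */
static inline __attribute__((always_inline)) size_t foo_inlined(int flag) {
  char chararray[30];
  char chararray2[10];
  char *cptr = flag ? chararray2 : chararray;
  return __builtin_object_size(cptr, 2);
}

/* With the condition known at the call site, an optimizer that folds
   late can see a single array here instead of a select. */
size_t call_with_zero(void) { return foo_inlined(0); }
size_t call_with_one(void) { return foo_inlined(1); }
```

Whatever the compiler does, call_with_zero() must not exceed 30 and call_with_one() must not exceed 10, because type 2 is a lower bound.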

CodeGen Prepare will finally replace the llvm.objectsize call with a constant in function foo.

*** IR Dump After Partially inline calls to library functions ***
; Function Attrs: nounwind readnone uwtable
define i64 @foo(i32 %flag) #0 {
entry:
%chararray = alloca [30 x i8], align 16
%chararray2 = alloca [10 x i8], align 1
%0 = getelementptr inbounds [30 x i8], [30 x i8]* %chararray, i64 0, i64 0
call void @llvm.lifetime.start(i64 30, i8* %0) #5
%1 = getelementptr inbounds [10 x i8], [10 x i8]* %chararray2, i64 0, i64 0
call void @llvm.lifetime.start(i64 10, i8* %1) #5
%tobool = icmp eq i32 %flag, 0
%cptr.0 = select i1 %tobool, i8* %0, i8* %1
%2 = call i64 @llvm.objectsize.i64.p0i8.i32(i8* %cptr.0, i1 true, i32 10)
call void @llvm.lifetime.end(i64 10, i8* %1) #5
call void @llvm.lifetime.end(i64 30, i8* %0) #5
ret i64 %2
}
*** IR Dump After CodeGen Prepare ***
; Function Attrs: nounwind readnone uwtable
define i64 @foo(i32 %flag) #0 {
entry:
%chararray = alloca [30 x i8], align 16
%chararray2 = alloca [10 x i8], align 1
%0 = bitcast [30 x i8]* %chararray to i8*
call void @llvm.lifetime.start(i64 30, i8* %0) #5
%1 = bitcast [10 x i8]* %chararray2 to i8*
call void @llvm.lifetime.start(i64 10, i8* %1) #5
%tobool = icmp eq i32 %flag, 0
%cptr.0 = select i1 %tobool, i8* %0, i8* %1
call void @llvm.lifetime.end(i64 10, i8* %1) #5
call void @llvm.lifetime.end(i64 30, i8* %0) #5
ret i64 10
}

I made a mistake here; I get zero, same as you. I want to fix it to get the correct value.

Okay, good.

FTR, 0 *is* a correct value. You want to fix it to get a more accurate
number, but 0 is always correct for type 2.
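The distinction between "correct" and "accurate" can be written down as a small predicate. This is a sketch with a hypothetical helper name, not anything from clang or LLVM: an answer is valid when it is a lower bound for types 2 and 3, or an upper bound for types 0 and 1.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: is `reported` an acceptable result for an
   object that really has `actual` bytes remaining? */
bool object_size_answer_is_valid(size_t reported, size_t actual, int type) {
  if (type & 2)
    return reported <= actual; /* types 2 and 3: lower bound */
  return reported >= actual;   /* types 0 and 1: upper bound */
}
```

For foo(0) the real size is 30, so with type 2 both 0 and 10 are valid answers (10 is simply more accurate), while (size_t)-1 is not, which is why a `-1` there would be a bug.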

Ahh, that makes much more sense then; thanks for the detailed illustration. :)

Though, in your example:

*** IR Dump After Partially inline calls to library functions ***
; Function Attrs: nounwind readnone uwtable
define i64 @foo(i32 %flag) #0 {
entry:
%chararray = alloca [30 x i8], align 16
%chararray2 = alloca [10 x i8], align 1
%0 = getelementptr inbounds [30 x i8], [30 x i8]* %chararray, i64 0, i64 0
call void @llvm.lifetime.start(i64 30, i8* %0) #5
%1 = getelementptr inbounds [10 x i8], [10 x i8]* %chararray2, i64 0, i64 0
call void @llvm.lifetime.start(i64 10, i8* %1) #5
%tobool = icmp eq i32 %flag, 0
%cptr.0 = select i1 %tobool, i8* %0, i8* %1
%2 = call i64 @llvm.objectsize.i64.p0i8.i32(i8* %cptr.0, i1 true, i32 10)
call void @llvm.lifetime.end(i64 10, i8* %1) #5
call void @llvm.lifetime.end(i64 30, i8* %0) #5
ret i64 %2
}

Why is the third argument to @llvm.objectsize necessary? It seems that we have enough information to just compute “10” during CGP. Are there cases where we wouldn’t be able to do this?

Well, I think I can do that part as you suggested (calculate “10” during CGP).
When I finish the patch I will post it on Phabricator. Thanks for the comments.
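For reference, the fold being discussed (take the minimum or maximum over the objects the pointer may refer to) can be sketched in plain C. The function below is a hypothetical illustration, not LLVM code; the empty-list fallbacks mirror the "unknown" results mentioned earlier in the thread (0 for the minimum modes, -1 for the maximum modes).

```c
#include <stddef.h>

/* Hypothetical helper: combine the sizes of every object a pointer
   may point to into one constant, according to the type value. */
size_t fold_object_size(const size_t *sizes, size_t count, int type) {
  int want_min = (type & 2) != 0;
  if (count == 0)
    return want_min ? 0 : (size_t)-1; /* nothing known: conservative answer */
  size_t result = sizes[0];
  for (size_t i = 1; i < count; ++i) {
    if (want_min ? sizes[i] < result : sizes[i] > result)
      result = sizes[i];
  }
  return result;
}
```

With the two arrays from the example (30 and 10 bytes), type 2 folds to 10 and type 0 folds to 30.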