Indirect calls with type mismatch

We have this test in Transforms/InstCombine/apint-call-cast-target.ll:

declare i32 @main2()

define ptr @ctime(ptr) {
; CHECK-LABEL: @ctime(
; CHECK-NEXT:  entry:
; CHECK-NEXT:    [[I0:%.*]] = call i32 @main2()
; CHECK-NEXT:    [[TMP1:%.*]] = inttoptr i32 [[I0]] to ptr
; CHECK-NEXT:    ret ptr [[TMP1]]
entry:
  %i0 = call ptr @main2( )
  ret ptr %i0
}

That call in the input is an indirect call through the pointer @main2. It turns out that the global @main2 is actually a function, so instcombine replaces the indirect call with a direct call.
But the signatures don’t match. There are a bunch of tests where this happens, but most are just typos (calls to void @use with an incorrect number of arguments).
This test, however, specifically checks that an appropriate cast is added. I’m always worried about implicit inttoptr and ptrtoint in general, as they fly under the radar for alias analysis. So I would prefer to consider this case UB, since the signatures don’t match.

Is there any reason against?


I tried to collect what we want to consider UB in these situations recently. There are target hook patches and a discussion thread (can’t find the link right now, sorry). Long story short, people think most of this should be supported; very few things are agreed to be UB.

Right, I remember your thread. I just wanted to bring attention to this specific case, as it deals with magic integer<->pointer conversion, which is evil.
Especially when the bit-width doesn’t even match. This will crash the code produced by a stack-based backend. By allowing this, we are effectively forbidding stack machines from using LLVM.

I don’t understand how LLVM IR allowing or forbidding this matters for a backend. Could you elaborate?

The transform should only trigger if the integer type has the same width as the pointer type, I think.

The instcombine transform does assume integers are passed the same way as pointers of the same size, but that hasn’t been an issue. Probably we could make it more conservative without much impact. (The overall transform used to be much more important before we had opaque pointers, because there were a bunch of pointer types floating around. These days, I expect it primarily triggers for C code without function prototypes.)

The problem with declaring it UB is that we don’t really have formalized LLVM ABI rules beyond the platform ABI specs, so it’s not clear under what circumstances a value is supposed to be passed as a pointer vs. an integer. (For example, consider a union with a pointer and an integer; is it supposed to be passed as an integer, or a pointer?)

Assume that the ABI states that the return value goes through the stack and that we have this program:

define i32 @f() { ... }

define void @g() {
  %r = call ptr @f()   ; return type mismatch: i32 vs. ptr
}

f() pushes 4 bytes onto the stack and g() pops 8 bytes from the stack. This will screw up the stack layout assumed by the compiler (it will “eat” 4 bytes from something else, maybe even the return address) and likely crash.

So I think there’s no choice other than declaring UB, at least when there’s a mismatch in the bit-width of the return value or the arguments (since the same stack-layout argument applies).
I would say we need it to be UB whenever there’s an implicit ptr2int or int2ptr, but one thing at a time :slight_smile:

The more we make UB, the better for me :slight_smile: