Deopt operand bundle behavior

Hi!

We have started using deopt operand bundles to make our native stack traces deoptimizable and garbage collectable. We ran into an issue, and we can't tell whether the problem is on our side or within LLVM.

For example, for this input:

declare { i8*, i8* } @getCode()

define void @testFunc() {
entry:
  %0 = call { i8*, i8* } @getCode()
  %1 = extractvalue { i8*, i8* } %0, 1
  %2 = bitcast i8* %1 to void ()*
  call void %2() [ "deopt"() ]
  ret void
}

We get this output machine code for x86_64:

_testFunc: ## @testFunc
   .cfi_startproc
## BB#0: ## %entry
   pushq %rax
Lcfi0:
   .cfi_def_cfa_offset 16
   callq _getCode
   callq *%rax
Ltmp0:
   popq %rax
   retq

Without the deopt operand bundle:

_testFunc: ## @testFunc
   .cfi_startproc
## BB#0: ## %entry
   pushq %rax
Lcfi0:
   .cfi_def_cfa_offset 16
   callq _getCode
   callq *%rdx
   popq %rax
   retq

For some reason, with the deopt operand bundle the wrong register is used for the second half of the value returned by getCode: the indirect call goes through %rax instead of %rdx.

Is there something about this feature that I am missing?

Thanks in advance for your time,
Daniel Mihalyi

Hi,

Are you seeing this issue in general, or only with aggregate return values?

If the latter, then I suspect this is a bug specifically around
lowering aggregate return values from calls with deopt bundles. We
(Azul) do not use aggregate types in function boundaries, so that area
is definitely not well tested.
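
One way to check would be a scalar variant of your reproducer, where
the callee comes back in a single register. This is only a sketch:
@getCodeScalar and @testFuncScalar are made-up names, and I have not
run it myself.

declare i8* @getCodeScalar()

define void @testFuncScalar() {
entry:
  ; Same shape as the original, but with no aggregate and no extractvalue.
  %0 = call i8* @getCodeScalar()
  %1 = bitcast i8* %0 to void ()*
  call void %1() [ "deopt"() ]
  ret void
}

If this variant calls through the right register with and without the
bundle, that would point at the aggregate lowering path.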

If you want to debug this, I'd suggest starting to look at
SelectionDAGBuilder::LowerAsSTATEPOINT and
SelectionDAGBuilder::LowerCallSiteWithDeoptBundleImpl. It is probably
just an oversight, and not a fundamental issue.

Given that you have a tiny reproducer I can take a look at it too, but
I cannot guarantee a timely response -- I'm fairly time constrained at
this point.

I'm also very interested in hearing about new uses of deopt operand
bundles. If you can share some details on what you're doing with them,
that would be great! Note that if you're working with a *relocating*
collector (i.e. your GC copies objects to new addresses) then deopt
operand bundles are not sufficient for GC (though they will still let
you deoptimize) -- you'll need to use gc.statepoint to get proper
semantics.
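
As a rough illustration of the input side (a sketch with placeholder
names @foo and @keepsObjLive, assuming the built-in "statepoint-example"
GC strategy, which treats addrspace(1) pointers as GC references): a
function written like this can be handed to the RewriteStatepointsForGC
pass, which rewrites the call into gc.statepoint / gc.relocate form and
carries the deopt state over.

declare void @foo()

define i8 addrspace(1)* @keepsObjLive(i8 addrspace(1)* %obj) gc "statepoint-example" {
entry:
  ; %obj is live across the call, so a relocating GC must be able to
  ; update it; the rewritten code would return the relocated value.
  call void @foo() [ "deopt"() ]
  ret i8 addrspace(1)* %obj
}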

Thanks,
-- Sanjoy

Hi!

Thank you for your insight.

This is the only case I have encountered so far. Also, if I switch from -O2 to -O0, the correct register is used for the indirect call.

By the way, we intend to use this feature together with libunwind to retrieve information for instrumentation and (for now) non-moving garbage collectors.

Daniel Mihalyi