llvm.stackprotector() and GlobalISel

Hi!

I am trying to enable stack protector generation on my target. The issue I stumbled across can also be seen on other targets, e.g. arm64-linux.

The IR instrumentation before the IRTranslator pass looks like this (when SelectionDAG SSP is used):

    %StackGuardSlot = alloca ptr, align 8
    %0 = call ptr @llvm.stackguard()
    call void @llvm.stackprotector(ptr %0, ptr %StackGuardSlot)

After the IRTranslator pass, the generic machine instructions look like this:

    %0:_(p0) = G_FRAME_INDEX %stack.0.StackGuardSlot
    %1:gpr64sp(p0) = LOAD_STACK_GUARD :: (dereferenceable invariant load (p0) from @__stack_chk_guard)
    %2:gpr64sp(p0) = LOAD_STACK_GUARD :: (dereferenceable invariant load (p0) from @__stack_chk_guard)
    G_STORE %2(p0), %0(p0) :: (volatile store (p0) into %stack.0.StackGuardSlot)

One LOAD_STACK_GUARD instruction is generated by llvm.stackguard(), the other by llvm.stackprotector(). The latter happens because the code in IRTranslator::translateKnownIntrinsic() ignores the value of the first parameter and generates a LOAD_STACK_GUARD instruction instead.
This looks like a deliberate decision, as there is also the arm64-irtranslator-stackprotect.ll test which explicitly sets the first argument to undef and checks that the llvm.stackprotector() call generates the LOAD_STACK_GUARD instruction.

My target OS is OpenBSD, where things get a bit stranger because the stack protector works only with IR instrumentation there. Since getSDagStackGuard() returns nullptr, I end up with a crippled LOAD_STACK_GUARD instruction that has no memory operand:

    %0:_(p0) = G_FRAME_INDEX %stack.0.StackGuardSlot
    %1:_(p0) = G_LOAD %2(p0) :: (volatile dereferenceable load (p0) from @__guard_local)
    %3:gprrc(p0) = LOAD_STACK_GUARD
    G_STORE %3(p0), %0(p0) :: (volatile store (p0) into %stack.0.StackGuardSlot, align 8)

The LLVM documentation for llvm.stackprotector clearly states that the intrinsic "takes the guard and stores it onto the stack at slot".

My question is: why is the first argument ignored? If this is a limitation of GlobalISel, then there should at least be a comment in the source, and maybe the documentation should be updated.
My first approach was to replace the call to getStackGuard() with a `COPY` of the first argument. This works in my case, but it breaks at least the test case mentioned above, so I am unsure whether there are other reasons behind this behaviour.

Best regards,
Kai

I don’t understand these intrinsics, but the DAG implementation suggests they have different behavior based on the target.

    if (TLI.useLoadStackGuardNode())
      Src = getLoadStackGuard(DAG, sdl, Chain);
    else
      Src = getValue(I.getArgOperand(0));   // The guard's value.

IRTranslator is missing a comparable check for useLoadStackGuardNode().
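To make the difference concrete, here is a small standalone toy model of that decision. It does not use any LLVM API: ToyTarget, lowerStackProtector() and the %guard.reload register name are hypothetical stand-ins, and the "instructions" are plain strings. It only sketches the intended behaviour, assuming the translator should mirror the SelectionDAG branch above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the target hook; NOT the real TargetLowering class.
struct ToyTarget {
  bool UseLoadStackGuardNode;
};

// Toy lowering of
//   call void @llvm.stackprotector(ptr %guard, ptr %slot)
// Returns the emitted "instructions" as strings.
std::vector<std::string> lowerStackProtector(const ToyTarget &TLI,
                                             const std::string &GuardArg,
                                             const std::string &Slot) {
  assert(!Slot.empty() && "the stack slot must be named");
  std::vector<std::string> Out;
  std::string Src;
  if (TLI.UseLoadStackGuardNode) {
    // Target wants the pseudo instruction: the first argument is ignored.
    Src = "%guard.reload"; // hypothetical register name
    Out.push_back(Src + " = LOAD_STACK_GUARD");
  } else {
    // IR-instrumentation-only target: store the guard value passed in.
    Src = GuardArg;
  }
  Out.push_back("G_STORE " + Src + ", " + Slot);
  return Out;
}
```

With UseLoadStackGuardNode set to false, the model emits only the store of the first argument, which is the shape wanted for the OpenBSD case described above.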

Thanks, I had not looked into the SDAG lowering code. This check would actually solve my problem.

What is still confusing is that the StackProtector::getStackGuard() method does not check TLI.useLoadStackGuardNode() and always inserts the llvm.stackguard() intrinsic. This leads to the duplicate LOAD_STACK_GUARD instructions. But that’s a different problem…

Kai

I created D129505 to add the check.