Inserting a branch in PPCTargetLowering::LowerFormalArguments_SVR4

Hello,

The current code in PPCTargetLowering::LowerFormalArguments_SVR4
contains a FIXME over the code which saves the live floating-point
registers to the stack. The FIXME states that this should only be done
if CR bit 6 is set. I've been told that the lack of this check is
preventing clang/LLVM from compiling a functional FreeBSD kernel on PPC.

Is it possible to insert another branch in LowerFormalArguments? Some
of the atomic instructions insert branches, but those are handled in
EmitInstrWithCustomInserter. Can branches be inserted inside
LowerFormalArguments in the same way?

Thanks in advance,
Hal

I hate to be bothersome, but can someone please comment on this?

Thanks again,
Hal

Hi Hal,

For lowering code that requires inserting branches, you need to use a custom inserter, yes. Theoretically, that does indeed sound like what you want to do here.

It's complicated by the general structure of argument passing, though. In particular, there are lots of assumptions baked into the call sequence handling. I don't know if things are smart enough (EH in particular worries me) to handle this sort of thing. On the plus side, there are copious assert() checks in there to catch it if the compiler gets itself confused, so you should know sooner rather than later if something goes wonky.

-Jim

Jim,

Thanks! Might it be possible to do this another way? For example, could
I insert some special pseudo-instructions and then turn them into
branches later on?

-Hal

Hi Hal,

Yes, you could definitely expand them later. Custom inserters are effectively just a very early form of pseudo-instruction expansion. You can even expand them very late, in the MC lowering (asmprinter), if you really need to. That could get tricky, as you'll effectively be fibbing to the register allocator. ARM uses quite a lot of late-expanded pseudo-instructions, for example, though generally not with branching involved, so the regalloc implications are more easily modeled.
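To make the "insert a pseudo now, turn it into a branch later" idea concrete, here is a toy sketch in plain C++. It does not use any LLVM API: the instruction stream is just a list of strings, and the pseudo-op name, the label names, and the stack offsets are all invented for illustration. It only shows the shape of the expansion, which here touches no registers beyond the ones being stored.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for an instruction stream; this is not LLVM's MachineInstr.
using Inst = std::string;

// Hypothetical pseudo-op name, invented for this sketch.
const Inst kSaveFPRsPseudo = "SAVE_VARARG_FPRS_IF_CR6";

// Replace each pseudo with "branch over the FP stores if CR bit 6 is clear".
// numFPRs is how many argument FPRs (f1..f8) to spill; slot offsets are made up.
std::vector<Inst> expandPseudos(const std::vector<Inst> &in, int numFPRs = 8) {
  assert(numFPRs >= 1 && numFPRs <= 8);
  std::vector<Inst> out;
  int labelId = 0;
  for (const Inst &inst : in) {
    if (inst != kSaveFPRsPseudo) {
      out.push_back(inst);  // ordinary instruction: copy through unchanged
      continue;
    }
    const std::string skip = ".Lskip_fpr_save" + std::to_string(labelId++);
    out.push_back("bc 4, 6, " + skip);  // BO=4: branch if CR bit 6 is 0
    for (int r = 1; r <= numFPRs; ++r)
      out.push_back("stfd " + std::to_string(r) + ", " +
                    std::to_string(8 * (r - 1)) + "(1)");
    out.push_back(skip + ":");
  }
  return out;
}
```

In a real backend the expansion would also have to split the basic block and update the CFG, which is where the register allocator and call-sequence assumptions come into play.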

From the context, I gather that CR6 changes state dynamically over the course of a program's execution? It's not a compile time determinable thing? If it were compile time, you could deal with this as a different calling convention instead, which changes which regs are call clobbered vs. call preserved. Sounds like this is more complicated than that, though.

-Jim

> Hi Hal,
>
> Yes, you could definitely expand them later. Custom inserters are
> effectively just a very early expand pseudo instruction. You can even
> expand them very late in the MC lowering (asmprinter) if you really
> need to. That could get tricky, as you'll effectively be fibbing to
> the register allocator.

This sounds dangerous ;) -- I suppose that so long as no additional
registers (that might be live at that point) are used by the branching,
then it might be okay.

> ARM uses quite a lot of late expanded pseudo-instructions, for
> example, though generally not w/ branching involved, so the regalloc
> implications are more easily modeled.
>
> From the context, I gather that CR6 changes state dynamically over
> the course of a program's execution? It's not a compile time
> determinable thing? If it were compile time, you could deal with this
> as a different calling convention instead, which changes which regs
> are call clobbered vs. call preserved. Sounds like this is more
> complicated than that, though.

This is an unfortunate special case, and it applies to va_arg functions.
The PPC32 ABI spec says:

> A caller of a function that takes a variable argument list shall set
> condition register bit 6 to 1 if it passes one or more arguments in
> the floating-point registers. It is strongly recommended that the
> caller set the bit to 0 otherwise, using the creqv 6, 6, 6 (set to 1)
> or crxor 6, 6, 6 (set to 0) instruction.
>
> The motivation for using the condition register bit is twofold.
> First, a function that takes a variable argument list may test
> condition register bit 6 to determine whether or not to store the
> floating-point argument registers in memory, thereby making
> execution of such functions more efficient when there are no
> floating-point arguments. Second, programs that do not otherwise use
> floating point need not acquire a floating-point state, with the
> attendant saving and restoring of the state on context switches,
> merely because they call functions with variable argument lists.

It seems that kernel-mode code (on FreeBSD and perhaps other systems as
well) depends on this behavior.
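
As a rough illustration of the caller-side rule quoted above, here is a small self-contained C++ model. The names ArgKind and cr6SetupForVarargCall are made up for this sketch, and it assumes a hard-float target where any floating-point argument means at least one FPR is used (the first FP argument goes in f1). It is a model of the rule only, not of what LLVM actually does.

```cpp
#include <string>
#include <vector>

// Simplified argument classification, invented for this sketch.
enum class ArgKind { Int, Float };

// Caller side: pick the CR instruction to emit before a variadic call.
// Assumption: any Float argument is passed in an FPR, so CR bit 6 must be 1.
std::string cr6SetupForVarargCall(const std::vector<ArgKind> &args) {
  for (ArgKind k : args)
    if (k == ArgKind::Float)
      return "creqv 6, 6, 6";  // sets CR bit 6 to 1
  return "crxor 6, 6, 6";      // sets CR bit 6 to 0
}

// Callee side: the FPR spill in the prologue only needs to run when the
// caller set CR bit 6. This is the check the FIXME asks for.
int fprStoresExecuted(bool cr6Set, int numArgFPRs = 8) {
  return cr6Set ? numArgFPRs : 0;
}
```

With this model, a kernel-style call that passes only integers yields crxor 6, 6, 6, and the callee then executes no floating-point stores, so it never touches FP state.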

-Hal