Branch delay slots broken.

The Sparc, Microblaze, and Mips code generators implement branch delay slots. They all seem to exhibit the same bug, which is not surprising since the code is very similar. If I compile code with this snippit:

   while (n--)
     *s++ = (char) c;

I get this (for the Microblaze):

         swi r19, r1, 0
         add r3, r0, r0
         cmp r3, r3, r7
         beqid r3, ($BB0_3)
         brid ($BB0_1)
         add r19, r1, r0
         add r3, r5, r0
$BB0_2:
         addi r4, r3, 1
         addi r7, r7, -1
         add r8, r0, r0
         sbi r6, r3, 0
         cmp r8, r8, r7
         bneid r8, ($BB0_2)
         brid ($BB0_3)
         add r3, r4, r0
$BB0_3:

Notice that the label $BB0_1 is missing. If I disable filling in the branch delay slots, I get:

         swi r19, r1, 0
         add r19, r1, r0
         add r3, r0, r0
         cmp r3, r3, r7
         beqid r3, ($BB0_3)
         brid ($BB0_1)
$BB0_1:
         add r3, r5, r0
$BB0_2:
         addi r4, r3, 1
         addi r7, r7, -1
         add r8, r0, r0
         sbi r6, r3, 0
         cmp r8, r8, r7
         add r3, r4, r0
         bneid r8, ($BB0_2)
         brid ($BB0_3)
$BB0_3:

A similar thing happens with the Mips and Sparc, although they just fill in the delay slots with NOPs.

My question is, why would inserting a NOP or splicing an instruction in a MachineBasicBlock cause a label to go missing? Could someone point me to appropriate places to look?

FYI, clang is the front end.

-Rich

Notice that the label $BB0_1 is missing. If I disable filling in the
branch delay slots, I get:

Is this with the latest SVN HEAD version of LLVM or some other version? The delay slot filler and many other things have been updated for the Microblaze backend. In particular, the commit r120095 for the MBlaze backend fixed some issues with missing block labels due to AsmPrinter::isBlockOnlyReachableByFallthrough not taking into account delay slots in basic blocks.

My question is, why would inserting a NOP or splicing an instruction in
a MachineBasicBlock cause a label to go missing? Could someone point me
to appropriate places to look?

The problem is that AsmPrinter::isBlockOnlyReachableByFallthrough assumes that the last instruction in a basic block is a terminator instruction, but when the terminator has a delay slot then the last instruction is not a terminator; its is a delay slot filler. This can lead AsmPrinter to think that some blocks can only be reach by a fallthrough condition even when they are branched to. When this happens it omits the block label and the assembly becomes invalid.

I fixed this in the MBlaze backend by overriding AsmPrinter::isBlockOnlyReachableByFallthrough and replaced (at the end of the function):

// Otherwise, check the last instruction.
const MachineInstr &LastInst = Pred->back();
return !LastInst.getDesc().isBarrier();

with:

// Check if the last terminator is an unconditional branch.
MachineBasicBlock::const_iterator I = Pred->end();
while (I != Pred->begin() && !(--I)->getDesc().isTerminator())
   ; // Noop
return I == Pred->end() || !I->getDesc().isBarrier();

Notice, that the original version assumes that the last instruction is the last terminator for the basic block whereas the version in the latest Microblaze backend searches for the last terminator in the basic block.

As a note, when I run the code you gave through the latest version of the Microblaze backend I do not have the problems that you describe any longer.

[snip]

Thanks Wesley! I'm currently using r119910. I'll update.

-Rich

Wesley,

You changes work for me also. I applied something similar to the Mips CG and they worked there also. Thanks again.

-Rich

Hi Richard,

You changes work for me also. I applied something similar to the Mips CG
and they worked there also. Thanks again.

I can't reproduce the same problem here for Mips using clang, could
you please attach the bitcode you used?
Thanks

Hi Bruno,

This was the bitcode from a simple memset function:

; ModuleID = '../../../src/c/string/memset.c'
target datalayout = "E-p:32:32:32-i1:8:8-i8:8:32-i16:16:32-i32:32:32-i64:32:64-f32:32:32-f64:64:64-v64:64:64-n32"
target triple = "mips-ellcc-unknown"

define i8* @memset(i8* %m, i32 %c, i32 %n) nounwind {
entry:
   %tobool5 = icmp eq i32 %n, 0
   br i1 %tobool5, label %while.end, label %bb.nph

bb.nph: ; preds = %entry
   %conv = trunc i32 %c to i8
   br label %while.cond

while.cond: ; preds = %while.cond, %bb.nph
   %indvar = phi i32 [ 0, %bb.nph ], [ %indvar.next, %while.cond ]
   %s.07 = getelementptr i8* %m, i32 %indvar
   store i8 %conv, i8* %s.07, align 1, !tbaa !0
   %indvar.next = add i32 %indvar, 1
   %exitcond = icmp eq i32 %indvar.next, %n
   br i1 %exitcond, label %while.end, label %while.cond

while.end: ; preds = %while.cond, %entry
   ret i8* %m
}

As I recall, it was the bb.nph label that was lost.

-Rich

I still can't reproduce because it's fixed in trunk for Mips and Sparc
for a long time now, you must be using a really old version! :slight_smile:
Just for reference, it was fixed for Mips and Sparc in r108820 and r96492.
Thanks anyway!

You're right Bruno. I must have been sleepy when I thought I saw it wasn't working.

Sorry about the false alarm.

-Rich

Bruno,

I figured out what triggers the problem. Clang defaults to -O0. If you run that bitcode through llc with -O0, the label gets lost. -O1 and above is OK.

-Rich