[StructurizeCFG] Trouble with branches out of a loop

Hi,

I’ve been investigating the StructurizeCFG pass, and it looks like it has trouble handling CFG edges that break out of a loop and go directly to the function exit. Am I running up against a bug in the structurizer, or a general limitation of the algorithm used? As an aside, is there any documentation for the algorithm used? Is it based on a published paper?

The input IR I have is the following:

define <4 x float> @structurizer_test(<4 x float> %inp.coerce) {
%1 = extractelement <4 x float> %inp.coerce, i32 0
%2 = fcmp ogt float %1, 0.000000e+00
br i1 %2, label %.lr.ph.i, label %._crit_edge.i

.lr.ph.i: ; preds = %7, %0
%i.03.i = phi float [ %8, %7 ], [ 0.000000e+00, %0 ]
%ret.02.i = phi <4 x float> [ %5, %7 ], [ <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>, %0 ]
%3 = extractelement <4 x float> %ret.02.i, i32 0
%4 = fadd fast float %3, 0xBFB99999A0000000
%5 = insertelement <4 x float> %ret.02.i, float %4, i32 0
%6 = fcmp olt float %4, 5.000000e-01
br i1 %6, label %_Z9get_colorDv2_f.exit, label %7

; :7 ; preds = %.lr.ph.i
%8 = fadd fast float %i.03.i, 1.000000e+01
%9 = fcmp olt float %8, %1
br i1 %9, label %.lr.ph.i, label %._crit_edge.i

._crit_edge.i: ; preds = %7, %0
%ret.0.lcssa.i = phi <4 x float> [ <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>, %0 ], [ %5, %7 ]
%10 = insertelement <4 x float> %ret.0.lcssa.i, float 0.000000e+00, i32 2
br label %_Z9get_colorDv2_f.exit

_Z9get_colorDv2_f.exit: ; preds = %._crit_edge.i, %.lr.ph.i
%.0.i = phi <4 x float> [ %10, %._crit_edge.i ], [ %5, %.lr.ph.i ]
ret <4 x float> %.0.i
}

After structurization, I have a module that has what looks like a reasonable CFG, but bad branch conditions and PHIs:

define <4 x float> @structurizer_test(<4 x float> %inp.coerce) {
%1 = extractelement <4 x float> %inp.coerce, i32 0
%2 = fcmp ogt float %1, 0.000000e+00
%3 = xor i1 %2, true
br label %Flow

Flow: ; preds = %Flow1, %0
%4 = phi <4 x float> [ %14, %Flow1 ], [ <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>, %0 ]
%5 = phi <4 x float> [ %16, %Flow1 ], [ <float 1.000000e+00, float 1.000000e+00, float 1.000000e+00, float 1.000000e+00>, %0 ]
%6 = phi float [ %17, %Flow1 ], [ 0.000000e+00, %0 ]
%7 = phi i1 [ %18, %Flow1 ], [ %3, %0 ]
%8 = phi i1 [ false, %Flow1 ], [ %2, %0 ]
br i1 %8, label %.lr.ph.i, label %Flow1

.lr.ph.i: ; preds = %Flow
%i.03.i = phi float [ %6, %Flow ]
%ret.02.i = phi <4 x float> [ %5, %Flow ]
%9 = extractelement <4 x float> %ret.02.i, i32 0
%10 = fadd fast float %9, 0xBFB99999A0000000
%11 = insertelement <4 x float> %ret.02.i, float %10, i32 0
%12 = fcmp olt float %10, 5.000000e-01
%13 = xor i1 %12, true
br i1 %13, label %19, label %Flow2

Flow1: ; preds = %Flow2, %Flow
%14 = phi <4 x float> [ %23, %Flow2 ], [ %4, %Flow ]
%15 = phi <4 x float> [ %11, %Flow2 ], [ undef, %Flow ]
%16 = phi <4 x float> [ %24, %Flow2 ], [ %5, %Flow ]
%17 = phi float [ %25, %Flow2 ], [ %6, %Flow ]
%18 = phi i1 [ %26, %Flow2 ], [ %7, %Flow ]
br i1 true, label %Flow3, label %Flow

; :19 ; preds = %.lr.ph.i
%20 = fadd fast float %i.03.i, 1.000000e+01
%21 = fcmp olt float %20, %1
%22 = xor i1 %21, true
br label %Flow2

Flow2: ; preds = %19, %.lr.ph.i
%23 = phi <4 x float> [ %11, %19 ], [ %4, %.lr.ph.i ]
%24 = phi <4 x float> [ %11, %19 ], [ undef, %.lr.ph.i ]
%25 = phi float [ %20, %19 ], [ undef, %.lr.ph.i ]
%26 = phi i1 [ %22, %19 ], [ %7, %.lr.ph.i ]
br label %Flow1

Flow3: ; preds = %Flow1
br i1 %18, label %._crit_edge.i, label %_Z9get_colorDv2_f.exit

._crit_edge.i: ; preds = %Flow3
%ret.0.lcssa.i = phi <4 x float> [ %14, %Flow3 ]
%27 = insertelement <4 x float> %ret.0.lcssa.i, float 0.000000e+00, i32 2
br label %_Z9get_colorDv2_f.exit

_Z9get_colorDv2_f.exit: ; preds = %._crit_edge.i, %Flow3
%.0.i = phi <4 x float> [ %15, %Flow3 ], [ %27, %._crit_edge.i ]
ret <4 x float> %.0.i
}

Note the undef values in some of the PHIs and ‘i1 true’ for the loop branch condition.

structurizer.ll (1.37 KB)

+ Christian

Hi,

I've been investigating the StructurizeCFG pass, and it looks like it has
trouble handling CFG edges that break out of a loop and go directly to the
function exit. Am I running up against a bug in the structurizer, or a
general limitation of the algorithm used? As an aside, is there any
documentation for the algorithm used? Is it based on a published paper?

I'm not aware of any limitations, so this is a bug. Can you file one at
llvm.org/bugs.

-Tom

https://llvm.org/bugs/show_bug.cgi?id=25378

Thanks!

Hi Justin,

yeah entirely possible that this case isn't handled correctly.

As an aside, is there any documentation for the algorithm used? Is it based on a published paper?

Not really. IIRC I read about the general idea in some paper about efficient code generation for DSPs, but essentially I came up with most of the algorithm myself.

Basically it just does the following. Assuming you have a control flow like this:

while (x) {
     a();
     if (b())
         break;
     c();
}

It reworks the flow into something like this:

flow = true;
while (x && flow) {
     a();
     if (b())
         flow = false;
     if (flow)
         c();
}

This is done by ordering all the basic blocks so you have as less back edges as possible and as less dependencies as possible between the basic blocks.

Then all the conditions that result in the execution of a block are noted in the maps and hash tables.

And last all the edges that don't fit into a structurized form are replaced with conditional execution.

Regards,
Christian.