EFLAGS and MVT::Glue

The log message for revision 122213 says:

Change the X86 backend to stop using the evil ADDC/ADDE/SUBC/SUBE nodes (which
their carry depenedencies with MVT::Flag operands) and use clean and beautiful
EFLAGS dependences instead.

(MVT::Flag has since been renamed to MVT::Glue.)

That revision made bug 8404 go away.

Am I right in thinking that one of the problems with MVT::Glue is that
it is hard to guarantee that other instructions won't come between the
two instructions that are glued together? And another problem is that
you actually want to allow some instructions to come between them?

Suppose, for example, that we have a selection DAG with these edges:

X -> G1 -> G2 -> Y
X -> A -> Y

where the dependency between G1 and G2 is Glue.

Then if instruction selection replaces X and G1 by a single
instruction I1, and G2 and Y by a single instruction I2, then we end
up with this DAG:

I1 -> I2
I1 -> A -> I2

Now I1 and I2 are glued together, but A has to come between them. It's
hard to know when a combination of rules that each look all right on
their own might lead to this happening.

I'm guessing that something like this lead to the "Wrong topological
sorting" assertion failure in bug 8404. (I've seen the same assertion
failure in some other cases where I have more reason to think that
that's roughly what happened.)

A real example to consider might be code like this:

  do {
    a[i] -= b[i];
  } while (a[i++] >= 0);

I'm currently getting ARM code like this:

.LBB0_1:
        ldr r2, [r1], #4
        ldr r3, [r0]
        sub r2, r3, r2
        str r2, [r0], #4
        cmp r2, #0
        bge .LBB0_1

This could be improved, I think, by getting the subtract to set the
flags instead of comparing with zero, like this:

.LBB0_1:
        ldr r2, [r1], #4
        ldr r3, [r0]
        subs r2, r3, r2
        str r2, [r0], #4
        bpl .LBB0_1

But that code might be hard to generate if the flag-setting SUBS is
"glued" to the conditional branch BPL so you can't put the store
between them.

So, some questions:

* Is there a set of rules for using Glue to avoid getting "Wrong
  topological sorting"?

* If not, should we be avoiding Glue altogether?

* How hard is it to replace Glue in other back ends by something like
  EFLAGS? Should we be doing that?

Edmund

Hello Edmund,

Am I right in thinking that one of the problems with MVT::Glue is that
it is hard to guarantee that other instructions won't come between the
two instructions that are glued together?

No. This is actually why MVT::Glue was introduced: to always glue
stuff altogether.

* How hard is it to replace Glue in other back ends by something like
EFLAGS? Should we be doing that?

This is not possible for ARM right now, because latency-modelling scheduler
cannot handle physreg deps (there is PR for this). But in general, adding more
"degrees of freedom" is a generic goodness.

I’d like to point out some issues I encountered when using MVT::Flag
Basically, applying some transformations might introduce some cycles in the graph (it’s not a DAG anymore !).

Let’s consider your example:
sub r2, r3, r2
cmp r2, #0
bge .LBB0_1

The initial graph might look like this one:
I1: sub(r3,r2) ==> I2: CopyToReg
I1 ==> I3: cmp r2, #0
I2 (input chain) & I3 (condition code) ==> I4: bge
with I3 and I4 glued together.

Now let’s assume you remove the cmp instruction:
I1: sub(r3,r2) ==> I2: CopyToReg
I2(input chain) & I1 (condition code) ==> I4:bge
with I1 and I4 glued together.
And we just created a cycle in the graph:

  • a successor of I1 is I4
  • a successor of I4 is I2 (because I1 and I4 are glued & I2 is a successor of I1)
  • a direct successor if I2 is I4
    and so the cycle is there !

Please tell me if I missed something… But to me, it’s “easy” to have some cycles in the graph even if it still looks a DAG when you draw the graph.

Damien