[buildSchedGraph] memory dependencies

Hi,

(This only concerns MISNeedChainEdge(), and is separate from D8705)

I found out that the MIScheduler (pre-ra) could not handle a simple test case (test/CodeGen/SystemZ/alias-01.ll), with 16 independent load / add / stores.

The buildSchedGraph() put too many edges between memory accesses, because
1) There was no implementation of areMemAccessesTriviallyDisjoint() for SystemZ.
2) Type Based Alias Analysis (TBAA) was not used.

I have gotten rid of the edges on an experimental level, and would now like some help and feedback:

1): It seems common for targets to check for the same virtual address register and then figure out via offsets/sizes if they overlap. When faced with the task of doing this, I realized that the MachineMemOperands might do the job. I tried the following and wonder why this is not done already, perhaps as a default implementation of areMemAccessesTriviallyDisjoint(). Is it because memory operands may be missing, or is it because MachineCombiner may give the register based analysis an advantage? This is a check in the case of the *same Value*. In this case the Value is an argument, which is unsafe against others, but I am thinking it should at least be safe against itself...

diff --git a/lib/CodeGen/ScheduleDAGInstrs.cpp b/lib/CodeGen/ScheduleDAGInstrs.cpp
index 00a0b0f..cd48f51 100644
--- a/lib/CodeGen/ScheduleDAGInstrs.cpp
+++ b/lib/CodeGen/ScheduleDAGInstrs.cpp
@@ -584,6 +584,25 @@ static bool MIsNeedChainEdge(AliasAnalysis *AA, const MachineFrameInfo *MFI,
    if (!MIa->hasOneMemOperand() || !MIb->hasOneMemOperand())
      return true;

+ // If mem-operands show that the same address Value is used by both
+ // ("normal") instructions, simply check offsets and sizes of the
+ // accesses.
+ MachineMemOperand *MMOa = *MIa->memoperands_begin();
+ MachineMemOperand *MMOb = *MIb->memoperands_begin();
+ const Value *VALa = MMOa->getValue();
+ const Value *VALb = MMOb->getValue();
+ if (VALa == VALb &&
+ !MIa->hasUnmodeledSideEffects() && !MIb->hasUnmodeledSideEffects() &&
+ !MIa->hasOrderedMemoryRef() && !MIb->hasOrderedMemoryRef()) {
+ int OffsetA = MMOa->getOffset(), OffsetB = MMOb->getOffset();
+ int WidthA = MMOa->getSize(), WidthB = MMOb->getSize();
+ int LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB;
+ int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA;
+ int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB;
+ if (LowOffset + LowWidth <= HighOffset)
+ return false;
+ }

From: "Jonas Paulsson via llvm-dev" <llvm-dev@lists.llvm.org>
To: "llvm-dev" <llvm-dev@lists.llvm.org>
Sent: Wednesday, February 3, 2016 4:41:33 AM
Subject: [llvm-dev] [buildSchedGraph] memory dependencies

Hi,

(This only concerns MISNeedChainEdge(), and is separate from D8705)

I found out that the MIScheduler (pre-ra) could not handle a simple
test
case (test/CodeGen/SystemZ/alias-01.ll), with 16 independent load /
add
/ stores.

The buildSchedGraph() put too many edges between memory accesses,
because
1) There was no implementation of areMemAccessesTriviallyDisjoint()
for
SystemZ.
2) Type Based Alias Analysis (TBAA) was not used.

I have gotten rid of the edges on an experimental level, and would
now
like some help and feedback:

1): It seems common for targets to check for the same virtual address
register and then figure out via offsets/sizes if they overlap. When
faced with the task of doing this, I realized that the
MachineMemOperands might do the job. I tried the following and wonder
why this is not done already, perhaps as a default implementation of
areMemAccessesTriviallyDisjoint(). Is it because memory operands may
be
missing, or is it because MachineCombiner may give the register based
analysis an advantage? This is a check in the case of the *same
Value*.
In this case the Value is an argument, which is unsafe against
others,
but I am thinking it should at least be safe against itself...

diff --git a/lib/CodeGen/ScheduleDAGInstrs.cpp
b/lib/CodeGen/ScheduleDAGInstrs.cpp
index 00a0b0f..cd48f51 100644
--- a/lib/CodeGen/ScheduleDAGInstrs.cpp
+++ b/lib/CodeGen/ScheduleDAGInstrs.cpp
@@ -584,6 +584,25 @@ static bool MIsNeedChainEdge(AliasAnalysis *AA,
const MachineFrameInfo *MFI,
    if (!MIa->hasOneMemOperand() || !MIb->hasOneMemOperand())
      return true;

+ // If mem-operands show that the same address Value is used by
both
+ // ("normal") instructions, simply check offsets and sizes of the
+ // accesses.
+ MachineMemOperand *MMOa = *MIa->memoperands_begin();
+ MachineMemOperand *MMOb = *MIb->memoperands_begin();
+ const Value *VALa = MMOa->getValue();
+ const Value *VALb = MMOb->getValue();
+ if (VALa == VALb &&
+ !MIa->hasUnmodeledSideEffects() &&
!MIb->hasUnmodeledSideEffects() &&
+ !MIa->hasOrderedMemoryRef() && !MIb->hasOrderedMemoryRef()) {
+ int OffsetA = MMOa->getOffset(), OffsetB = MMOb->getOffset();
+ int WidthA = MMOa->getSize(), WidthB = MMOb->getSize();
+ int LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB;
+ int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA;
+ int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB;
+ if (LowOffset + LowWidth <= HighOffset)
+ return false;
+ }
+
    if (isUnsafeMemoryObject(MIa, MFI, DL) ||
    isUnsafeMemoryObject(MIb,
MFI, DL))
      return true;

This looks reasonable to me.

2) The TBAA tags should separate the loads from the stores. In the MF
I see
...
%vreg32<def> = L %vreg0, 56, %noreg;
mem:LD4[%src1(align=64)+56](align=8)(tbaa=!1) GR32Bit:%vreg32
ADDR64Bit:%vreg0
...
ST %vreg33, %vreg1, 60, %noreg;
mem:ST4[%dest(align=64)+60](align=4)(tbaa=!4) GR32Bit:%vreg33
ADDR64Bit:%vreg1
...

Since the tbba tags are part of the MachineMemOperands, it is easy to
just continue the patch (above isUnsafeMemoryObject() checks) with

+ AAMDNodes AATagsA = MMOa->getAAInfo(), AATagsB =
MMOb->getAAInfo();
+ if (AATagsA->TBAA && AATagsB->TBAA && AATagsA != AATagsB)
+ return false;

I see that there is a TBAA method that is intended to give this type
of
result. So I wonder why this is not used?

Would it be correct to implement SystemZs
areMemAccessesTriviallyDisjoint() with the above code? (I am
suspecting
that perhaps my AAMDNodes check is incomplete)

I don't understand this. The whole point of allowing backends use AA is to let the existing AA infrastructure take care of this. As I recall, SystemZ enabled AA in the backend, so why do you need anything special here?

-Hal

diff --git a/lib/CodeGen/ScheduleDAGInstrs.cpp
b/lib/CodeGen/ScheduleDAGInstrs.cpp
index 00a0b0f..cd48f51 100644
--- a/lib/CodeGen/ScheduleDAGInstrs.cpp
+++ b/lib/CodeGen/ScheduleDAGInstrs.cpp
@@ -584,6 +584,25 @@ static bool MIsNeedChainEdge(AliasAnalysis *AA,
const MachineFrameInfo *MFI,
     if (!MIa->hasOneMemOperand() || !MIb->hasOneMemOperand())
       return true;

+ // If mem-operands show that the same address Value is used by
both
+ // ("normal") instructions, simply check offsets and sizes of the
+ // accesses.
+ MachineMemOperand *MMOa = *MIa->memoperands_begin();
+ MachineMemOperand *MMOb = *MIb->memoperands_begin();
+ const Value *VALa = MMOa->getValue();
+ const Value *VALb = MMOb->getValue();
+ if (VALa == VALb &&
+ !MIa->hasUnmodeledSideEffects() &&
!MIb->hasUnmodeledSideEffects() &&
+ !MIa->hasOrderedMemoryRef() && !MIb->hasOrderedMemoryRef()) {
+ int OffsetA = MMOa->getOffset(), OffsetB = MMOb->getOffset();
+ int WidthA = MMOa->getSize(), WidthB = MMOb->getSize();
+ int LowOffset = OffsetA < OffsetB ? OffsetA : OffsetB;
+ int HighOffset = OffsetA < OffsetB ? OffsetB : OffsetA;
+ int LowWidth = (LowOffset == OffsetA) ? WidthA : WidthB;
+ if (LowOffset + LowWidth <= HighOffset)
+ return false;
+ }
+
     if (isUnsafeMemoryObject(MIa, MFI, DL) ||
     isUnsafeMemoryObject(MIb,
MFI, DL))
       return true;

This looks reasonable to me.

OK - I will try this then for SystemZ as a starting point, and I also need to do handle PseudoSourceValues, I guess.
It would be great to try this on a target that has done the register analysis, and see if that is worth the effort... Any idea how to do that?

+ AAMDNodes AATagsA = MMOa->getAAInfo(), AATagsB =
MMOb->getAAInfo();
+ if (AATagsA->TBAA && AATagsB->TBAA && AATagsA != AATagsB)
+ return false;

I see that there is a TBAA method that is intended to give this type
of
result. So I wonder why this is not used?

Would it be correct to implement SystemZs
areMemAccessesTriviallyDisjoint() with the above code? (I am
suspecting
that perhaps my AAMDNodes check is incomplete)
I don't understand this. The whole point of allowing backends use AA is to let the existing AA infrastructure take care of this. As I recall, SystemZ enabled AA in the backend, so why do you need anything special here?

  -Hal

I think I realized what was wrong: The AA was never called in the first place, because the old isUnsafeMemoryObject() rejected the MI, because the value could not be identified (argument). With the new patch applied (D8705), AA was called and it works as you expected.

thanks,

Jonas