How to legalize i32xi32=i64 multiply

DagCombine()

case Intrinsic::arc_ACC64_NOGUARD_MPYWHFL: {

SDValue mpy(DAG.getMachineNode(ARC::MPYWHFL_null_rr, SDLoc(N), MVT::Glue, N->getOperand(1), N->getOperand(2)), 0);

SDValue hi = DAG.getCopyFromReg(mpy, SDLoc(N), ARC::ACCHI, MVT::i32);

SDValue lo = DAG.getCopyFromReg(mpy, SDLoc(N), ARC::ACCLO, MVT::i32);

return DAG.getNode(ISD::BUILD_PAIR, SDLoc(N), MVT::i64, lo, hi);

}

Hi, my architecture performs a fixed-point multiply (i32 x i32) and stores the saturated result in a 64-bit accumulator. MVT::i64 is not legal on my target.

This code sequence attempts to transform:

i64 = MPY i32,i32

0xbf4382c: i32,ch = load 0xbf1dc2c, 0xbf43688, 0xbf437a0<LD4@a31> [ORD=1]

0xbf43944: i32,ch = load 0xbf1dc2c, 0xbf438b8, 0xbf437a0<LD4@b31> [ORD=2]

0xbf439d0: i32 = TargetConstant<135>

0xbf43a5c: i64 = llvm.arc.ACC64.NOGUARD.MUL.SQ31.SQ31 0xbf439d0, 0xbf4382c, 0xbf43944 [ORD=3]

To:

ch = MPY i32,i32

i32,ch = Copy Accumulator.hi

i32,ch = Copy Accumulator.lo

i64 = BUILD_PAIR lo,hi

0xbf43f48: i32,ch = CopyFromReg 0xbf43e30, 0xbf43ebc [ORD=3]

0xbf44060: i32,ch = CopyFromReg 0xbf43e30, 0xbf43fd4 [ORD=3]

0xbf440ec: i64 = build_pair 0xbf43f48, 0xbf44060 [ORD=3]

0xbf43e30: glue = MPYDF_null_rr 0xbf4382c, 0xbf43944 [ORD=3]

The problem here is the multiply is ordered wrong. I think I have the glue specified correctly in the code sequence I show at the top. What am I missing?

Thanks in advance

Hi Mark,

> The problem here is the multiply is ordered wrong. I think I have the glue
> specified correctly in the code sequence I show at the top. What am I
> missing?

There are a couple of possibilities. First, I wouldn't worry about
that printout of the DAG; it takes that form even when the glue is
doing its job properly (e.g. x86 div). The dependencies are still in
place via the node addresses that appear as operands.
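
To illustrate why the print order is harmless, here is a toy model (not LLVM code, and not how LLVM's scheduler is actually written). ToyNode, scheduleOrder and dumpOrderExample are made-up names. The nodes are listed in the same order as the dump, with the multiply last, and an ordering that follows the operand references still places it first.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for a DAG node: a name plus the indices of its operand nodes.
// This is an illustration only; it does not mirror LLVM's SDNode.
struct ToyNode {
    std::string name;
    std::vector<std::size_t> operands;
};

// Return node names in an order where every operand precedes its user
// (a depth-first post-order walk; assumes the graph has no cycles).
std::vector<std::string> scheduleOrder(const std::vector<ToyNode>& nodes) {
    std::vector<std::string> order;
    std::vector<bool> done(nodes.size(), false);
    std::function<void(std::size_t)> visit = [&](std::size_t i) {
        assert(i < nodes.size());
        if (done[i]) return;
        done[i] = true;
        for (std::size_t op : nodes[i].operands) visit(op);
        order.push_back(nodes[i].name);
    };
    for (std::size_t i = 0; i < nodes.size(); ++i) visit(i);
    return order;
}

// Nodes in the same order as the dump: two copies, build_pair, then the multiply.
std::vector<ToyNode> dumpOrderExample() {
    return {
        {"CopyFromReg #1", {3}},   // refers to the multiply (index 3)
        {"CopyFromReg #2", {3}},   // refers to the multiply (index 3)
        {"build_pair", {0, 1}},    // refers to both copies
        {"MPYDF_null_rr", {}},     // printed last, but nothing precedes it
    };
}
```

Calling scheduleOrder(dumpOrderExample()) puts MPYDF_null_rr ahead of both copies, because the operand references decide the order rather than the position in the listing.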

The most likely issue is that glue only works when there's the
potential for a dependency. The MPYDF instruction needs to be marked
as defining the accumulator somehow, probably via a "let Defs =
[ACCHI, ACCLO]" line in its .td definition. Without that, I don't
think LLVM uses glue for its ordering, and even more problems will
crop up later (MPYDF won't take part in liveness tracking for ACC,
which is really bad).
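
A rough way to picture the liveness point, again as a toy model rather than anything taken from LLVM: ToyInst and reachingDef are hypothetical names, and the "defs" list stands in for whatever the instruction's .td definition declares. A later read of ACCHI can only be tied back to the multiply if the multiply declares that it writes ACCHI.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy instruction: a name plus the registers it is declared to define.
// The defs list plays the role of a "Defs" declaration; it is an assumption
// made for illustration, not LLVM's MachineInstr representation.
struct ToyInst {
    std::string name;
    std::vector<std::string> defs;
};

// Walk backwards from position `user` and return the index of the nearest
// earlier instruction that declares `reg` as defined, or -1 if there is none.
int reachingDef(const std::vector<ToyInst>& seq, int user, const std::string& reg) {
    for (int i = user - 1; i >= 0; --i) {
        const std::vector<std::string>& d = seq[i].defs;
        if (std::find(d.begin(), d.end(), reg) != d.end()) return i;
    }
    return -1;
}

// The multiply declares both accumulator halves as defined.
std::vector<ToyInst> withDefs() {
    return {{"MPYDF_null_rr", {"ACCHI", "ACCLO"}},
            {"copy from ACCHI", {}},
            {"copy from ACCLO", {}}};
}

// The multiply declares nothing, so the copies have no visible producer.
std::vector<ToyInst> withoutDefs() {
    return {{"MPYDF_null_rr", {}},
            {"copy from ACCHI", {}},
            {"copy from ACCLO", {}}};
}
```

With withDefs() the copy at index 1 finds the multiply at index 0 as the producer of ACCHI; with withoutDefs() the lookup returns -1, which is the toy equivalent of the multiply being invisible to anything that tracks the accumulator.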

The other slight oddity is using glue in a tree structure. I've only
ever seen it in a linear path (e.g. lo using hi's glue rather than
mpy's). I don't know if this is an actual issue, or LLVM can cope with
both, but if all else fails...

Cheers.

Tim.