Pats with destinations

I have

def : Pat<(nxv2f64 (fneg nxv2f64:$src)),
          (FNEG_ZPmZ_D (PTRUE_D 31), ZPR:$src)>;

and I get

 In anonymous_32547: Instruction 'FNEG_ZPmZ_D' expects more than the provided 2 operands!

from llvm-tblgen -gen-instr-info. I bet that I need

def : Pat<(nxv2f64 (fneg nxv2f64:$src)),
          (FNEG_ZPmZ_D ZPR:$dst, (PTRUE_D 31), ZPR:$src)>;

but where do I get $dst from?

	(FNEG_ZPmZ_D:{ *:[nxv16i8 nxv8i16 nxv4i32 nxv2i64 nxv2f16 nxv4f16 nxv8f16 nxv2bf16 nxv4bf16 nxv8bf16 nxv2f32 nxv4f32 nxv2f64] } (PTRUE_D:{ *:[nxv16i8 nxv8i16 nxv4i32 nxv2i64 nxv2f16 nxv4f16 nxv8f16 nxv2bf16 nxv4bf16 nxv8bf16 nxv2f32 nxv4f32 nxv2f64] } 31:{}), ZPR:{ *:[nxv1i1 nxv2i1 nxv4i1 nxv8i1 nxv16i1] }:$src)

It looks as if it has three operands.

There is no $dst in Pats. Double check the definition of this instruction. Look in the actual table generated for it (--print-records).

-print-records needs some help.

 defm FNEG_ZPmZ : sve_int_un_pred_arit_bitwise_fp<0b101, "fneg", AArch64fneg_mt>;

and goes back to

class sve_int_un_pred_arit<bits<2> sz8_64, bits<4> opc,
                             string asm, ZPRRegOp zprty>
: I<(outs zprty:$Zd), (ins zprty:$_Zd, PPR3bAny:$Pg, zprty:$Zn),
  asm, "\t$Zd, $Pg/m, $Zn",
  "",
  []>, Sched<[]> {
  bits<3> Pg;
  bits<5> Zd;
  bits<5> Zn;
  let Inst{31-24} = 0b00000100;
  let Inst{23-22} = sz8_64;
  let Inst{21-20} = 0b01;
  let Inst{19}    = opc{0};
  let Inst{18-16} = opc{3-1};
  let Inst{15-13} = 0b101;
  let Inst{12-10} = Pg;
  let Inst{9-5}   = Zn;
  let Inst{4-0}   = Zd;

  let Constraints = "$Zd = $_Zd";
  let DestructiveInstType = DestructiveUnaryPassthru;
  let ElementSize = zprty.ElementSize;
  let hasSideEffects = 0;
}

Note that there are two zs and one p. Probably input z, output z, and predicate.

and (ins zprty:$_Zd, PPR3bAny:$Pg, zprty:$Zn), reads like three inputs.

@sdesmalen-arm @paulwalker-arm

IIRC this is a merging predicated instruction, so the destination register retains its existing values for inactive lanes, and that’s why you see the extra input operand.

def AArch64fneg_mt : SDNode<"AArch64ISD::FNEG_MERGE_PASSTHRU", SDT_AArch64Arith>;

is in the instruction def.

Yes @amara is right. The instruction is predicated and merges the result into the source-and-also-destination operand (you can see they’ve got a tied operand constraint):

  (outs zprty:$Zd), (ins zprty:$_Zd, PPR3bAny:$Pg, zprty:$Zn),
  asm, "\t$Zd, $Pg/m, $Zn",
  ....
  let Constraints = "$Zd = $_Zd";

Basically, the unary op fneg negates the input from $Zn and does a predicated move of the result into the source/destination operand $Zd/$_Zd.

The register allocator must ensure that Zd (the destination) equals _Zd (the source operand into which the result is merged). Because Zd = _Zd, the instruction is printed as opcode $Zd, $Pg/m, $Zn.

We’ve tried to reflect this in the name of the instruction (Pm = merging), as opposed to FNEG_ZPzZ, which zeroes the inactive lanes. FNEG_ZPzZ does not take another vector source operand, because none of its lanes would be used (i.e. the inactive lanes would be zeroed).
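
To make the merging/zeroing distinction concrete, here is a toy lane-by-lane model in plain Python. It is not LLVM code: lists stand in for vector and predicate registers, and the function names fneg_merging and fneg_zeroing are made up for illustration.

```python
# Toy model only: Python lists stand in for SVE registers.
# The function names are hypothetical, not LLVM or ACLE APIs.

def fneg_merging(zd, pg, zn):
    """Merging form (ZPmZ): inactive lanes keep the old value of zd."""
    return [-x if active else old for old, active, x in zip(zd, pg, zn)]

def fneg_zeroing(pg, zn):
    """Zeroing form (ZPzZ): inactive lanes become 0.0, so no zd input is needed."""
    return [-x if active else 0.0 for active, x in zip(pg, zn)]

zd = [9.0, 9.0, 9.0]        # previous contents of the destination
pg = [True, False, True]    # lane 1 is inactive
zn = [1.0, 2.0, 3.0]        # source vector

merged = fneg_merging(zd, pg, zn)   # lane 1 keeps 9.0 from zd
zeroed = fneg_zeroing(pg, zn)       # lane 1 is forced to 0.0
```

The extra zd argument of the merging form corresponds to the tied $_Zd input in the instruction definition; the zeroing form has no such argument.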

Tablegen accepts:

def : Pat<(nxv2f64 (fneg nxv2f64:$src)),
          (FNEG_ZPmZ_D ZPR:$src, (PTRUE_D 31), ZPR:$src)>;

The trick is that the all-true ptrue eliminates all merging: every lane is active, so nothing from the first operand survives.
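
A quick sanity check of that claim, again as a toy Python model rather than LLVM code (the helper name fneg_merging is hypothetical): with an all-true predicate the result does not depend on what is passed as the merge operand, so reusing the source or passing arbitrary values gives the same answer.

```python
# Toy model only: lists stand in for SVE registers; names are hypothetical.

def fneg_merging(passthru, pg, zn):
    """Negate zn in active lanes; inactive lanes keep the passthru value."""
    return [-x if active else old for old, active, x in zip(passthru, pg, zn)]

src = [1.0, -2.0, 3.5]
all_true = [True] * len(src)            # models a ptrue with the "all" pattern

reuse_src = fneg_merging(src, all_true, src)        # source reused as merge operand
garbage = [float("nan")] * len(src)                 # stands in for undefined contents
with_undef = fneg_merging(garbage, all_true, src)   # merge operand never read

same_result = reuse_src == with_undef
```

Since no lane of the merge operand is ever selected, its contents do not matter when the predicate is all-true.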


…so has your question been answered?

The downside of reusing $src for the pass-through operand rather than IMPLICIT_DEF is that when $src has more than a single use, the register allocator will have to insert explicit movs to preserve $src. With the way the patterns are currently implemented we can benefit from using movprfx, which may be more efficient.

What are you trying to do that you need to add a pattern for fneg?

Exactly. I can try the IMPLICIT_DEF. I/we don’t want to select fneg in C++.
