Why are these instructions (e.g. frinta, fsqrt, fabs, etc.) defined with three operands when it looks like they should only need two (the predicate and the input)?
It prevents us from correctly implementing the unpredicated lowering of these instructions because we don’t have that extra operand.
These instructions take three operands because they use merging predication. For example, for `fabs z0.s, p0/m, z1.s`, the active lanes of z0 (as defined by p0) are set to fabs(z1), while the inactive lanes keep their previous value from z0. So the operation merges the result into the destination operand, which also acts as the source operand for the inactive lanes.
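If it helps to see the lane behaviour spelled out, here is a small Python model of the merging form described above. It is only an illustration: `fabs_merging` is a made-up helper, vectors are plain lists with one element per lane, and nothing here is LLVM or SVE API.

```python
def fabs_merging(zd, pg, zn):
    """Toy model of `fabs zd, pg/m, zn` (hypothetical helper, not an LLVM API).

    Active lanes (pg is True) take abs(zn); inactive lanes keep the old zd value.
    """
    return [abs(n) if active else d for d, active, n in zip(zd, pg, zn)]


z0 = [10.0, 20.0, 30.0, 40.0]      # destination, also the source for inactive lanes
p0 = [True, False, True, False]    # governing predicate
z1 = [-1.0, -2.0, -3.0, -4.0]      # input

merged = fabs_merging(z0, p0, z1)
print(merged)  # [1.0, 20.0, 3.0, 40.0]
```

Lanes 1 and 3 are inactive, so they come straight from the old z0, which is why the instruction needs z0 as an input as well as an output.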
The instruction is defined in LLVM with operands `(outs zprty:$Zd)` and `(ins zprty:$_Zd, PPR3bAny:$Pg, zprty:$Zn)`, plus a register constraint so that the register allocator enforces `$Zd = $_Zd`. For the unpredicated node `ISD::FABS`, the inactive lanes are irrelevant because every lane of the result is computed from `$Zn`. So `$_Zd` can be `IMPLICIT_DEF`.
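To make the `IMPLICIT_DEF` argument concrete, here is a rough Python sketch. It assumes the unpredicated form is selected with an all-true predicate, which is my reading of the lowering rather than something stated above, and both helper names are hypothetical. The point is only that the passthrough value cannot affect the result when no lane is inactive.

```python
def fabs_merging(zd, pg, zn):
    # Toy model of the merging instruction: inactive lanes keep zd.
    return [abs(n) if active else d for d, active, n in zip(zd, pg, zn)]


def fabs_unpredicated(zn, passthrough):
    # Assumption: every lane is active, so `passthrough` is never read.
    all_true = [True] * len(zn)
    return fabs_merging(passthrough, all_true, zn)


zn = [-1.5, 2.0, -3.0, -4.0]

# Two arbitrary "undefined" passthrough values standing in for IMPLICIT_DEF.
result_a = fabs_unpredicated(zn, [123.0, 456.0, 789.0, 0.0])
result_b = fabs_unpredicated(zn, [-9.0, -9.0, -9.0, -9.0])

print(result_a == result_b)  # True
```

Whatever ends up in the register chosen for `$_Zd`, the output is the same, so leaving it undefined is safe in this model.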
See `multiclass unpred_from_pred_one_op_fp` in https://reviews.llvm.org/D71712 for some examples.
Hope that helps!