Hello!
My MLIR code contains fragments like these:
Value in1addin2 = builder.create<arith::AddFOp>(loc, in1, in2);
Value in1subin2 = builder.create<arith::SubFOp>(loc, in1, in2);
After this MLIR code is lowered to LLVM and then translated into x86 assembly, the AddFOp/SubFOp pairs above turn into the following instruction sequences:
14272d: c4 c1 1c 5c da vsubps %ymm10, %ymm12, %ymm3
142732: c5 d4 58 f0 vaddps %ymm0, %ymm5, %ymm6
142736: c5 fc 5c c5 vsubps %ymm5, %ymm0, %ymm0
14273a: c5 a4 58 ef vaddps %ymm7, %ymm11, %ymm5
However, the x86 instruction set (like those of some other CPUs) includes coupled ADD/SUB vector instructions. How can I get a single vaddsubps instruction in the generated code instead of a vaddps/vsubps pair?
Is it possible to add such an operation (coupled ADD/SUB) to the X86Vector or Arith dialect, with a lowering to the corresponding coupled ADD/SUB instruction in LLVM?
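To make clear what I mean by coupled ADD/SUB, here is a minimal scalar sketch of the lane-wise result I would expect from one 8 x f32 vaddsubps: subtraction in the even lanes and addition in the odd lanes. The names Vec8f, addSubPs and checkAddSubExample are hypothetical and only for illustration. They are not MLIR or LLVM APIs, and this is a reference model, not a proposed implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for one 256-bit register holding 8 floats.
using Vec8f = std::array<float, 8>;

// Scalar reference model of the coupled ADD/SUB operation:
// even-indexed lanes get a - b, odd-indexed lanes get a + b.
Vec8f addSubPs(const Vec8f &a, const Vec8f &b) {
  Vec8f result{};
  for (std::size_t lane = 0; lane < result.size(); ++lane) {
    result[lane] = (lane % 2 == 0) ? a[lane] - b[lane]   // even lane: subtract
                                   : a[lane] + b[lane];  // odd lane: add
  }
  return result;
}

// Small worked example: with b all ones, even lanes drop by 1, odd lanes rise by 1.
void checkAddSubExample() {
  const Vec8f a = {1, 2, 3, 4, 5, 6, 7, 8};
  const Vec8f b = {1, 1, 1, 1, 1, 1, 1, 1};
  const Vec8f r = addSubPs(a, b);
  const Vec8f expected = {0, 3, 2, 5, 4, 7, 6, 9};
  assert(r == expected);
}
```

In terms of the fragments above, this is the same as taking the even lanes from in1subin2 and the odd lanes from in1addin2.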