The only multiplication instruction on my target CPU is multiply-and-accumulate. The result goes into a special register that is read destructively at the end of a sequence of multiply-adds. The following sequence is required to do a simple multiply:
acc r0 # clear accumulator, discarding its value (r0 reads as 0, and sinks writes)
mac rSRC1, rSRC2 # multiply sources, store result in accumulator
acc rDEST # fetch accumulator value to rDEST
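To make the semantics of the sequence above concrete, here is a minimal C++ model of it. This is only a sketch: the `Cpu` type, the 16-register file, and the 32-bit register / 64-bit accumulator widths are assumptions chosen for illustration, not facts about the real hardware.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of the machine; all names and widths are assumed.
struct Cpu {
    std::array<int32_t, 16> r{};  // general registers; r0 is special-cased below
    int64_t accum = 0;            // the special accumulator register

    // r0 reads as 0 and sinks writes.
    int32_t read(unsigned i) const { return i == 0 ? 0 : r[i]; }
    void write(unsigned i, int32_t v) { if (i != 0) r[i] = v; }

    // mac rS1, rS2: accum += rS1 * rS2
    void mac(unsigned s1, unsigned s2) {
        accum += static_cast<int64_t>(read(s1)) * read(s2);
    }

    // acc rD: rD = accum, then accum = 0 (destructive read)
    void acc(unsigned d) {
        write(d, static_cast<int32_t>(accum));
        accum = 0;
    }

    // Simple multiply expressed as the three-instruction sequence.
    void mul(unsigned d, unsigned s1, unsigned s2) {
        acc(0);       // clear accumulator, discarding its value
        mac(s1, s2);  // multiply sources into the accumulator
        acc(d);       // fetch the product into rD
    }
};
```

Note that in this model every `acc` leaves the accumulator at zero, whichever register it targets.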
What's the best way to model simple MUL as this 3-insn sequence in the LLVM backend?
Should the internal accumulation register be explicitly modeled as its own register class with a pattern to copy its value to a general register?
Is it possible to code the three insn sequence in TableGen alone, or must I resort to custom C++ code?
G
Probably C++ code. You want to lower it into the target node, followed by a CopyFromReg from the accumulator to a virtual register.
Evan
Evan Cheng wrote:
Probably C++ code. You want to lower it into the target node, followed by a CopyFromReg from the accumulator to a virtual register.
Can I model the accumulator register as reset-on-read in LLVM? For GCC, this
would abstractly be (parallel [(set result accum) (set accum 0)]). Naive code for
a sequence of muls { c = a * b; f = d * e; } is this:
acc r0
mac rA, rB
acc rC
acc r0 ### redundant
mac rD, rE
acc rF
If I can't inform LLVM of the side effect of zeroing the accumulator when it is read, then
do I need to write a target-specific pass, or is there some easier or more general way?
G
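As a rough illustration of what the target-specific cleanup asked about above would have to do, here is a toy C++ sketch that scans one straight-line block and drops an `acc r0` when the accumulator is already known to be zero. The `Insn` struct and `dropRedundantClears` are hypothetical names invented for this example. This is not LLVM code, and it assumes that only `acc` and `mac` touch the accumulator; in a real backend the same scan would presumably live in a machine-level pass.

```cpp
#include <string>
#include <vector>

// Hypothetical toy instruction: an opcode plus up to two register names.
struct Insn {
    std::string op;
    std::string a, b;
};

// Remove "acc r0" instructions that clear an accumulator already known
// to be zero because the previous accumulator access was an "acc".
std::vector<Insn> dropRedundantClears(const std::vector<Insn>& block) {
    std::vector<Insn> out;
    bool accIsZero = false;  // state at block entry is unknown
    for (const Insn& in : block) {
        if (in.op == "acc") {
            if (in.a == "r0" && accIsZero)
                continue;      // redundant clear: skip it
            accIsZero = true;  // any acc read leaves the accumulator zeroed
        } else if (in.op == "mac") {
            accIsZero = false; // accumulator now holds a live value
        }
        // Other opcodes are assumed not to touch the accumulator.
        out.push_back(in);
    }
    return out;
}
```

Applied to the six-instruction sequence for { c = a * b; f = d * e; }, this keeps the first `acc r0` (the accumulator state is unknown on entry) and removes the second.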