Thanks, I think it can solve my problem.
But please allow me to explain the hardware in detail. I hope there is
a more elegant way to solve it.
The hardware is a "stream processor". That is, it processes samples
one by one. Each sample is associated with several 128-bit
four-element vector registers, namely:
* input registers - the attributes of the sample, the values of the
registers are different and initialized for each sample before
execution. READ-ONLY (can only be declared once by 'dcl' instruction).
* constant registers - sample-invariant. READ-ONLY (can only be
defined once by 'def' instruction). All samples share the same set of
constant register values.
* general purpose registers - values are not initialized before
execution and are destroyed after execution. They can be read and written.
* output registers - WRITE-ONLY.
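
To make the access rules above concrete, here is a minimal sketch in plain C++ (not LLVM code). The names RegKind, canRead, canWrite and isLegalBinary are hypothetical and only illustrate the four register kinds; the real hardware may have more rules than the ones listed here.

```cpp
// Hypothetical model of the four register kinds described above.
// None of these names come from LLVM or from the hardware's ISA.
enum class RegKind { Input, Constant, GeneralPurpose, Output };

// A register may be used as a source operand unless it is WRITE-ONLY.
constexpr bool canRead(RegKind k) { return k != RegKind::Output; }

// An ordinary instruction may only write general purpose and output
// registers; input and constant registers are set once by 'dcl' / 'def'.
constexpr bool canWrite(RegKind k) {
  return k == RegKind::GeneralPurpose || k == RegKind::Output;
}

// Checks "dest = op src0, src1" against the read/write rules.
constexpr bool isLegalBinary(RegKind dest, RegKind src0, RegKind src1) {
  return canWrite(dest) && canRead(src0) && canRead(src1);
}

// Example: r = add v, c is legal, but writing to an input register is not.
static_assert(isLegalBinary(RegKind::GeneralPurpose, RegKind::Input,
                            RegKind::Constant), "r = op v, c is legal");
static_assert(!isLegalBinary(RegKind::Input, RegKind::GeneralPurpose,
                             RegKind::Constant), "v is read-only");
```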
Sample program converted to pseudo-LLVM assembly (SSA):
%Vec4 = type <4 x float>
// declare input registers and
// define constant register values
%v1 = dcl %Vec4 xyz
%v2 = dcl %Vec4 color
%c1 = def %Vec4 <1,2,3,4>
// v1, v2, c1 are not allowed to be the destination register
// of any instruction hereafter.
%r1 = add %Vec4 v1, c1
%r2 = mul %Vec4 v1, c2
%o1 = mul %Vec4 r2, v2 // write the output register 'o1'
I planned to partition the registers into different RegisterClasses:
input, output, general purpose, constant, etc.
def GeneralPurposeRC : RegisterClass<packed, 128, [R0, R1]>;
def InputRC : RegisterClass<packed, 128, [V0, V1]>;
def ConstantRC : RegisterClass<packed, 128, [C0, C1]>;
def ADDgg : BinaryInst<0x51, (
ops GeneralPurposeRC:$dest,
GeneralPurposeRC:$src), "add $dest, $src">;
def ADDgi : BinaryInst<0x52, (
ops GeneralPurposeRC:$dest,
InputRC:$src), "add $dest, $src">;
def ADDgc : BinaryInst<0x52, (
ops GeneralPurposeRC:$dest,
ConstantRC:$src), "add $dest, $src">;
The problem is: SDOperand always returns the 'type' of the value (in
this case, 'packed', the first argument of RegisterClass<>), but not
the 'RegisterClass'. With two 'packed' operands, the instruction
selector doesn't know whether an ADDgg, an ADDgi, or an ADDgc should be
generated (BuildMI() function).
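
Roughly, what I would like the selector to be able to do is dispatch on the register class of the source operand. The sketch below is standalone C++, not the LLVM API; RegClass, Opcode and selectAdd are hypothetical names that mirror the definitions above, and how the class would be obtained from an operand is exactly the part I don't know.

```cpp
// Hypothetical sketch: both operands have value type 'packed', so the
// value type cannot choose the opcode, but the register class could.
enum class RegClass { GeneralPurpose, Input, Constant };
enum class Opcode { ADDgg, ADDgi, ADDgc };

// Pick the ADD variant from the class of the source register.
// The destination is always a general purpose register in these variants.
constexpr Opcode selectAdd(RegClass src) {
  return src == RegClass::GeneralPurpose ? Opcode::ADDgg
       : src == RegClass::Input          ? Opcode::ADDgi
                                         : Opcode::ADDgc;
}

// Example: an input-register source should select ADDgi.
static_assert(selectAdd(RegClass::Input) == Opcode::ADDgi,
              "input source selects ADDgi");
```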
The same problem exists when there are two kinds of constant registers,
floating point and integer, each declared 'packed' ([4xfloat]
and [4xint]). The instruction selector doesn't know which instruction
it should produce because the newly defined MVT type 'packed' is
always used for all operands (registers), whether the register actually
holds a [4xfloat] or a [4xint].
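
The loss of information can be illustrated with another small standalone C++ sketch. Everything here is hypothetical (OpaquePacked, PackedType, Elem, sameType are illustration names, not LLVM types); it only contrasts a single 'packed' tag with a type that keeps the element kind and count.

```cpp
// Hypothetical illustration, not LLVM code.
enum class Elem { Float, Int };

// A single 'packed' tag: every 128-bit vector looks identical.
struct OpaquePacked {};
constexpr bool sameType(OpaquePacked, OpaquePacked) { return true; }

// Assumed alternative: the type keeps element kind and element count.
struct PackedType {
  Elem elem;
  unsigned count;
};
constexpr bool sameType(PackedType a, PackedType b) {
  return a.elem == b.elem && a.count == b.count;
}

// With the opaque tag, [4xfloat] and [4xint] cannot be told apart ...
static_assert(sameType(OpaquePacked{}, OpaquePacked{}),
              "one tag for every vector");
// ... while the richer type distinguishes them.
static_assert(!sameType(PackedType{Elem::Float, 4}, PackedType{Elem::Int, 4}),
              "[4xfloat] differs from [4xint]");
```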