I'm working on enhancing TableGen's type checking, and it tripped over
a problem in CellSPU's specification:
  XSHWv4i32: (set VECREG:v8i16:$rDest, (sext:v8i16 VECREG:v4i32:$rSrc))
It's complaining that v4i32 is not smaller than v8i16, which is true
both in terms of total vector bit size (4 x 32 and 8 x 16 are both 128
bits) and in terms of element size (i32 is wider than i16). To me, a
sign extension from i32 to i16 makes no sense.
From the .td file, it looks as if src and dest types have been swapped:
  class XSHWVecInst<ValueType in_vectype, ValueType out_vectype>:
      XSHWInst<(outs VECREG:$rDest), (ins VECREG:$rSrc),
               [(set (out_vectype VECREG:$rDest),
                     (sext (in_vectype VECREG:$rSrc)))]>;
  multiclass ExtendHalfwordWord {
    def v4i32: XSHWVecInst<v4i32, v8i16>;
The multiclass name leads me to believe this was supposed to sign-extend
from i16 to i32, but the XSHWVecInst class takes its types in
(source, destination) order, while the def passes them in
(destination, source) order.
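If the intent really is i16 -> i32, I would expect the instantiation to pass the types the other way around. Here is a minimal sketch of that. The stub declarations at the top (ValueType, RegisterClass, XSHWInst and the set/sext/outs/ins operators) are stand-ins I made up so the fragment can be read on its own; the real ones come from LLVM's target description files and the CellSPU .td files, and the "defm XSHW" line is my assumption about how the multiclass gets instantiated.

```tablegen
// Stand-in declarations (hypothetical, for illustration only).
class ValueType<int size> { int Size = size; }
def v8i16 : ValueType<128>;   // 8 x i16
def v4i32 : ValueType<128>;   // 4 x i32
class RegisterClass;
def VECREG : RegisterClass;   // the single vector register class
def set;
def sext;
def outs;
def ins;

// Stub base class that just records the operand lists and the pattern.
class XSHWInst<dag OOL, dag IOL, list<dag> pattern> {
  dag OutOperandList = OOL;
  dag InOperandList = IOL;
  list<dag> Pattern = pattern;
}

// Same shape as the class quoted above: (in_vectype, out_vectype).
class XSHWVecInst<ValueType in_vectype, ValueType out_vectype>:
    XSHWInst<(outs VECREG:$rDest), (ins VECREG:$rSrc),
             [(set (out_vectype VECREG:$rDest),
                   (sext (in_vectype VECREG:$rSrc)))]>;

multiclass ExtendHalfwordWord {
  // Source type first, destination type second: v8i16 in, v4i32 out.
  def v4i32: XSHWVecInst<v8i16, v4i32>;
}

// Assumed instantiation; this is what would yield a record named XSHWv4i32.
defm XSHW : ExtendHalfwordWord;
```

With the arguments in this order, the pattern would read as a sext from v8i16 to v4i32, which is the direction the multiclass name suggests.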
Is this pattern as intended, or did I find a real problem?
-Dave
David Greene wrote:
> class XSHWVecInst<ValueType in_vectype, ValueType out_vectype>:
> def v4i32: XSHWVecInst<v4i32, v8i16>;
> Is this pattern as intended, or did I find a real problem?
Looks like a bug to me. xshw (extend signed halfword (16 bits) to
word (32 bits)) takes a v8i16 and produces a v4i32. This has likely gone
unnoticed because there is only one vector register class (i.e.
VECREG), and it is used for all vector types.
Nice catch
Are there more of these?
kalle
Kalle Raiskila <kalle.raiskila@nokia.com> writes:
> Looks like a bug to me. xshw (extend signed half-word(16bits) to
> word(32bits)) takes a v8i16 and produces a v4i32. This has likely gone
> unnoticed as there is only one type of vector register class (i.e.
> VECREG) that is used for all vectors.
> Nice catch
> Are there more of these?
I don't know. I stopped implementing the stricter typechecking when I
saw this. I wanted to make sure there wasn't some official trickery
going on. 
-Dave
It's not official trickery. It's just the way things need to be done on Cell.
-scooter
It's intentional. Everything on Cell is a vector, with the exception of
loads and stores, unless you really want to write code that determines
the exact vector element that needs to be changed and does all of the
juggling to modify that element. There are no individual registers.
If it's easier to flip the order of the operands, then do so. That's just style.
-scooter