<cc'ing llvmdev, please don't email me directly>
Your project sounds quite exciting, and I may be interested in using it instead of gcc.
Let me explain why I'm interested in this project:
I have a PlayStation Portable and would like to develop several projects that use its full potential.
The PSP has two Allegrex CPUs: SC (System Control) and ME (Media Engine).
Both are built around a MIPS II core plus a dozen instructions borrowed from MIPS IV.
Both also have a standard FPU coprocessor.
But only the SC processor has a second coprocessor, called the VFPU, which performs matrix and vector computations.
Currently, gcc doesn't handle the VFPU at all (I'm speaking about the compiler, not the assembler).
I tried to add what gcc needs to handle it, but I was only half successful because of the way gcc handles registers and a few other messy details.
So I'm looking for an alternative: llvm-gcc.
First, what is the VFPU?
It is a great coprocessor that lets us handle up to 128 32-bit float registers as scalars, as vectors of two, three or four elements, or as square matrices of 4, 9 or 16 elements.
- a scalar : S000 = [$0], S001 = [$1], S002 = [$2], S003 = [$3], S100 = [$4], ..., S010 = [$32], ..., S020 = [$64], ..., S733 = [$127]
- a column : C000.p = [S000, S010], C020.p = [S020, S030], ..., C000.t = [S000, S010, S020], C010.t = [S010, S020, S030], ..., C000.q = [S000, S010, S020, S030]
- a row : R000.p = [S000, S001], R002.p = [S002, S003], R000.t = [S000, S001, S002], R001.t = [S001, S002, S003], R000.q = [S000, S001, S002, S003]
- a matrix : M000.p = [R000.p, R010.p], M020.p = [R020.p, R030.p], ..., M000.t = [R000.t, R010.t, R020.t], ..., M000.q = [R000.q, R010.q, R020.q, R030.q]
- a transposed matrix : E000.p = [C000.p, C001.p], M020.p = [R020.p, R030.p], ..., M000.t = [R000.t, R010.t, R020.t], ..., M000.q = [R000.q, R010.q, R020.q, R030.q]
(NOTE : <t><#m><#r><#c>.<s> : t = S (scalar)/C (column)/R (row)/M (matrix)/E (transposed matrix), #m = matrix, #r = row, #c = column, s = p (2d vector), t (3d vector), q (4d vector))
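The scalar examples above are all consistent with one formula: register number = 32 * row + 4 * matrix + column. The Python sketch below applies that inferred formula and expands a column name into its scalars. The helper names are hypothetical and belong to no PSP toolchain, and the formula is only checked against the examples listed in this message.

```python
import re

# Hypothetical helpers for illustration only.
# A scalar name is S<matrix><row><column>, e.g. "S733".
_SCALAR = re.compile(r"S([0-7])([0-3])([0-3])")

def scalar_index(name):
    """Map a scalar name to its register number (formula inferred from the examples)."""
    match = _SCALAR.fullmatch(name)
    if match is None:
        raise ValueError(f"not a scalar register name: {name!r}")
    matrix, row, column = (int(digit) for digit in match.groups())
    return 32 * row + 4 * matrix + column

def column_members(matrix, row, column, size):
    """Scalars making up C<matrix><row><column> with `size` elements (2 = .p, 3 = .t, 4 = .q)."""
    if row + size > 4:
        raise ValueError("column would run past row 3")
    # A column keeps the matrix and column digits fixed and walks down the rows.
    return [f"S{matrix}{r}{column}" for r in range(row, row + size)]
```

For instance, `scalar_index("S733")` gives 127 and `column_members(0, 1, 0, 3)` gives the three scalars listed for C010.t.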
Now, if you just want a 3D dot product, issue this instruction, for instance: vdot.t S000.s, C100.t, C110.t
or, for a matrix product: vmmul.q M000.q, M100.q, E200.q
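For readers who want the arithmetic spelled out, here is a plain Python sketch of a 3-element dot product and a 4x4 matrix product. The function names are illustrative. It shows only the math that a single instruction replaces, and makes no claim about the VFPU's operand ordering, transposition handling, or rounding.

```python
def dot3(a, b):
    """Dot product of two 3-element vectors."""
    if len(a) != 3 or len(b) != 3:
        raise ValueError("expected two 3-element vectors")
    return sum(x * y for x, y in zip(a, b))

def matmul4(a, b):
    """Product of two 4x4 matrices, each given as a list of four rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]
```

Written out in scalar code, the dot product costs three multiplies and two adds, and the 4x4 product costs 64 multiplies, which is why a compiler that can emit the vector instructions directly is attractive.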
As you can see, the VFPU has great potential but, alas, it is not well exploited by the open-source psp-gcc.
Now, let's talk about llvm-gcc.
1) Does an (incomplete) MIPS target exist for llvm-gcc, or must I create one?
Bruno just submitted one; he's the right person to talk to. I have not had a chance to review the code, so it has not been committed yet, but it should be soon.
2) Do you think that llvm-gcc can handle these combinations of registers:
2.1) Can llvm-gcc handle vector float as a super-set of scalar registers ?
2.2) Can llvm-gcc handle matrix float as a super-set of vector float ?
LLVM has good support for vectors. If you treat matrices as large vectors, you should have no problem.
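To make "matrices as large vectors" concrete, here is a small Python sketch that stores a 4x4 matrix as one flat 16-element list and recovers rows and columns by index arithmetic. The row-major layout and the helper names are assumptions made for illustration. In LLVM terms this roughly corresponds to a 16-element float vector whose sub-vectors are extracted with element shuffles.

```python
N = 4  # assumed 4x4 matrix, stored as 16 floats in one flat vector

def flatten(rows):
    """Lay out a list of rows as a single flat vector (row-major, by assumption)."""
    return [value for row in rows for value in row]

def row_of(flat, r):
    """Row r is a contiguous slice of the flat vector."""
    return flat[N * r : N * r + N]

def column_of(flat, c):
    """Column c is every N-th element starting at offset c."""
    return flat[c::N]
```

With this layout, element (r, c) lives at index `N * r + c`, so a matrix operation becomes an ordinary vector operation plus a fixed pattern of element selections.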
3) Can llvm-gcc vectorize float operations ?
llvm-gcc supports explicitly vectorized code, but we have no autovectorization pass yet.