RFC: More AVX Experience

Ok, so I've been chugging away at AVX and added some new
features in TableGen to facilitate writing generic patterns.

Here's an example:

//===----------------------------------------------------------------------===//
// Dummy defs for writing generic patterns
//===----------------------------------------------------------------------===//

def SRCREGCLASS;
def DSTREGCLASS;
def MEMCLASS;
def SRC1CLASS;
def SRC2CLASS;
def ADDRCLASS;
def INTRINSIC;
def TYPE;
def INTTYPE;
def MEMOP;

// TYPE - The data type (f32 for SS, f64 for SD, etc.)
// SRCREGCLASS - The source register class (VR128, FR32, etc.)
// DSTREGCLASS - The destination register class
// MEMCLASS - The memory class (f32mem, f64mem, etc.)
// SRC1CLASS - The first source object class (register or memory, depending)
// SRC2CLASS - The second source object class (register or memory,
//             depending)
// DSTCLASS - The destination object class (register or memory, depending)
// ADDRCLASS - Either 'addr' or REGCLASS, depending
// MEMOP - Either 'memop' or 'srcvalue', depending

// Scalar
defm FsANDN : sse1_sse2_avx_binary_scalar_xs_xd_node_pattern_rm_rrm<
              0x55,
              "andn",
  [[(set DSTREGCLASS:$dst,
     (INTTYPE (and (not (INTTYPE (bitconvert (TYPE SRCREGCLASS:$src1)))),
                   (INTTYPE (MEMOP ADDRCLASS:$src2)))))]]>;

// Vector
defm ANDN : sse1_sse2_avx_binary_vector_tb_ostb_node_pattern_rm_rrm<
            0x55,
            "andn",
  [[(set DSTREGCLASS:$dst,
     (INTTYPE (and (vnot (INTTYPE (bitconvert (TYPE SRCREGCLASS:$src1)))),
                   (INTTYPE (MEMOP ADDRCLASS:$src2)))))]]>;

The "not" vs. "vnot" is unfortunate. I could add another class argument that
says "instantiate with members of this list of operators" but see below about
arguments and the combinatorial explosion problem. That and the fact that we
have no "foreach for subclass specification" makes this difficult to do.

(Thinking about this some more, a "cross product" operator [list x list] ->
[list] could work.)
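
As a rough illustration of what such a cross product could compute, here is a
minimal sketch written with list operators from current TableGen (!foldl,
!foreach, !listconcat, !strconcat). It is not the proposed operator itself: it
only builds name strings, and the class CrossNames and the record
XsXdByEncoding are hypothetical names made up for this example.

```tablegen
// Hypothetical helper: pairs every prefix with every suffix and joins
// each pair into a "<prefix>_<suffix>" string.
class CrossNames<list<string> Prefixes, list<string> Suffixes> {
  list<string> Names =
    !foldl([]<string>, Prefixes, acc, p,
           // For one prefix, append its combination with each suffix.
           !listconcat(acc,
                       !foreach(s, Suffixes, !strconcat(p, "_", s))));
}

// Hypothetical use: the xs/xd prefixes crossed with the rm/mr encodings.
def XsXdByEncoding : CrossNames<["xs", "xd"], ["rm", "mr"]>;
```

Under these assumptions the Names field should hold the four combinations
xs_rm, xs_mr, xd_rm and xd_mr. A real cross product would have to yield
something the class machinery can instantiate from, not just strings.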

In any case, the lower classes take care of substituting the appropriate
symbols based on the specific instruction generated ([v]PS, [V]PD, etc.).

I still don't know how to capture the hierarchy under
sse1_sse2_avx_binary_scalar_xs_xd_node_pattern_rm_rrm and other such
higher-level classes. Right now it's generated by a Perl script but Chris
isn't enamored of that solution. I think it can be better as well.

One thought I had was to implement a "copy arguments" feature in TableGen so
we could do something like this:

defm FsANDN : sse1_binary_scalar_xs_node_pattern_rm<
              0x55,
              "andn",
  [[(set DSTREGCLASS:$dst,
     (INTTYPE (and (not (INTTYPE (bitconvert (TYPE SRCREGCLASS:$src1)))),
                   (INTTYPE (MEMOP ADDRCLASS:$src2)))))]]>,
           sse2_binary_scalar_xd_node_pattern_rm<''>,
           avx_binary_scalar_xs_node_pattern_rm<''>,
           avx_binary_scalar_xd_node_pattern_rm<''>;

where "''" (two apostrophes) is a mnemonic for the "ditto" mark used in
English (and other languages?).

This way we could define fewer base classes because we wouldn't have to
define intermediate base classes that just serve to aggregate other classes
in order to get us down to one class and thus one argument specification.

But there would still be a lot of classes to manually define. Here's an
incomplete list:

sse1_unary_scalar_xs_node_rm; // For generic unary
sse1_unary_scalar_xs_node_pattern_rm; // To use custom patterns
sse1_unary_scalar_xd_node_intrinsic_rm; // With an intrinsic
sse1_unary_scalar_xd_node_pattern_intrinsic_ipattern_rm; // Custom patterns

sse1_binary_scalar_xs_node_rm; // Binary

plus the rest of the sse1 "xs rm" classes, the mr encodings, all the binary
operations, all the sse2 classes (which look like the sse1 classes except they
use "xd"), all the vector classes, all the AVX classes, LRBni, etc. We still
have a combinatorial explosion problem.

Of course, we only have to define the ones we actually use and that cuts
down significantly on the numbers, but it's still large.

So I'm still looking for a complete solution. Ideas welcome.

                             -Dave

Ah, what early '30s jazz can do to spur the creative juices.

So I'm driving through south Minneapolis listening to some Fletcher
Henderson (try it sometime!) and it dawns on me that all we need to
handle custom patterns is some way to override the default behavior of
the "default" pattern. I think we can do that with some of the new
operations I added.

I'm thinking we could have an optional final argument that contains a
list of key:value pairs that specify which defaults to override. For
example:

def PATTERN : Key;

// Scalar
defm FsANDN : sse1_sse2_avx_binary_scalar_xs_xd_node_rm_rrm<
               0x55,
               "andn",
               nop, // Or some dummy dag operator
   [[PATTERN [[(set DSTREGCLASS:$dst,
      (INTTYPE (and (not (INTTYPE (bitconvert (TYPE SRCREGCLASS:$src1)))),
                    (INTTYPE (MEMOP ADDRCLASS:$src2)))))]]]]>;

Now we pass a list of one pair: [PATTERN <our-custom-pattern>]. This
assumes that lists can be heterogeneous. I don't know if that's true
but I'll try it out. Alternatively, we could define a new 'tuple' type
to handle this. Actually, since we have to declare a list with only one
element type, I'm pretty sure we will have to do this, unless it's
possible to declare a list of typeless objects. Which would be
preferable?

Hmm...as I go through this it seems that even with tuples, we can't get
a list of heterogeneous tuples so maybe we have to resort to multiple
extra arguments. That would be too bad because if we wanted to override
one default we might have to specify the other defaults that come before
it. Unless we do named parameters, which I don't want to worry about if
we don't have to (I was trying to address it with key:value pairs).

Down in the guts of the class hierarchy we'd have something like this:

class SomeBaseClass<Opcode opc, string asmop, SDNode dagoperator,
                     list<dag> CustomPattern = []> :
    Inst<opc, <some-asm-string>, !if(!null(CustomPattern),
                                     <default-pattern>, CustomPattern)>;

Hmm, I think that will even work.
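
To make the override idea concrete, here is a minimal self-contained sketch. It
assumes current TableGen, which spells the emptiness test !empty. Everything in
it is a hypothetical stand-in for the real target definitions: the dummy
operators set, and and not, the register class REG, the simplified Inst class,
and the default pattern.

```tablegen
// Hypothetical stand-ins so the sketch is self-contained.
def set;
def and;
def not;
def REG;

// Simplified instruction record; the real Inst class has far more fields.
class Inst<bits<8> opc, string asmstr, list<dag> pattern> {
  bits<8> Opcode = opc;
  string AsmString = asmstr;
  list<dag> Pattern = pattern;
}

// Use the default pattern unless the caller supplies a custom one.
class BinaryInst<bits<8> opc, string asmop, list<dag> CustomPattern = []>
  : Inst<opc, asmop,
         !if(!empty(CustomPattern),
             [(set REG:$dst, (and REG:$src1, REG:$src2))],
             CustomPattern)>;

// No fourth argument: picks up the default pattern.
def AND : BinaryInst<0x54, "and">;

// Fourth argument given: the custom pattern replaces the default.
def ANDN : BinaryInst<0x55, "andn",
                      [(set REG:$dst, (and (not REG:$src1), REG:$src2))]>;
```

The default argument keeps the common case short, while the one instruction
that needs something special states only its own pattern. This covers a single
overridable default; with several of them, the positional-argument problem
described above comes back.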

Now there's an additional problem if we need different patterns for
various encodings (rr, rm, etc.). I'm hoping we can write generic
enough patterns that we can avoid problems most of the time and for the
rest of the cases we may have to write special classes.

By getting rid of the default/custom pattern classes we'll reduce the
class space quite a bit. Maybe even to something tractable. I'll
play around with this idea and see where it goes.

Does anything look really wrong with this approach?

                                  -Dave