Status of pre-legalize vector changes

Hello,

Here's a quick update on where I'm at with the pre-legalize vector changes
I'm working on. I hope to have an updated patch ready to a few days, assuming
I don't get too busy with other projects going on. Here are some of the issues
I've come across so far.

Putting the table for extended ValueTypes in SelectionDAG seems quite involved.
There are a lot of places that use the MVT-namespace functions, and making
sure they all have access to a SelectionDAG object in order to use them is a
lot of changes. It's tempting to make the table be a ManagedStatic, in
VMCore/ValueTypes.cpp, not unlike the ManagedStatic objects in VMCore/Type.cpp,
for example.

Also, CopyToReg and CopyFromReg lowering/legalization rely on being able to
create vector-of-vector types. For example, an <8 x double> vector is bitcasted
to (effectively) <4 x <2 x double>> for x86, allowing EXTRACT_ELEMENT to pull
out whole legal vectors at a time instead of just individual elements, for
example. While my original patch made the extended ValueType table explicitly
hold vector element types and vector lengths, I had converted it to hold
Type*, and ran into problems because VectorType doesn't permit the element
types to be vectors. It seems unfortunate to give up using Type* just because
of these few cases, but at the moment I don't have any simpler alternatives.

Dan

Here's a quick update on where I'm at with the pre-legalize vector changes
I'm working on. I hope to have an updated patch ready to a few days, assuming
I don't get too busy with other projects going on. Here are some of the issues
I've come across so far.

Great! I'm going to reorder your email a bit to suit my evil purposes:

Also, CopyToReg and CopyFromReg lowering/legalization rely on being able to
create vector-of-vector types. For example, an <8 x double> vector is bitcasted
to (effectively) <4 x <2 x double>> for x86, allowing EXTRACT_ELEMENT to pull
out whole legal vectors at a time instead of just individual elements, for
example.

Hrm, now that you mention it, I do remember that. I'd file that into a category of "evil hacks", not something that we really want to support. To me, I think it would be much better to fix this directly, rather than complicate your implementation.

In particular, you should be able to introduce a new [V]CONCAT_VECTOR node, which takes two input vectors and yields an output vector that has length equal to the sum of the input vectors. Likewise, instead of the extract_element hack, we should have an EXTRACT_SUBVECTOR node of some sort.

Since no target support these nodes, they would all be expanded by legalize.

Note that this change is logically independent of the rest of your change, so you could do this on mainline, as a first step to getting the bigger change in.

While my original patch made the extended ValueType table explicitly
hold vector element types and vector lengths, I had converted it to hold
Type*, and ran into problems because VectorType doesn't permit the element
types to be vectors. It seems unfortunate to give up using Type* just because
of these few cases, but at the moment I don't have any simpler alternatives.

Vectors of vectors really is a hack. Type*'s seems like a good way to go.

Putting the table for extended ValueTypes in SelectionDAG seems quite involved.
There are a lot of places that use the MVT-namespace functions, and making
sure they all have access to a SelectionDAG object in order to use them is a
lot of changes. It's tempting to make the table be a ManagedStatic, in
VMCore/ValueTypes.cpp, not unlike the ManagedStatic objects in VMCore/Type.cpp,
for example.

The problem with this approach is that it makes it much harder to codegen two different functions in parallel on different threads. This isn't something we do today, but is something we want to do eventually. Requiring synchronization in such a critical datastructure would be badness...

That said, I don't know just how horrible the impact is on the code. Maybe there is another way? Maybe the value table should be a first class datastructure that SelectionDAG contains. This would let you do things like:

SD.getValueTable().getNumVectorElements(VT)

but it would also allow clients to grab a reference to the valuetable and call into it directly if more convenient. Would this help at all?

One nice thing about this is that the ValueTable datastructure would be very simple, and moving all the methods to operate on a value table would be straight-forward. If ValueTable was a simple class, then tblgen could instantiate it (f.e.) and not populate it with any of the funny MVT's.

-Chris

It may be possible to codegen EXTRACT_SUBVECTOR on some targets once subreg support is in.

> Also, CopyToReg and CopyFromReg lowering/legalization rely on being able to
> create vector-of-vector types. For example, an <8 x double> vector is bitcasted
> to (effectively) <4 x <2 x double>> for x86, allowing EXTRACT_ELEMENT to pull
> out whole legal vectors at a time instead of just individual elements, for
> example.

Hrm, now that you mention it, I do remember that. I'd file that into a
category of "evil hacks", not something that we really want to support.
To me, I think it would be much better to fix this directly, rather than
complicate your implementation.

In particular, you should be able to introduce a new [V]CONCAT_VECTOR
node, which takes two input vectors and yields an output vector that has
length equal to the sum of the input vectors. Likewise, instead of the
extract_element hack, we should have an EXTRACT_SUBVECTOR node of some
sort.

Since no target support these nodes, they would all be expanded by
legalize.

Note that this change is logically independent of the rest of your change,
so you could do this on mainline, as a first step to getting the bigger
change in.

Ok, these are now created. The new opcodes aren't very general yet; currently
they're only suitable for pre-legalize use; they're only currently needed for
lowering CopyToReg and CopyFromReg. Once the rest of the pre-legalize vector
changes go in, these opcodes will loose the leading 'V' in their names, and
then it'll be easier to extend them to other purposes.

> While my original patch made the extended ValueType table explicitly
> hold vector element types and vector lengths, I had converted it to hold
> Type*, and ran into problems because VectorType doesn't permit the element
> types to be vectors. It seems unfortunate to give up using Type* just because
> of these few cases, but at the moment I don't have any simpler alternatives.

Vectors of vectors really is a hack. Type*'s seems like a good way to go.

Agreed.

> Putting the table for extended ValueTypes in SelectionDAG seems quite involved.
> There are a lot of places that use the MVT-namespace functions, and making
> sure they all have access to a SelectionDAG object in order to use them is a
> lot of changes. It's tempting to make the table be a ManagedStatic, in
> VMCore/ValueTypes.cpp, not unlike the ManagedStatic objects in VMCore/Type.cpp,
> for example.

The problem with this approach is that it makes it much harder to codegen
two different functions in parallel on different threads. This isn't
something we do today, but is something we want to do eventually.
Requiring synchronization in such a critical datastructure would be
badness...

The badness could be mitigated somewhat with readers-writer locks, because
the common case for this table would be that it would get a few entries added
to it right away and then see very few updates, if any, for the rest of its
lifetime.

I'll take another look at eliminating the static table though.

That said, I don't know just how horrible the impact is on the code.
Maybe there is another way? Maybe the value table should be a first class
datastructure that SelectionDAG contains. This would let you do things
like:

SD.getValueTable().getNumVectorElements(VT)

but it would also allow clients to grab a reference to the valuetable and
call into it directly if more convenient. Would this help at all?

Not really. The problem I saw was clients that don't have the SelectionDAG
passed into them. Passing around handles to this new kind of object everywhere
wouldn't be any easier than passing around handles to the DAG itself.

One nice thing about this is that the ValueTable datastructure would be
very simple, and moving all the methods to operate on a value table would
be straight-forward. If ValueTable was a simple class, then tblgen could
instantiate it (f.e.) and not populate it with any of the funny MVT's.

You mean like MVT::iAny? Those are still just simple enum values; they don't
live in the table.

Dan

Note that this change is logically independent of the rest of your change,
so you could do this on mainline, as a first step to getting the bigger
change in.

Ok, these are now created. The new opcodes aren't very general yet; currently
they're only suitable for pre-legalize use; they're only currently needed for
lowering CopyToReg and CopyFromReg. Once the rest of the pre-legalize vector
changes go in, these opcodes will loose the leading 'V' in their names, and
then it'll be easier to extend them to other purposes.

Very nice, the patches look very clean and one more hack is dead :slight_smile:

The problem with this approach is that it makes it much harder to codegen
two different functions in parallel on different threads. This isn't
something we do today, but is something we want to do eventually.
Requiring synchronization in such a critical datastructure would be
badness...

The badness could be mitigated somewhat with readers-writer locks, because
the common case for this table would be that it would get a few entries added
to it right away and then see very few updates, if any, for the rest of its
lifetime.

True, but you still have cache lines bouncing around between processors :). It's just much better to completely eliminate sharing when possible, particularly when the data structures are logically separate.

I'll take another look at eliminating the static table though.

Thanks. If it gets too nasty, I can be convinced, but it would be very nice to keep these separated.

That said, I don't know just how horrible the impact is on the code.
Maybe there is another way? Maybe the value table should be a first class
datastructure that SelectionDAG contains. This would let you do things
like:

SD.getValueTable().getNumVectorElements(VT)

but it would also allow clients to grab a reference to the valuetable and
call into it directly if more convenient. Would this help at all?

Not really. The problem I saw was clients that don't have the SelectionDAG
passed into them. Passing around handles to this new kind of object everywhere
wouldn't be any easier than passing around handles to the DAG itself.

Ok.

One nice thing about this is that the ValueTable datastructure would be
very simple, and moving all the methods to operate on a value table would
be straight-forward. If ValueTable was a simple class, then tblgen could
instantiate it (f.e.) and not populate it with any of the funny MVT's.

You mean like MVT::iAny? Those are still just simple enum values; they don't
live in the table.

I just mean that *all* MVT related functions could turn into methods on the valuetable for consistency. Right now we have this weird dichotomy between the MVTs that tblgen uses and the MVTs that the codegen stuff uses. tblgen can only use a subset of the functions, it would be nice to eliminate that, but not very important in the grand scheme of things.

-Chris