Splitting 'expand' into 'split' and `expand`.

Hello all,

I would like to propose a large change to LLVM that I would be happy to implement.

The instruction selection legalizer knows of three different ways to legalize an action (ignoring an already legal action).

  • Expansion
  • Promotion
  • Custom

Expanding a node will lead to one of two things - the operation will be split into several equivalent smaller operations (i.e. 64-bit mul => two 32-bit multiplications), or it will be expanded into a different operation(s) (i.e. SDIVREM => SDIV + SREM or a runtime library call).

This can be ambiguous - should a 128-bit SDIVREM be expanded into two 64-bit SDIVREMs or a 128-bit libcall.

It would be useful for LLVM to distinguish between these two operations - i.e. instead of LegalizeAction::Expand, we have LegalizeAction::Expand and LegalizeAction::Split.

  • Expand should always expand the node into a different pattern (such as MULHU expanding to MUL and a copy of the top half).

  • Split will always split an operation into several smaller, operations (such as 128-bit addition being split into a 64-bit addition and a 64-bit addition with carry)

Full disclosure: I’m working on a backend on which the pointer type is illegal (Atmel AVR : 8-bit microcontroller with 16-bit pointers). The problem with the ambiguity with expand today is that LLVM implicitly assumes that as a 16-bit register class exists, then 16-bit addition/subtraction should exist and there is no point splitting into two 8-bit operations.

Currently we work around this by defining 16-bit pseudo instructions for every instruction which is not properly supported. This gives us very sub-optimal code and introduces many bugs. More detail about the problem can be found in the old mailing list archives.

I’m not sure if this requires an RFC? It would be a very large breaking change. Perhaps a less drastic solution can be found, like leaving Expand as it is, but also adding LegalizeAction::Split which must expand only using smaller operations.

Again, I would be happy to work on this.

Cheers,
Dylan McKay

Hello all,

I would like to propose a large change to LLVM that I would be happy to
implement.

The instruction selection legalizer knows of three different ways to
legalize an action (ignoring an already legal action).

   - Expansion
   - Promotion
   - Custom

Expanding a node will lead to one of two things - the operation will be
split into several equivalent smaller operations (i.e. 64-bit mul => two
32-bit multiplications), or it will be expanded into a different
operation(s) (i.e. SDIVREM => SDIV + SREM or a runtime library call).

This can be ambiguous - should a 128-bit SDIVREM be expanded into two
64-bit SDIVREMs or a 128-bit libcall.

It would be useful for LLVM to distinguish between these two operations -
i.e. instead of LegalizeAction::Expand, we have LegalizeAction::Expand and
LegalizeAction::Split.

   -

   Expand should always expand the node into a different pattern (such as
   MULHU expanding to MUL and a copy of the top half).
   -

   Split will always split an operation into several smaller, operations
   (such as 128-bit addition being split into a 64-bit addition and a 64-bit
   addition with carry)

Full disclosure: I’m working on a backend on which the pointer type is
illegal (Atmel AVR : 8-bit microcontroller with 16-bit pointers). The
problem with the ambiguity with expand today is that LLVM implicitly
assumes that as a 16-bit register class exists, then 16-bit
addition/subtraction should exist and there is no point splitting into two
8-bit operations.

I think your proposed solution is a bit of overkill for this problem, because
it impacts legalization of all SDNodes, when only a few are actually used
for pointer arithmetic.

Take a look at this thread:

http://comments.gmane.org/gmane.comp.compilers.llvm.devel/87321

It seems like there are some out of tree patches for handling illegal
pointer types. Maybe you can work with David to get these changes
upstream.

-Tom

I think your proposed solution is a bit of overkill for this problem, because
it impacts legalization of all SDNodes, when only a few are actually used
for pointer arithmetic.

I agree with you. I do however think that it has its advantages.

It seems like there are some out of tree patches for handling illegal
pointer types. Maybe you can work with David to get these changes
upstream.

There are a few out of tree patches, but they all work around the problem at hand.
The mailing list posting you linked was in fact made by a contributor of AVR-LLVM :slight_smile:

The solutions described in the posting:

  • Convoluted lowering code (with custom lowering for 16-bit operations)

    it lowers loads and stores during combine1 (before legalization), which is way less than ideal. It also uses 32-bit frameindexes
    (it treats the stack as 32-bit despite global pointers being 64-bit), so I don’t think that provides a>> usable template for a solution here, unfortunately.

  • Considering pointers to be of different types

    If it would help, one of the changes that we’ve made is to add separate PTRADD, INTTOPTR and PTRTOINT nodes to SelectionDAG, which makes it possible to keep pointer types distinct. We currently have
    a iFATPTR type, which is not ideal - we’d like to replace this with a set of PTR32 to PTR256 types. I’d be happy to prioritise extracting these patches for upstreaming if that would be useful to others.

This may not help in our situation, as 16-bit is not a native type but there are still several other operations that can be performed on 16-bit values (not just loads/stores).

This is a problem that has been encountered by several different backends, none of which have
found satisfactory solutions. The proposed fix would simply give a bit more power to backend
maintainers (although admittedly this is necessary for most targets).

Back to your previous point

I think your proposed solution is a bit of overkill for this problem, because
it impacts legalization of all SDNodes, when only a few are actually used
for pointer arithmetic.

The case you make is very strong, as the change proposed would break
dozens of backends.

We could however not cause this problem - a LegalizeAction::Split could be added
with the meaning described earlier - split an operation into several smaller operations.
Expand would continue to have its current meaning and operation - it would either split
or expand into different operations.

The advantage of this is that all backends could specify Expand for illegal operations, but
targets which need to be specifically split could then use Split instead.

This would:

  • Add the micro-optimization of decreasing code paths for ISel on each node. As the number of
    nodes can be in the tens or hundreds of thousands for a compilation, this could be fairly beneficial
  • Current backend maintainers would have the option to change their uses of Expand to Split where
    appropriate in order to be more explicit
  • In the future, if Split became widespread, it would be much easier to change the meaning of Expand if it were to be wanted
  • Cause no breakage

What do you think?

Are you primarily trying to avoid Expand being implemented as a lib call with a larger type? Are there other cases where you want to avoid a different type of expansion besides a call? Before I’ve wanted to have ExpandLibcall be a separate action from Expand, which is somewhat similar idea.

Are you primarily trying to avoid Expand being implemented as a lib call with a larger type?

No - I’m stuck in this situation:

  • 8-bit addition is legal
  • 16-bit addition is illegal, should expand into an add and an add with carry
  • A few operations support 16-bit operands - the 16-bit DREGS register class

The legalizer sees that 16-bit addition should expand. It sees that there is a 16-bit register class,
and it sees that 16-bit values are legal. As 16-bit is legal, it does not try and expand into a smaller type (8-bit).

This causes instruction selection to fail.

I can see why you’d want ExpandLibCall to be a separate action. Our overhanging problem is that
the legalizer only gives very coarse-grained control over the way it legalizes values.

This sounds like basically the same situation AMDGPU has with 64-bit integers. Since you have a 16-bit register class, you avoid terrible problems by not having i16 be a legal type. There is no 64-bit add, and needs to be expanded into 32-bit add + carry. There are fewer 64-bit operations available, but some are. It’s faked by saying 64-bit add is legal and the expand to add + carry is handled during instruction selection, which works and also makes addressing mode matching easier.

It’s faked by saying 64-bit add is legal and the expand to add + carry is handled during instruction selection, which works and also makes addressing mode matching easier.

This is basically what we do currently. So this proposal (or at least something similar) would also help the AMDGPU backend?

I plan on refactoring LegalizeIntegerTypes.cpp to make it more readable and less modular. After this I will submit a patch which adds a Split legalization action which always expands by splitting, and leaving Expand with its current meaning and not breaking any existing code. I believe it would be much more meaningful talking about this change with a prototype.