[RFC] Scalable Vector Constants Splats

I’m proposing a change to the way scalable vector constant splats are represented. Today, non-zero cases are forced to use ConstantExpr, via the same insert and shuffle idiom used for non-constant vector splats. This is unnecessarily complex and a barrier to the effort to remove most ConstantExpr operations[1].
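To make the insert and shuffle idiom concrete, here is a minimal C++ sketch that only builds the textual spelling of such a constant expression. The helper names are hypothetical (they are not LLVM APIs), and the exact spelling (a poison placeholder and an i64 index) is an assumption based on how recent LLVM releases print these constants; older releases differ.

```cpp
#include <string>

// Hypothetical helper (not an LLVM API): spell a scalable vector type,
// e.g. scalableVectorType(4, "i32") gives "<vscale x 4 x i32>".
std::string scalableVectorType(unsigned minElts, const std::string &eltTy) {
  return "<vscale x " + std::to_string(minElts) + " x " + eltTy + ">";
}

// Hypothetical helper: spell the constant expression used for a non-zero
// scalable splat. The value is inserted into lane 0 of a placeholder vector,
// then an all-zero shuffle mask broadcasts lane 0 to every lane.
// Assumption: "poison" placeholder and "i64" index, as in recent LLVM output.
std::string constantExprSplat(unsigned minElts, const std::string &eltTy,
                              const std::string &value) {
  const std::string vecTy = scalableVectorType(minElts, eltTy);
  return "shufflevector (" + vecTy + " insertelement (" + vecTy + " poison, " +
         eltTy + " " + value + ", i64 0), " + vecTy + " poison, " + vecTy +
         " zeroinitializer)";
}
```

Calling constantExprSplat(4, "i32", "1") yields a nested expression that mentions the vector type four times to encode a single immediate, which is the complexity the proposal aims to remove.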

From a textual IR point of view I’m proposing something akin to <vscale x N x Ty> <Ty #imm, …> as the representation of constant scalable vector splats. Textually this looks trivial to support but the C++ representation is a little more complicated. Two options spring to mind:

  1. Make ConstantVector support scalable vectors.

Pros: Existing type so no significant plumbing (e.g. LLVMContext) required.
Cons: Existing code largely assumes the type contains a fixed number of items that can be iterated across.

  2. Create a new ConstantVectorSplat container.

Pros: New type so no legacy code to worry about.
Cons: Yet another Constant container type.
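The trade-off between the two options can be sketched with a toy model. Everything below is hypothetical and self-contained; it is not LLVM code. It illustrates why code that iterates over stored elements has nothing to iterate for a scalable type, while a splat only needs the element count and one value.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Toy stand-in for an element count: N elements when fixed, or
// (vscale x N) elements when scalable, with vscale unknown at compile time.
struct ToyElementCount {
  unsigned minElts;
  bool scalable;
};

// Option 1 style: one stored operand per element. This only works when the
// number of elements is known at compile time.
std::optional<std::vector<std::int64_t>>
materializeElements(ToyElementCount ec, std::int64_t splatValue) {
  if (ec.scalable)
    return std::nullopt; // vscale is unknown, so there is no list to build.
  return std::vector<std::int64_t>(ec.minElts, splatValue);
}

// Option 2 style: a dedicated splat stores the count and a single value,
// so fixed and scalable counts are handled the same way.
struct ToySplat {
  ToyElementCount count;
  std::int64_t value;
};
```

Under this model, existing callers of the per-element path would need a guard for the scalable case, whereas a separate splat type never exposes a per-element list in the first place.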

Questions:

  1. Am I correct in wanting to move away from a ConstantExpr based implementation?
  2. Are there significant out-of-tree uses of ConstantVector that would make a change of interface unpalatable?
  3. Are there other options I should consider?
  4. What about fixed-length vectors? The IR for a 512-bit i1 vector splat is pretty long so is there appetite to unify the textual representation across fixed-length and scalable vectors?

Note:

I’m not proposing this as a route to support scalable vector globals. These will remain illegal as they are today.

[1] [RFC] Remove most constant expressions

Between those two options, I’d go for ConstantVectorSplat. ConstantVector and ConstantVectorSplat will have different representation (the latter only containing a single element), so I think they should be separate.

For consumers this should be mostly transparent, because they should be using Constant::getSplatValue() anyway.
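As a rough illustration of why a separate container can stay transparent to consumers, here is a toy C++ hierarchy. The class names echo the ones in this thread, but this is a hypothetical model and not the LLVM implementation; only the idea of querying a splat value through a common interface is taken from the discussion.

```cpp
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Toy constant hierarchy (hypothetical, not LLVM code).
struct ToyConstant {
  virtual ~ToyConstant() = default;
  // Returns the repeated element if every lane holds the same value.
  virtual std::optional<std::int64_t> getSplatValue() const = 0;
};

// Fixed-length vector: stores every element and compares them on demand.
struct ToyConstantVector : ToyConstant {
  std::vector<std::int64_t> elts;
  explicit ToyConstantVector(std::vector<std::int64_t> e)
      : elts(std::move(e)) {}
  std::optional<std::int64_t> getSplatValue() const override {
    if (elts.empty())
      return std::nullopt;
    for (std::int64_t e : elts)
      if (e != elts.front())
        return std::nullopt;
    return elts.front();
  }
};

// Splat container: stores a single element, so it is always a splat.
struct ToyConstantVectorSplat : ToyConstant {
  std::int64_t elt;
  explicit ToyConstantVectorSplat(std::int64_t e) : elt(e) {}
  std::optional<std::int64_t> getSplatValue() const override { return elt; }
};

// A consumer that only asks for the splat value works with either container.
bool isSplatOfAllOnes(const ToyConstant &c) {
  return c.getSplatValue() == std::optional<std::int64_t>(-1);
}
```

A consumer written like isSplatOfAllOnes never needs to know which representation it was handed; only code that walks individual elements would have to change.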

An alternative, suggested by nhaehnle in post #24 of [RFC] Remove most constant expressions, was to allow ConstantInt/ConstantFP to have vector type, acting as a splat in that case. This is nice in some ways, but it is certainly the most intrusive solution.

On moving away from a ConstantExpr based implementation: yes :) Apart from all the usual reasons, the fact that scalable vector splats use constant expressions means that we often refuse to perform optimizations on them. git grep m_ImmConstant | wc -l says there are at least 80 optimizations that don’t get applied to scalable vector splats.
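The effect can be sketched with a toy matcher. This is a hypothetical model, not LLVM code, and it assumes that m_ImmConstant accepts a constant only if it neither is nor contains a constant expression.

```cpp
#include <memory>
#include <vector>

// Toy constant node (hypothetical, not LLVM code).
struct ToyNode {
  // True for nodes such as the insertelement/shufflevector splat idiom.
  bool isConstantExpr = false;
  std::vector<std::shared_ptr<ToyNode>> operands;
};

// True if the node is a constant expression or has one among its operands.
bool containsConstantExpr(const ToyNode &c) {
  if (c.isConstantExpr)
    return true;
  for (const auto &op : c.operands)
    if (op && containsConstantExpr(*op))
      return true;
  return false;
}

// Toy analogue of an "immediate constant" check (assumed behaviour): accept
// only constants that are free of constant expressions.
bool matchesImmConstant(const ToyNode &c) { return !containsConstantExpr(c); }
```

In this model a scalable splat encoded as a constant expression is rejected, so any fold guarded by the check is skipped, while a natively represented splat would pass.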

On unifying the textual representation across fixed-length and scalable vectors: I think so. The in-memory representation is going to be different, but it makes sense to support the same syntax for scalable and fixed vectors.

On the syntax bikeshed, I’m going to throw <vscale x 4 x i32> splat (i32 -1) out there, which would be the way we’d normally spell a constant expression.

From a pure human readability perspective, having a concise textual description for a scalable splat constant would definitely help.

I vote that large fixed length constants should get the same syntactic sugar as scalable vectors.
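To give a feel for the size difference, the following C++ sketch builds both spellings as plain strings. The helper names are hypothetical, the element-by-element form assumes the usual `<N x Ty> <Ty v, Ty v, ...>` layout for fixed-length vector constants, and the short form follows the `splat (Ty value)` spelling suggested earlier in the thread.

```cpp
#include <string>

// Hypothetical helper: element-by-element spelling of a fixed-length splat,
// e.g. "<2 x i1> <i1 true, i1 true>".
std::string elementwiseSplat(unsigned numElts, const std::string &eltTy,
                             const std::string &value) {
  std::string out = "<" + std::to_string(numElts) + " x " + eltTy + "> <";
  for (unsigned i = 0; i < numElts; ++i) {
    if (i != 0)
      out += ", ";
    out += eltTy + " " + value;
  }
  return out + ">";
}

// Hypothetical helper: the "splat (Ty value)" spelling suggested in the
// thread, usable with a fixed or scalable vector type string.
std::string sugaredSplat(const std::string &vecTy, const std::string &eltTy,
                         const std::string &value) {
  return vecTy + " splat (" + eltTy + " " + value + ")";
}
```

Comparing elementwiseSplat(512, "i1", "true").size() with sugaredSplat("<512 x i1>", "i1", "true").size() shows several thousand characters against a few dozen, which is the readability argument for sharing the sugar across fixed-length and scalable vectors.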

I like @nikic’s suggested <vscale x 4 x i32> splat (i32 -1) syntax.

The only meta question I’d ask here is, do we actually need vector splat constants at all? Or should these be instruction sequences - possibly with some syntactic sugar on a splat instruction - instead?

While I raise the question, I do want to advocate for not blocking your work on the answer. Even if your proposal ends up being “just” a stepping stone towards another result, I think being able to get rid of the shufflevector constant expression is valuable enough on its own to justify this work.

We don’t strictly need scalable vector splat constants in the sense that vector splat constants can only show up inside functions, so we can always insert an instruction somewhere. But a lot of optimizations assume it’s possible to construct a splat constant, so I think we wouldn’t want to go down that path unless the plan is to completely get rid of vector constants.

Thanks for the feedback. I like the idea of using ConstantInt/ConstantFP, especially given we already use their get interface for vector types. I’ll dig into that a bit more to identify any pain points.

I created llvm/llvm-project pull request #74502, “[LLVM][IR] Add native vector support to ConstantInt & ConstantFP” (by paulwalker-arm), for the initial IR parsing/printing support. There looks to be a longish tail of work required to port all vector types, mainly in places that expect a scalar type. However, this looks like just work rather than a significant reason not to extend ConstantInt/ConstantFP.

Given my primary objective is to move away from ConstantExpr, I’m likely to prioritise the scalable vector code paths. The fixed-length handling is more about consistency than anything more fundamental.

This all assumes we remain happy with the approach.