[RFC] Refactor class hierarchy of VectorType in the IR

Hi,

I am helping with the effort to implement scalable vectors in the codebase in order to add support for generating SVE code in the Arm backend. I would like to propose a refactor of the Type class hierarchy in order to eliminate issues related to the misuse of SequentialType::getNumElements(). I would like to introduce a new class FixedVectorType that inherits from SequentialType and VectorType. VectorType would no longer inherit from SequentialType, instead directly inheriting from Type. After this change, it will be statically impossible to accidentally call SequentialType::getNumElements() via a VectorType pointer.

Background:

Recently, scalable vectors have been introduced into the codebase. Previously, all vectors were written <n x ty> in IR, where n is a fixed number of elements known at compile time and ty is some type. Scalable vectors are written <vscale x n x ty>, where vscale is a runtime constant value. A new function, getElementCount(), has been added to VectorType (defined in llvm/IR/DerivedTypes.h); it returns an ElementCount, which is defined as follows in llvm/Support/TypeSize.h:

class ElementCount {
public:
  unsigned Min;
  bool Scalable;
};

Min is the minimum number of elements in the vector (the “n” in <n x ty> and <vscale x n x ty>), and Scalable is true if the vector is scalable (true for <vscale x n x ty>, false for <n x ty>). The idea is that if a vector is not scalable, then Min is exactly equal to the number of vector elements, but if the vector is scalable, then the number of vector elements is equal to some runtime-constant multiple of Min. The key takeaway here is that scalable vectors and fixed-length vectors need to be treated differently by the compiler. For a fixed-length vector, it is valid to iterate over the vector elements, but this is impossible for a scalable vector.
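To make the distinction concrete, here is a small sketch (the helper name is illustrative, not from any patch) of how an ElementCount is meant to be read:

// EC is assumed to come from some vector type's getElementCount().
// Fixed vector <4 x i32>:             Min == 4, Scalable == false
//   -> exactly 4 elements.
// Scalable vector <vscale x 4 x i32>: Min == 4, Scalable == true
//   -> vscale * 4 elements, with vscale known only at run time.
bool canIterateElements(const ElementCount &EC) {
  return !EC.Scalable; // only then is Min the exact element count
}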

Discussion:

The trouble is that every instance of VectorType has getNumElements() inherited from SequentialType. Prior to the introduction of scalable vectors, this function was guaranteed to return the number of elements in a vector or array. Today, a comment documents the fact that it returns only the minimum number of elements for scalable vectors; however, a great deal of code in the codebase now misuses getNumElements(). Some examples:

auto *V = VectorType::get(Ty, SomeOtherVec->getNumElements());

This code was previously perfectly fine but is incorrect for scalable vectors. When scalable vectors were introduced, VectorType::get() was refactored to take a bool indicating whether the vector is scalable. This bool has a default value of false. In this example, get() returns a non-scalable vector even if SomeOtherVec was scalable. This will manifest later, in some unrelated code, as a type mismatch between a scalable and a fixed-length vector.
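For illustration, the kind of fix being applied looks roughly like this (using the names from the example above and the ElementCount overload of VectorType::get()):

// Buggy: the scalable flag defaults to false, so V is always a
// fixed-length vector type even when SomeOtherVec is scalable.
auto *V = VectorType::get(Ty, SomeOtherVec->getNumElements());

// Fixed: propagate the full ElementCount (minimum count plus scalable
// flag) from the source vector type.
auto *V2 = VectorType::get(Ty, SomeOtherVec->getElementCount());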

for (unsigned I = 0; I < SomeVec->getNumElements(); ++I) { … }

Previously, since there was no notion of scalable vectors, this was perfectly reasonable code. However, for scalable vectors, this is always a bug.
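Until such call sites are updated, the only safe version of this loop is one that is explicitly guarded, roughly like this (a sketch; the surrounding names are assumed):

if (!SomeVec->isScalable()) {
  // Fixed-length vector: getNumElements() really is the element count.
  for (unsigned I = 0, E = SomeVec->getNumElements(); I != E; ++I) {
    // ... per-element work ...
  }
} else {
  // Scalable vector: the element count is a runtime multiple of the
  // minimum, so the transform must bail out or be expressed differently.
}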

With vigilance in code review, and good test coverage we will eventually find and squash most of these bugs. Unfortunately, code review is hard, and test coverage isn’t perfect. Bugs will continue to slip through as long as it’s easier to do the wrong thing.

One other factor to consider is that a great deal of code deals exclusively with fixed-length vectors. Any backend for which there are no scalable vectors should not need to care about their existence. Even on Arm, if Neon code is being generated, the vectors will never be scalable. In this code, the current status quo is perfectly fine, and any code checking whether a vector is scalable is just noise.

Proposal:

In order to support users who only need fixed width vectors, and to ensure that nobody can accidentally call getNumElements() on a scalable vector, I am proposing the introduction of a new FixedVectorType which inherits from both VectorType and SequentialType. In turn, VectorType will no longer inherit from SequentialType. An example of what this will look like, with some misc. functions omitted for clarity:

class VectorType : public Type {
public:
  static VectorType *get(Type *ElementType, ElementCount EC);
  Type *getElementType() const;
  ElementCount getElementCount() const;
  bool isScalable() const;
};

class FixedVectorType : public VectorType, public SequentialType {
public:
  static FixedVectorType *get(Type *ElementType, unsigned NumElts);
};

class SequentialType : public CompositeType {
public:
  uint64_t getNumElements() const { return NumElements; }
};

In this proposed architecture, VectorType does not have a getNumElements() function because it does not inherit from SequentialType. In generic code, users will call VectorType::get() to obtain a new instance of VectorType just as they always have. VectorType implements the safe subset of functionality of fixed and scalable vectors that is suitable for use anywhere. If the user passes false for the scalable parameter of get(), they will get back an instance of FixedVectorType. Users can then inspect its type and cast it to FixedVectorType using the usual mechanisms. In code that deals exclusively with fixed-length vectors, the user can call FixedVectorType::get() to get an instance of FixedVectorType directly, and their code can remain largely unchanged from how it was prior to the introduction of scalable vectors. At this time, there exists no use case that is only valid for scalable vectors, so no ScalableVectorType is being added.
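For illustration, code under the proposed hierarchy would look roughly like this (a sketch, not a patch from the series):

// Generic code: VectorType exposes only what is safe for any vector.
VectorType *VTy = VectorType::get(EltTy, EC); // EC is an ElementCount

// Code that genuinely needs per-element access opts in explicitly;
// the dyn_cast simply fails if VTy is a scalable vector.
if (auto *FVTy = dyn_cast<FixedVectorType>(VTy)) {
  for (unsigned I = 0, E = FVTy->getNumElements(); I != E; ++I) {
    // ... safe per-element processing ...
  }
}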

With this change, it is impossible in generic code to accidentally call getNumElements() on a scalable vector. If a user tries to pass a scalable vector to a function that expects a fixed-length vector, they will get a compilation failure at the site of the bug rather than a runtime error in some unrelated code. If a user attempts to cast a scalable vector to FixedVectorType, the cast will fail at the call site. This will make it easier to track down all the places that are currently incorrect, and it will prevent future developers from introducing bugs by misusing getNumElements().

Outstanding design choice:

One issue with this architecture as proposed is the fact that SequentialType (by way of CompositeType) inherits from Type. This introduces a diamond inheritance in FixedVectorType. Unfortunately, llvm::cast uses a c-style cast internally, so we cannot use virtual inheritance to resolve this issue. Thus, we have a few options:

Hi Chris,

Guarding against future bugs through type-safety is a welcome
initiative, thank you!

Proposal:

In order to support users who only need fixed width vectors, and to ensure that nobody can accidentally call getNumElements() on a scalable vector, I am proposing the introduction of a new FixedVectorType which inherits from both VectorType and SequentialType. In turn, VectorType will no longer inherit from SequentialType. An example of what this will look like, with some misc. functions omitted for clarity:

class VectorType : public Type {
public:
  static VectorType *get(Type *ElementType, ElementCount EC);
  Type *getElementType() const;
  ElementCount getElementCount() const;
  bool isScalable() const;
};

class FixedVectorType : public VectorType, public SequentialType {
public:
  static FixedVectorType *get(Type *ElementType, unsigned NumElts);
};

class SequentialType : public CompositeType {
public:
  uint64_t getNumElements() const { return NumElements; }
};

[snip]

Outstanding design choice:

One issue with this architecture as proposed is the fact that SequentialType (by way of CompositeType) inherits from Type. This introduces a diamond inheritance in FixedVectorType. Unfortunately, llvm::cast uses a c-style cast internally, so we cannot use virtual inheritance to resolve this issue. Thus, we have a few options:

[snip]

2. Remove CompositeType and break SequentialType’s inheritance of Type. Add functions to convert a SequentialType to and from Type. The conversion functions would work the same as those in option 1 above. Currently, there exists only one class that derives directly from CompositeType: StructType. The functionality of CompositeType can be directly moved into StructType, and APIs that use CompositeType can directly use Type and cast appropriately. We feel that this would be a fairly simple change, and we have a prototype implementation up at https://reviews.llvm.org/D75660 (Remove CompositeType class)

Pros: Removing CompositeType would simplify the type hierarchy. Leaving SequentialType in would simplify some code and be more typesafe than having a getSequentialNumElements on Type.
Cons: The value of SequentialType has also been called into question. If we wanted to remove it, now would be a good time. Conversion functions add complexity to the design. Introduces additional casting from Type.

3. Remove CompositeType and SequentialType. Roll the functions directly into the most derived classes. A helper function can be added to Type to handle choosing from FixedVectorType and ArrayType and calling getNumElements():

uint64_t getSequentialNumElements() const {
  assert(isSequentialType()); // This already exists and does the
                              // right thing
  if (auto *AT = dyn_cast<ArrayType>(this))
    return AT->getNumElements();
  return cast<FixedVectorType>(this)->getNumElements();
}

A prototype implementation of this strategy can be found at https://reviews.llvm.org/D75661 (Remove SequentialType from the type heirarchy.)

Pros: By removing the multiple inheritance completely, we greatly simplify the design and eliminate the need for any conversion functions. The value of CompositeType and SequentialType has been called into question, and removing them now might be of benefit to the codebase
Cons: getSequentialNumElements() has similar issues to those that we are trying to solve in the first place and potentially subverts the whole design. Omitting getSequentialNumElements() would add lots of code duplication. Introduces additional casting from Type.

Are the issues of getSequentialNumElements() really bigger than those
of cast<SequentialType>(foo)->getNumElements()?

FWIW, the removal of CompositeType is small enough that I'm between
option 2 and 3, personally.

Cheers,
Nicolai

Hi Chris,

Thanks for writing this up! I strongly support the proposal to add a FixedVectorType class to distinguish that type from the more generic (possibly scalable) VectorType. By having the code-base operate on 'FixedVectorType' rather than 'VectorType', we can gradually work to upgrade the code-base to support scalable vectors. This avoids bugs and it seems right conceptually.

On these three options, my first thought was "can we start by breaking FixedVectorType away from SequentialType" (adding a separate 'getNumElements()' method to FixedVectorType), until I figured this wouldn't be that much different from D75661. SequentialType will at that point be a pointless layer on top of ArrayType, so they could be squashed. It would however leave ArrayType and StructType as two independent types under CompositeType (is this an option 4?)

I'd be in favour of removing SequentialType and CompositeType altogether. They seem little used in practice and the places where they are used seem like they can be relatively easily updated to distinguish ArrayType, StructType and FixedVectorType separately.

Even when it requires some code duplication, I prefer being more explicit in distinguishing these types (option 3), over adding conversion functions between Type and Composite/SequentialType (options 1 and 2), especially when the conversion may not always be safe. I'm concerned that requiring conversion functions makes the code less readable, like this example from D65486:

  if (auto *STy = dyn_cast_or_null<llvm::SequentialType>(
                          llvm::CompositeType::get(OrigTy, false)))

Here the use of CompositeType::get(Type*) in the context of dyn_cast_or_null<llvm::SequentialType> seems a bit obscure to me.

If we are to choose for option 3, I'd suggest removing the interface to 'Type::getSequentialNumElements()' entirely, and replacing it by explicit `Type::getFixedVectorNumElements()` and `Type::getArrayNumElements()`, thus removing any methods that mimic the old design.
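To illustrate, those accessors could be as simple as the following (just a sketch, assuming the option 3 hierarchy where ArrayType and FixedVectorType each carry their own getNumElements()):

uint64_t Type::getArrayNumElements() const {
  return cast<ArrayType>(this)->getNumElements();
}

uint64_t Type::getFixedVectorNumElements() const {
  return cast<FixedVectorType>(this)->getNumElements();
}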

Thanks,

Sander

Sander De Smalen via llvm-dev <llvm-dev@lists.llvm.org> writes:

Even when it requires some code duplication, I prefer being more
explicit in distinguishing these types (option 3), over adding
conversion functions between Type and Composite/SequentialType
(options 1 and 2), especially when the conversion may not always be
safe. I'm concerned that requiring conversion functions makes the code
less readable, like this example from D65486:

  if (auto *STy = dyn_cast_or_null<llvm::SequentialType>(
                          llvm::CompositeType::get(OrigTy, false)))

Here the use of CompositeType::get(Type*) in the context of
dyn_cast_or_null<llvm::SequentialType> seems a bit obscure to me.

If we are to choose for option 3, I'd suggest removing the interface
to 'Type::getSequentialNumElements()' entirely, and replacing it by
explicit `Type::getFixedVectorNumElements()` and
`Type::getArrayNumElements()`, thus removing any methods that mimic
the old design.

+1.

                      -David

Sander,

   Thank you for your reply, allow me to address some of your points:

* Regarding the conversion functions

   We discussed it internally and our conclusion was that my CompositeType::get() and CompositeType::is() might be unpalatable to the community. We think it might be possible to specialize the casting templates such that cast(), dyn_cast(), and isa() work. Code like

  if (auto *STy = dyn_cast_or_null<llvm::SequentialType>(
                          llvm::CompositeType::get(OrigTy, false)))

... represents a more egregious case of this. But if I can get cast working, this will become

if (auto *STy = dyn_cast<llvm::SequentialType>(OrigTy))

... which is much nicer. If we can make this work, then the conversions will be just as safe as they ever were. Unfortunately, accomplishing this requires writing some pretty painful template code, and it's not really documented. The cast documentation calls out clang::Decl and clang::DeclContext as an example to emulate, but provides no further guidance. I suppose this might be a good exercise. Alternatively, I could add a SequentialType::get() and SequentialType::is(), and bypass the dyn_cast_or_null() call in your example. It's just a bunch of boilerplate I didn't want to do for potentially throw-away prototype code.

* Regarding just breaking FixedVectorType away from SequentialType, but leaving ArrayType a subclass

    I think this is not a good option. We will still have to rewrite all code that is generic over FixedVectorType and ArrayType, so we gain nothing, and the amount of work is likely the same, in addition to the drawbacks that you mentioned.

* Regarding using option 3 without getSequentialNumElements()

   This will result in a bunch of code that looks like this:

if (auto *ArrTy = dyn_cast<ArrayType>(Ty))
   doSomething(ArrTy->getNumElements(), Foo);
else
   doSomething(cast<FixedVectorType>(Ty)->getNumElements(), Foo);

   I count 8 places in https://reviews.llvm.org/D75661 where we call getSequentialNumElements(). 8 isn't _that many_ places, but it's enough to be annoying. The resulting branches would be doing literally the same thing; it just screams code duplication. I think I may have been a bit melodramatic in claiming it "subverts the design." Realistically, the implementation of getSequentialNumElements() never tries to cast to VectorType, only to ArrayType and FixedVectorType, so when given a scalable vector it will assert or return a garbage value at runtime. It also only calls getNumElements(), so it won't work on a scalable vector. I suppose the implementation of cast uses a c-style cast, which will eventually resort to a reinterpret_cast, so it may happen that the data layouts of FixedVectorType and VectorType are such that VectorType's ElementCount::Min and FixedVectorType's NumElements are at the same offset. I don't think we should defensively handle this situation; either we accept that the UB exists, or we reject the idea of getSequentialNumElements(). However, I assume enough people develop with asserts enabled that this won't be an issue.

   My personal preference is that we keep getSequentialNumElements() if we choose to go with option 3.

Thanks,
   Christopher Tetreault

Hi,

I just wanted to ping the mailing list on this RFC. So far, feedback has been pretty limited. Regarding the open question of how to handle the SequentialType class, I have received one vote for removing it entirely and not having a getSequentialNumElements() function.

Are there any other thoughts on this topic?

Thank you,

Christopher Tetreault

Hi,
I am helping with the effort to implement scalable vectors in the codebase in order to add support for generating SVE code in the Arm backend. I would like to propose a refactor of the Type class hierarchy in order to eliminate issues related to the misuse of SequentialType::getNumElements(). I would like to introduce a new class FixedVectorType that inherits from SequentialType and VectorType. VectorType would no longer inherit from SequentialType, instead directly inheriting from Type. After this change, it will be statically impossible to accidentally call SequentialType::getNumElements() via a VectorType pointer.

I’m sorry that I missed this thread when you posted it. I’m very much in favor of changing the type hierarchy to statically distinguish fixed from scalable vector types, but I think that making VectorType the common base type is unnecessarily disruptive. Practically speaking, this is going to break every out-of-tree frontend, backend, or optimization pass that supports SIMD types. Relatively little LLVM code will just naturally support scalable vector types without any adjustment. Following the principle of iterative development, as well as just good conservative coding practice, it’s much better for code that does support both to explicitly opt in by checking for and handling the more general type, rather than being implicitly “volunteered” to support both by having the VectorType type semantically repurposed out from under them.

I understand the argument that VectorType is a better name for the abstract base type, but in this case I don’t think that consideration justifies the disruption for the vast majority of LLVM developers. There are plenty of names you could give the abstract base type that adequately express that’s a more general type, and the historical baggage of VectorType being slightly misleadingly named if you’re aware of a particular largely-vendor-specific extension does not seem overbearing.

John.

Hi John,

I’d like to address some points in your message.

Practically speaking, this is going to break every out-of-tree frontend, backend, or optimization pass that supports SIMD types.

My understanding is that the policy in LLVM development is that we do not let considerations for downstream and out-of-tree codebases affect the pace of development. The C++ API is explicitly unstable. I maintain a downstream fork of LLVM myself, so I know the pain that this is causing because I get to fix all the issues in my internal codebase. However, the fix for downstream codebases is very simple: Just find all the places where it says VectorType, and change it to say FixedVectorType.

… by having the VectorType type semantically repurposed out from under them.

The documented semantics of VectorType prior to my RFC were that it is a generalization of all vector types. The VectorType contains an ElementCount, which is a pair of (bool, unsigned). If the bool is true, then the return value of getNumElements() is the minimum number of vector elements. If the bool is false, then it is the actual number of elements. My RFC has not changed these semantics. It will eventually delete a function that has been pervasively misused throughout the codebase, but the semantics remain the same. You are proposing a semantic change to VectorType to have it specifically be a fixed width vector.

… a particular largely-vendor-specific extension …

All SIMD vectors are vendor specific extensions. Just because most of the most popular architectures have them does not make this not true. AArch64 and RISC-V have scalable vectors, so it is not just one architecture. It is the responsibility of all developers to ensure that they use the types correctly. It would be nice if the obvious thing to do is the correct thing to do.

… it’s much better for code that does support both to explicitly opt in by checking for and handling the more general type …

This is how it will work. I am in the process of fixing up call sites that make fixed width assumptions so that they use FixedVectorType.

I think that it is important to ensure that things have clear sensible names, and to clean up historical baggage when the opportunity presents. The advantage of an unstable API is that you are not prevented from changing this sort of thing by external entities. Now is the time to fix the names of the vector types; a new type of vector exists, and we have the choice to change the names to reflect the new reality or to accumulate some technical debt. If we wait to address this issue later, there will just be more code that needs to be addressed. The refactor is fairly easy right now because pretty much everything is making fixed width assumptions. The changes are all fairly mechanical. If we wait until scalable vectors are well supported throughout the codebase, the change will just be that much harder.

Thanks,

Christopher Tetreault

Hi John,

I’d like to address some points in your message.

Practically speaking, this is going to break every out-of-tree frontend, backend, or optimization pass that supports SIMD types.

My understanding is that the policy in LLVM development is that we do not let considerations for downstream and out-of-tree codebases affect the pace of development.

This is not categorically true, no. When we make changes that require large-scale updates for downstream codebases, we do so because there’s a real expected benefit to it. For the most part, we do make some effort to keep existing source interfaces stable.

The C++ API is explicitly unstable. I maintain a downstream fork of LLVM myself, so I know the pain that this is causing because I get to fix all the issues in my internal codebase. However, the fix for downstream codebases is very simple: Just find all the places where it says VectorType, and change it to say FixedVectorType.

… by having the VectorType type semantically repurposed out from under them.

The documented semantics of VectorType prior to my RFC were that it is a generalization of all vector types. The VectorType contains an ElementCount, which is a pair of (bool, unsigned). If the bool is true, then the return value of getNumElements() is the minimum number of vector elements. If the bool is false, then it is the actual number of elements. My RFC has not changed these semantics. It will eventually delete a function that has been pervasively misused throughout the codebase, but the semantics remain the same. You are proposing a semantic change to VectorType to have it specifically be a fixed width vector.

Probably 99% of the code using VectorType semantically requires it to be a fixed-width vector. The generalization of VectorType to scalable vector types was a representational shortcut that should never have been allowed; it should always have used a different type. Honestly, I’m not convinced your abstract base type is even going to be very useful in practice vs. just having a getVectorElementType() accessor that checks for both and otherwise returns null.

… a particular largely-vendor-specific extension …

All SIMD vectors are vendor specific extensions. Just because most of the most popular architectures have them does not make this not true. AArch64 and RISC-V have scalable vectors, so it is not just one architecture. It is the responsibility of all developers to ensure that they use the types correctly. It would be nice if the obvious thing to do is the correct thing to do.

I’m not saying that we shouldn’t support scalable vector types. I’m saying that the burden of generalization should usually fall on the people who need to use the generalization, and otherwise we should aim to keep APIs stable when there’s no harm to it.

… it’s much better for code that does support both to explicitly opt in by checking for and handling the more general type …

This is how it will work. I am in the process of fixing up call sites that make fixed width assumptions so that they use FixedVectorType.

Are you actually auditing and testing them all to work for scalable vector types, or are you just fixing the obvious compile failures? Because scalable vector types impose some major restrictions that aren’t imposed on normal vectors, and the static type system isn’t going to catch most of them.

I think that it is important to ensure that things have clear sensible names, and to clean up historical baggage when the opportunity presents.

“Vector” has a traditional and dominant meaning as a fixed-width SIMD type, and the fact that you’ve introduced a generalization doesn’t change that. Clang supports multiple kinds of pointer, but we still reserve clang::PointerType for C pointers instead of making it an abstract superclass and thereby letting our sense of logic introduce a million bugs through accidental generalization throughout the compiler.

You have resigned yourself to doing a lot of work in pursuit of something that I really don’t think is actually an improvement.

John.

Chris, as I have been following this thread, it looks like you could get close to a win by keeping VectorType along with your additional changes, whereas a firm position against VectorType tends to put a win in doubt. Your changes could sit there for some time.

The question does not appear to be whether your changes are ideal in an eventual sense, but that going from where we are now to that eventual condition needs to be done carefully and incrementally. The solution appears to be getting agreement on an incremental path: consolidate what gain you can with a first step, and then see what can be done next.

Neil Nelson

John,

This is not categorically true, no. When we make changes that require large-scale updates for downstream codebases, we do so because there’s a real expected benefit to it. For the most part, we do make some effort to keep existing source interfaces stable.

While I’m at a loss to find a documented policy, I recall this thread (http://lists.llvm.org/pipermail/llvm-dev/2020-February/139207.html) where this claim was made and not disputed. The expected real benefits to this change are: 1) The names now match the semantics 2) It is now statically impossible to accidentally get the fixed number of elements from a scalable vector and 3) It forces everybody to fix their broken code. If we provided stability guarantees to downstream and out-of-tree codebases, then I might not agree that 3 is a benefit, but my understanding is that we do not provide this guarantee.

… Probably 99% of the code using VectorType semantically requires it to be a fixed-width vector.

This code is all broken already. I don’t think supporting common misuse of APIs in a codebase that does not provide stability guarantees is something we should be doing.

The generalization of VectorType to scalable vector types was a representational shortcut that should never have been allowed

I agree. Unfortunately it happened, and our choices are to fix it or accept the technical debt forever.

the burden of generalization should usually fall on the people who need to use the generalization, and otherwise we should aim to keep APIs stable when there’s no harm to it.

The burden of creating the generalization should be placed on those who need it, I agree. However, once the generalization is in place, the burden is on everybody to use it correctly. The reason I’ve undertaken this refactor is because I found myself playing whack-a-mole with people adding new broken code after the fact. The previous API was very easy to misuse, so I don’t blame those people.

Are you actually auditing and testing them all to work for scalable vector types, or are you just fixing the obvious compile failures?

Everywhere that VectorType::getNumElements() is being called, we are either changing the code to cast to FixedVectorType, or we are updating the code to handle scalable vectors correctly. If the call site does not have test coverage with scalable vectors, this test coverage is being added. Even for obviously correct transformations such as VectorType::get(SomeTy, SomeVecTy->getNumElements()) → VectorType::get(SomeTy, SomeVecTy->getElementCount()), I have been required in code review to provide test coverage. We are taking this seriously.

“Vector” has a traditional and dominant meaning as a fixed-width SIMD type, and the fact that you’ve introduced a generalization doesn’t change that. Clang supports multiple kinds of pointer, but we still reserve clang::PointerType for C pointers instead of making it an abstract superclass, thus letting our sense of logic introduce a million bugs through accidental generalization throughout the compiler.

Various languages implemented on top of LLVM have various different pointer types. However, the LLVM IR language only has one pointer type. The LLVM IR language has two types of vectors, and I think it’s reasonable to model them as a class hierarchy in this manner. I’m not familiar with the clang::PointerType situation, so I cannot pass judgement on it.

Thanks,

Christopher Tetreault

John,

This is not categorically true, no. When we make changes that require large-scale updates for downstream codebases, we do so because there’s a real expected benefit to it. For the most part, we do make some effort to keep existing source interfaces stable.

While I’m at a loss to find a documented policy, I recall this thread (http://lists.llvm.org/pipermail/llvm-dev/2020-February/139207.html) where this claim was made and not disputed. The expected real benefits to this change are: 1) The names now match the semantics 2) It is now statically impossible to accidentally get the fixed number of elements from a scalable vector and 3) It forces everybody to fix their broken code. If we provided stability guarantees to downstream and out-of-tree codebases, then I might not agree that 3 is a benefit, but my understanding is that we do not provide this guarantee.

… Probably 99% of the code using VectorType semantically requires it to be a fixed-width vector.

This code is all broken already. I don’t think supporting common misuse of APIs in a codebase that does not provide stability guarantees is something we should be doing.

The generalization of VectorType to scalable vector types was a representational shortcut that should never have been allowed

I agree. Unfortunately it happened, and our choices are to fix it or accept the technical debt forever.

Perhaps this is part of the root of our disagreement. I consider scalable vector types to be an experimental/unstable feature, as many features are when they’re first added to the compiler. I have much lower standards for disrupting the early adopters of features like that; if scalable vectors need to be split out of VectorType, we should just do it.

Analogously, when we upstream the pointer-authentication feature, we will be upstreaming a rather flawed representation that I definitely expect us to revise after the fact. That will be problematic for early adopters of LLVM’s pointer authentication support, but that’s totally acceptable.

John.

John,

For the last several months, those of us working on the scalable vectors feature have been examining the codebase, identifying places where llvm::VectorType is used incorrectly, and fixing them. The fact is that there are many places where VectorType is correctly taken to be the generic “any vector” type. getNumElements() may be getting called, but it is being called in accordance with the previously documented semantics. There are also many places where it isn’t, and many people keep adding new usages that are incorrect.

This puts us in an unfortunate situation: if we were to take your proposal and have VectorType be the fixed width vector type, then all of this work is undone. Every place that has been fixed up to correctly have VectorType be used as a universal vector type will now incorrectly have the fixed width vector type being used as the universal vector type. Since VectorType will inherit from BaseVectorType, it will have inherited the getElementCount(), so the compiler will happily continue to compile this code. However, none of this code will even work with scalable vectors because the bool will always be false. There will be no compile time indication that this is going on, functions will just start mysteriously returning nullptr. Earlier this afternoon, I set about seeing how much work it would be to change the type names as you have suggested. I do not see any way forward other than painstakingly auditing the code.

On the other hand, creating a new fixed-width vector type has a clear upgrade path: 1) delete getNumElements() from the base class locally, 2) try to build, 3) fix the failures, uploading patches for these fixes, and 4) once step 3 is completed throughout the codebase, merge the patch to remove getNumElements() from VectorType. Downstream and out-of-tree codebases have this upgrade path available as well. There exists no such easy upgrade path if we go the other way and have VectorType become a specifically fixed-width vector type.
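For example, step 1 can be done with a purely local edit that is never committed; something like the following (a sketch; simply removing the declaration works too):

// Local audit aid only: every remaining caller of
// VectorType::getNumElements() now fails to compile, so the build
// failures in step 2 point directly at the code that still needs updating.
class VectorType : public Type {
public:
  unsigned getNumElements() const = delete;
  // ... rest of VectorType unchanged ...
};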

Basically, I believe that the best thing to do is to move forward with the type names that I have proposed:

Hi John,

Not sure if I've missed this, but is there a reason that Chris' suggestion to do a mechanical search/replace of VectorType -> FixedVectorType wouldn't work for your local changes? Given that your downstream changes interpret VectorType as a fixed vector type, it should be safe to do that. It will also be an effort that you'll only need to do once when you next merge with upstream master, after that you can just use FixedVectorType in your downstream repo.

Chris has been doing some great work to split out fixed-width vectors from scalable vectors which is an important step towards preparing LLVM IR for scalable vectors. The bulk of mechanical changes have been done and a bunch of us are now working to ensure LLVM IR handles/distinguishes these vector types correctly. It would be a pity and great setback if that would be a wasted effort.

We have a fortnightly SVE/SVE2 sync-up call with people in the LLVM community interested in scalable vectors where we normally discuss these things. If you want to join the call next Thursday [1], it seems like a fair topic to bring up in that meeting?

Thanks,

Sander

[1] https://docs.google.com/document/d/1bkS8OKxAyawk6MVKrYkpYRqJtpw2iKGmHFUVWEDSyTQ

Hi John,

Not sure if I've missed this, but is there a reason that Chris' suggestion to do a mechanical search/replace of VectorType -> FixedVectorType wouldn't work for your local changes? Given that your downstream changes interpret VectorType as a fixed vector type, it should be safe to do that. It will also be an effort that you'll only need to do once when you next merge with upstream master, after that you can just use FixedVectorType in your downstream repo.

Chris has been doing some great work to split out fixed-width vectors from scalable vectors which is an important step towards preparing LLVM IR for scalable vectors. The bulk of mechanical changes have been done and a bunch of us are now working to ensure LLVM IR handles/distinguishes these vector types correctly. It would be a pity and great setback if that would be a wasted effort.

I’m not asking for that to be a wasted effort at all. I’m asking for fairly mechanical changes to some of your mechanical changes.

We have a fortnightly SVE/SVE2 sync-up call with people in the LLVM community interested in scalable vectors where we normally discuss these things. If you want to join the call next Thursday [1], it seems like a fair topic to bring up in that meeting?

If you’re willing to hold off that long, sure.

John.

John,

For the last several months, those of us working on the scalable vectors feature have been examining the codebase, identifying places where llvm::VectorType is used incorrectly, and fixing them. The fact is that there are many places where VectorType is correctly taken to be the generic “any vector” type. getNumElements() may be getting called, but it is being called in accordance with the previously documented semantics. There are also many places where it isn’t, and many people keep adding new usages that are incorrect.

This puts us in an unfortunate situation: if we were to take your proposal and have VectorType be the fixed width vector type, then all of this work is undone. Every place that has been fixed up to correctly have VectorType be used as a universal vector type will now incorrectly have the fixed width vector type being used as the universal vector type. Since VectorType will inherit from BaseVectorType, it will have inherited the getElementCount(), so the compiler will happily continue to compile this code. However, none of this code will even work with scalable vectors because the bool will always be false. There will be no compile time indication that this is going on, functions will just start mysteriously returning nullptr. Earlier this afternoon, I set about seeing how much work it would be to change the type names as you have suggested. I do not see any way forward other than painstakingly auditing the code.

If you define getElementCount() = delete in VectorType, you can easily find the places that are doing this and update them to use VectorBaseType. You wouldn’t actually check that in, of course; it’s a tool for doing the audit in a way that’s no more painstaking than what you’re already doing with getNumElements(). And in the meantime, the code that you haven’t audited — the code that’s currently unconditionally calling getNumElements() on a VectorType — will just conservatively not trigger on scalable vectors, which for most of LLVM is a better result than crashing if a scalable vector comes through until your audit gets around to updating it.

Additionally, for those who disagree that the LLVM developer policy is to disregard the needs of downstream codebases when making changes to upstream, I submit that not throwing away months of work by everybody working to fix the codebase to handle scalable vectors represents a real expected benefit. I personally have been spending most of my time on this since January.

I’m responding to this as soon as I heard about it. I’ll accept that ideally I would have seen it when you raised the RFC in March, although in practice it’s quite hard to proactively keep up with llvmdev, and as a community I think we really need to figure out a better process for IR design. I’m not going to feel guilty about work you did for over a month without raising an RFC. And I really don’t think you have in any way wasted your time; I am asking for a large but fairly mechanical change to the code you’ve already been updating.

But most of your arguments are not based on how much work you’ve done on your current audit, they’re based on the fact that scalable vectors were initially implemented as a flag on VectorType. So part of my problem here is that you’re basically arguing that, as soon as that was accepted, the generalization of VectorType was irreversible; and that’s a real problem, because it’s very common for early prototype work to not worry much about representations, and so they stumble into this kind of problematic representation.

My concern is really only ~50% that this is going to force a lot of unnecessary mechanical changes for downstream projects and 50% that generalizing VectorType to include scalable vectors, as the initial prototype did, is the wrong polarity and makes a lot of existing code broken if it ever sees a scalable vector. Your hierarchy change only solves this in the specific case that there’s an immediate call to getNumElements().

Of course, if the community generally disagrees with me that this is necessary, I’ll accept that. But right now I’m just hearing from people that are part of your project.

John.

John’s description is maybe slightly more conservative than how I’ve been seeing it: there is always a tension there and some balance to find, but we usually err on the side of doing what’s best upstream while avoiding “unnecessary churn”. Even without over-emphasizing downstream, other folks upstream are working on patches, and too much churn on the main APIs makes everyone else’s ongoing patches harder to carry over.

In the case of this feature, the angle seems maybe more that this is introducing something experimental, and we don’t know how much extra churn will occur until it stabilizes. With this angle, the priority is to reduce the disruption to the “stable”, long-lived VectorType: it isn’t immutable, but you don’t want to change it every other month for the next couple of years.

Ultimately I’d rather still consider what makes the most sense upstream for the design (where do we want to be in two years) and try to get there. This does not require preserving the APIs, but the chosen path to get there should likely favor doing “some” extra churn/detour when introducing the new concepts if that significantly reduces the churn to the “stable” existing classes/APIs. That may fit what John considered under “we do make some effort to keep existing source interfaces stable”? This is how I understand the approach for the “pointer-authentication feature” that was mentioned as an example.

Hopefully this makes sense?

Thanks again for all the refactoring work to introduce SVE, Chris! This is no small task :)

Best,

(reply inline)

Just to chime in, I generally agree with Eli’s framing here.

Additionally, I, as an out of tree user, don’t really care about what names we end up with. Pick what makes sense for the project, we’ll adapt.

Personally, I have no problem with the current state in tree. I’d prefer a bit of migration support in the public C API, but we don’t guarantee perfect stability, so the fact that downstream users might need to change doesn’t particularly bug me.

Philip

Mehdi,

I keep hearing that scalable vectors are experimental, and I don’t think that this is the case. The initial introduction of the type as a flag on VectorType was the experimental first step. Now we are taking the step where we consider how the initial step played out, and adapt with a well thought out solution.

This patch makes sure that the meaning of the long-lived “stable” fixed-width vector type is not impacted by scalable vectors or some other future vector type. If we make this change now, it will be done and we won’t have this problem in the future. The goal of this patch is to minimize future churn. If “99% of external code only cares about fixed width vectors”, then that code can be confident that its use of the type called FixedVectorType will not be impacted by other types of vectors, because the work was done to separate it out into its own thing.

Thanks,

Christopher Tetreault