[llvm-commits@lists.llvm.org: Re: [llvm] 2dea3f1 - [SVE] Add new VectorType subclasses]

Bringing this up on llvm-dev for more general attention.

The problem here is two fold:

(1) Reuse of enumeration values is just a major no-go.
(2) I'm not sure why the existing vector types had to be killed
completely.

But something clearly has to be done here. This majorly affects e.g.
Mesa.

Joerg

----- Forwarded message from Joerg Sonnenberger via llvm-commits <llvm-commits@lists.llvm.org> -----

Agreed. Those llvm-c changes are wrong, and compatibility needs to be maintained for the numeric values at minimum. Probably also would be a good idea to make LLVMVectorTypeKind a deprecated alias for LLVMFixedVectorTypeKind.

Joerg,

   First of all, here is a link to the original RFC for this that explains the motivation: http://lists.llvm.org/pipermail/llvm-dev/2020-March/139811.html. In short, VectorType used to be a pair of (bool, unsigned), where the bool is true if the vector is scalable. If the bool is false, the unsigned is the exact number of elements in the vector. If the bool is true, the unsigned is the minimum number of elements in the vector. Unfortunately, scalable vectors are a new concept, and there existed a ton of code that assumed that vectors can only be fixed width. Thus, the bool was largely ignored, and bugs were everywhere. The choice was to painstakingly educate everyone, and be constantly vigilant against new bugs, or to introduce a more strongly typed solution. The new scheme is much less error prone because (when the refactor is complete), VectorType will not have a getNumElements() method. Users must either call getElementCount() which returns the pair that they must then explicitly acknowledge, or they must cast to FixedVectorType if they want to call getNumElements(). Either way, at that point if they manage to commit a bug, it will be due to negligence on their part rather than a "you just had to know" situation.

   As for point 1, this is a work in progress. Previously, you could construct a fixed width vector by calling LLVMVectorType(), you could ask if a Type was some sort of vector by checking if it's LLVMTypeKind is LLVMVectorTypeKind, and you could get the unsigned value in the ElementCount by calling LLVMGetVectorSize(). You had no way of knowing if a vector was scalable, or handling it correctly. Now, you can construct a FixedWidthVector by calling LLVMVectorType(), you can check if a vector is a fixed width or scalable vector by inspecting the LLVMTypeKind, and you can correctly handle scalable vectors by correctly interpreting the return value of LLVMGetVectorSize(). The C api is strictly more correct than it was previously. Really, the only thing that must be done to make the API complete is to add a way to construct a scalable vector. In C++, the recommended way of telling if a vector is a scalable vector is to call isa<ScalableVectorType>(Ty), so the current C api already closely resembles this.

   There are only so many hours in the day, and my objective is to fix the issues with the C++ api. I have not spent a tremendous amount of time thinking about how to architect the C api for scalable vectors. That said, I have this patch up https://reviews.llvm.org/D78599 that provides a way to specifically get the bool from a vector, and to construct a scalable vector. If you need to be able to construct scalable vector objects via the C api, then feel free to bring this patch over the finish line or submit a patch that you feel is more in line with typical C api usage patterns.

Thanks,
   Christopher Tetreault

Regarding the numerical value of the LLVMTypeKind enum, my understanding is that LLVM-C does not promise to maintain ABI compatability between versions. If I am mistaken, I can fix this issue.

We can’t guarantee everything. But, the goal is to not break things in the C API, unless necessary. And, when it is necessary, to make sure it’s as obvious as possible of an ABI breakage. E.g., we might delete a function, and create a new one with a new name, instead of incompatibly changing the number of arguments for an existing function name. Or, might leave a gap in enum values where an item is removed.

For your change, modifying the numeric values of the existing constants is an unnecessary change, and a not-obvious ABI-incompatibility. Just add LLVMScalableVectorTypeKind to the end, instead, and it’s fine.

And, since the C API basically doesn’t support scalable vectors yet, and since you aren’t modifying the rest of the C API’s uses of “VectorType” to FixedVectorType (e.g. the LLVMVectorType function is still named that), I’d suggest simply leaving the enum value for Fixed vectors named “LLVMVectorTypeKind” as it was before, instead of renaming it to LLVMFixedVectorTypeKind.

The way I see it, the enum change brought no positive outcome, so
shouldn't be done. It's not because we don't guarantee backward
compatibility that we should trivially break it for no benefit.

ISTM that not supporting scalable vectors properly in the llvm-c API is
no showstopper right now. But the full breakage of the fixed vector API
is a real concern.

Joerg

The benefit of not having the gap is not having to spend a bunch of time and effort on ABI maintenance. I actually left LLVMVectorTypeKind in at first, and was asked to remove it in code review. Then, I left a gap in the enum where LLVMVectorTypeKind used to be and was asked in code review to remove the gap.

As for not exposing the names, I think that would be a mistake. The whole point of refactoring vector types is to expose the fact that scalable vectors exist and have to be dealt with. If a ScalableVectorType object is handed to you from outside of your code, you need to be able to deal with it. Having the ability to correctly handle a scalable vector and having the ability to create one are two separate things. The C api currently has the first ability but not the second. Adding the ability to construct scalable vectors from C still remains to be implemented. There’s a lot of outstanding work. It all needs to get done, but with limited resources, some tasks will get prioritized over others. I would argue that since support for scalable vectors is still a work in progress, you should probably not be constructing them and passing them into it.

It seems to me that you don’t actually care about constructing scalable vectors (otherwise you’d be asking me to complete the work rather than back it out), so the C api should fully support your use case. You should just update your codebase to correctly handle scalable vs fixed width vectors. That way, when the day comes that LLVM fully supports scalable vectors, you will be ready.

It has been pointed out to me that my choice of words could easily be misinterpreted to mean that the people I am responding to do not care about the quality of the code. I’d like to apologize for this.

When I say “It seems to me that you don’t actually care about constructing scalable vectors …”, I mean that to complete your work, you do not need to be able to construct scalable vectors. I do not want to imply that you do not care about the quality of the code, or that this feature works.

Thanks,

Christopher Tetreault

The benefit of not having the gap is not having to spend a bunch of time and effort on ABI maintenance. I actually left LLVMVectorTypeKind in at first, and was asked to remove it in code review. Then, I left a gap in the enum where LLVMVectorTypeKind used to be and was asked in code review to remove the gap.

We attempt not to break ABI and API, if we don’t need to. It is not an ironclad guarantee, because sometimes things may change so fundamentally on the C++ side, so as to make keeping C API compatibility effectively impossible. But it is “best effort” attempt. I’m sorry you got contrary advice from a reviewer.

As for not exposing the names, I think that would be a mistake.

I’m not sure what you mean by this. I want you to expose the new scalable vector kind as LLVMScalableVectorTypeKind (as you have), and put it at the end of the list to not change existing enum values. I want you to leave the existing C API name LLVMVectorTypeKind as referring to the renamed on the C++ side ‘Type::FixedVectorTyID’.

If, later on, it seems useful to also use the new “FixedVectorType” terminology in the C API, that can be done. When doing so, be consistent, and also rename the constructor from “LLVMVectorType” to “LLVMFixedVectorType”. Additionally, please leave the old names as deprecated aliases. It doesn’t feel to me that such a renaming is really worthwhile, though, and I’d suggest it best to just leave the old names alone.

The whole point of refactoring vector types is to expose the fact that scalable vectors exist and have to be dealt with. If a ScalableVectorType object is handed to you from outside of your code, you need to be able to deal with it. Having the ability to correctly handle a scalable vector and having the ability to create one are two separate things. The C api currently has the first ability but not the second. Adding the ability to construct scalable vectors from C still remains to be implemented. There’s a lot of outstanding work. It all needs to get done, but with limited resources, some tasks will get prioritized over others. I would argue that since support for scalable vectors is still a work in progress, you should probably not be constructing them and passing them into it.

I agree, it’s fine that you haven’t implemented full scalable vector support yet in the C API.

It seems to me that you don’t actually care about constructing scalable vectors (otherwise you’d be asking me to complete the work rather than back it out), so the C api should fully support your use case. You should just update your codebase to correctly handle scalable vs fixed width vectors. That way, when the day comes that LLVM fully supports scalable vectors, you will be ready.

I don’t have a codebase affected by this, I care because users out in the wide world do have such codebases, and we have a policy of not breaking them unless it’s necessary. If you’d like to do more work and fully implement the proper LLVM-C interfaces to support scalable vectors, that’s fine too. But I’m not asking for that, because it would be new functionality, rather than simply not breaking old functionality.