Massive LLVM-C API break in LLVM 15+ unrelated to pointers

Currently there is a massive removal of LLVMConst* functionality in the LLVM-C API. This is happening without any deprecation whatsoever of the functions.

Here are some of the commits:
https://github.com/llvm/llvm-project/commit/11950efe06822590ff3ee4048df741136c5295bd
https://github.com/llvm/llvm-project/commit/4bb7b6fae3be02031878b2aa3be584c6627ad8ec
https://github.com/llvm/llvm-project/commit/7283f48a05de46fe8721ee6c29b1b6427e7d1a33
https://github.com/llvm/llvm-project/commit/5548e807b5777fdda167b6795e0e05432a6163f1

These are removing things like LLVMConstFAdd and family. So what is the big deal?

  1. Unlike in the C++ API, there is no way to directly manipulate a Value. The only methods available are these LLVMConst* functions. Note that the C API is used as the basis for bridging to non-C/C++ languages.
  2. Disregarding the need for the occasional deliberate folding, there is also the problem that creating some values from the LLVM C API is only possible using strings. For example, producing the value 1 << 127 on a 128-bit unsigned int. So it’s not just about constant folding but also about conveniently defining constant values.
  3. The suggestion is to use the LLVMBuild functions. There are big problems with this solution: (a) most uses of LLVMConst functions are to build global values, and there is no builder in the global context; (b) the check that the arguments are indeed const is an important sanity check on the code and describes the intention clearly. While there is a workaround of creating a “dummy” builder at global scope just to get around the need for a builder, this only highlights that the solution is broken.
  4. There is no need to remove these functions in the first place. The IR builder delegates folding to the ConstantFolder class, which doesn’t have any state. It would be trivial to continue supporting the functionality without breaking the API.
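
As a hedged aside on point 2: a wide constant can also be described as an array of 64-bit words instead of a decimal string. The sketch below only computes that word layout for 1 << 127 using the standard library; `words_for_single_bit` is a hypothetical helper, not an LLVM function. Passing the resulting words to `LLVMConstIntOfArbitraryPrecision` (word count plus a `uint64_t` array, least significant word first) is an assumption about how it would be used and should be checked against the `llvm-c/Core.h` of your LLVM version.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of LLVM): returns the 64-bit words,
// least significant word first, of the value 1 << bit_index in an
// integer that is total_bits wide.
std::vector<std::uint64_t> words_for_single_bit(unsigned bit_index,
                                                unsigned total_bits) {
    assert(bit_index < total_bits);
    // Round the width up to a whole number of 64-bit words, all zero.
    std::vector<std::uint64_t> words((total_bits + 63) / 64, 0);
    // Set the single requested bit in the word that contains it.
    words[bit_index / 64] = std::uint64_t{1} << (bit_index % 64);
    return words;
}
```

For 1 << 127 on a 128-bit integer this gives two words: the low word is zero and the high word has only its top bit set.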

So to summarize:

  • There was no deprecation procedure and this breaks pretty much any LLVM-C based backend.
  • The functionality fills a real need that the C++ API doesn’t really need to worry about.
  • The suggested replacement (LLVMBuild) doesn’t work as a replacement.
  • There wasn’t any need to remove them in the first place.

It looks like this house cleaning has only just started, so it will get worse for users of the LLVM-C API.

The release notes state this, so the API has changed and one will need to do something else to achieve the same end result.

Extract of release notes:
The following functions for creating constant expressions have been removed,
because the underlying constant expressions are no longer supported. Instead,
an instruction should be created using the LLVMBuildXYZ APIs, which will
constant fold the operands if possible and create an instruction otherwise:

  • LLVMConstExtractValue
  • LLVMConstInsertValue
  • LLVMConstUDiv
  • LLVMConstExactUDiv
  • LLVMConstSDiv
  • LLVMConstExactSDiv
  • LLVMConstURem
  • LLVMConstSRem

Three things to note:

  1. There is no replacement for the functions. It states that LLVMBuild* should be used. The problem is that a builder is only valid inside of a basic block, and not in global scope, where LLVMConst methods are typically used. So LLVMBuild is not a replacement.
  2. More functions are in line to be removed without deprecation notice. It seems even more functions are removed in LLVM 16. Again, these are not deprecated in LLVM 15, and because those removals are not mentioned, users of the LLVM-C API will again have to deal with a sudden loss of functionality.
  3. The API could be stable. There is no reason why this functionality cannot be retained by calling the folding directly.

I would have suspected that there is no replacement for these, as otherwise these calls would not have existed in the first place…

The question is rather whether it would have been possible to retain the interface and still implement it the new way LLVM does things. If the answer is yes, then it looks like sabotage of the C API, or just laziness.

There is still time to add functions back to LLVM15 before the release if we decide this is necessary and practical to do. We would just need to do it before -rc3 (Aug 23).

The LLVMConst* APIs are for creating constant expressions, not to invoke constant folding. The underlying constant expressions in the C++ API are gone; there’s no way to reproduce the general functionality of the removed APIs. (You could implement the subset of cases where it does constant folding, but then what would the API do if constant folding fails?)

If you want to propose new C APIs for constructing ConstantInt/ConstantFP/etc. values, or to invoke constant folding, that would be welcome. (Feel free to add me to review any patches.)

(We don’t promise a deprecation period for the C API; the C API is only “stable” in the sense that we won’t modify the signature/semantics of existing C APIs.)

@lerno
I am a user like you, so I did not actually make the change.
I think it would help if the release notes contained a link back to the thread or reasoning as to why they were removed.
I don’t remember the details, but I think it was actually for a good reason.
As a user, I moved away from the C-API some time ago, and now use the C++ API. The C-API appears to be an afterthought and misses a lot of the features that the C++ API provides.
It might be worth you at least considering that as an option.

It’s not possible to retain the same C API interface, and @lerno even mentioned some of the reasons. A basic block is required if these operations do not constant fold, since these operations are now instructions, not constants.

The best we could do to maintain C API stability is to transparently call the constant folder and return nullptr if the expression fails to constant fold. This moves the failure from compile time to runtime, which hides the bug and probably creates more problems than it solves.

The C API stability guarantee is documented here and says:

Stability Guarantees: The C API is, in general, a “best effort” for stability. This means that we make every attempt to keep the C API stable, but that stability will be limited by the abstractness of the interface and the stability of the C++ API that it wraps. In practice, this means that things like “create debug info” or “create this type of instruction” are likely to be less stable than “take this IR file and JIT it for my current machine”.

This is exactly the situation that occurred: some operations were removed (div, rem, extract constants), and the corresponding C functions to create those operations were removed.

I’d like to point out that we may still have the option to remove these changes from the release branch (risk of conflicts unclear), but so far as I can tell, we have been following our own policies. I’m sorry they are causing issues for C frontends, but as far as I can tell, this is to be expected.

Maybe I’m missing something but why not use Folder.FoldBinOp(Instruction::URem, LHS, RHS) and friends? It seems to be exactly the missing functionality.

What happens if said expression could not be created? Previously, this would create a constant expression. If you used the constant folder, this would result in a null value instead: you’ve changed the semantics of the function without changing its API or ABI. As Reid suggests, this would probably create more problems than it actually solves: I don’t think any of the other existing C API functions can return null values. Arguably, because of the shift in semantics, this would itself be a break in the C API stability guarantees.

In what cases would it not be created? Maybe I read the code wrong, but from what I see, the only way the Folder can fail is if the arguments aren’t constant. In which case the LLVMConst of today wouldn’t work either:

Value *FoldBinOp(Instruction::BinaryOps Opc, Value *LHS,
                 Value *RHS) const override {
  auto *LC = dyn_cast<Constant>(LHS);
  auto *RC = dyn_cast<Constant>(RHS);
  if (LC && RC) {
    if (ConstantExpr::isDesirableBinOp(Opc))
      return ConstantExpr::get(Opc, LC, RC);
    return ConstantFoldBinaryInstruction(Opc, LC, RC);
  }
  return nullptr;
}

Imagine the LHS were ptrtoint (@some_global to i64) and RHS were, say, i64 39. Both LHS and RHS would be a Constant, but ConstantFoldBinaryInstruction would return nullptr since there is no longer a urem ConstantExpr.
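
To make that case concrete, here is a toy model in plain C++. It uses no LLVM types at all, and every name in it (`IntLiteral`, `SymbolAddress`, `ToyConstant`, `toy_fold_urem`) is made up for illustration. It only sketches the idea that both operands can be "constant" while the result still cannot be reduced to a number, so a fold-only function has to be able to report failure.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <variant>

// Toy stand-ins, NOT LLVM types. A constant is either a known integer or
// a symbolic value (think ptrtoint of a global) whose numeric value is
// not known at compile time.
struct IntLiteral { std::uint64_t value; };
struct SymbolAddress { std::string global_name; };
using ToyConstant = std::variant<IntLiteral, SymbolAddress>;

// Hypothetical fold-or-fail helper: yields a folded literal only when
// both operands are known integers, and std::nullopt otherwise.
std::optional<IntLiteral> toy_fold_urem(const ToyConstant &lhs,
                                        const ToyConstant &rhs) {
    const auto *l = std::get_if<IntLiteral>(&lhs);
    const auto *r = std::get_if<IntLiteral>(&rhs);
    if (!l || !r)
        return std::nullopt;  // symbolic operand: there is no number to compute
    if (r->value == 0)
        return std::nullopt;  // choice made by this toy: refuse remainder by zero
    return IntLiteral{l->value % r->value};
}

// Small usage demo of both outcomes.
void toy_fold_demo() {
    // Two literals fold to a literal: 100 % 39 == 22.
    assert(toy_fold_urem(IntLiteral{100}, IntLiteral{39})->value == 22);
    // A symbolic address is still a "constant", but nothing can be folded.
    assert(!toy_fold_urem(SymbolAddress{"some_global"}, IntLiteral{39}));
}
```

In this model the caller has to check the result before using it, which is the behavioral change relative to a function that always returned a valid constant expression.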

A nullptr that can be asserted against. I don’t understand: did LLVMConstURem ever support ptrtoint % int?

Yes, it did. It used to create an object of ConstantExpr type with the opcode set to urem. This is no longer possible since this kind of constant expression was removed. As a consequence, the old behavior of LLVMConstURem can no longer be implemented. Since there is an interface break either way, it’s safer to remove the function outright.

As others have pointed out, it is a reasonable option to add a new function to the C API which calls the constant folder and is allowed to return null.