[RFC] Moving OCaml bindings to peripheral tier and disabling them by default

Is there an LLVM stance on the in-tree OCaml bindings? They’re turned on by default (-DLLVM_ENABLE_BINDINGS=ON is the default), but they only actually get built if you have a bunch of OCaml libraries/tools installed.

I broke the OCaml bindings and some bots complained. It was non-trivial to figure out what I had to install to get the bindings building locally.

What do people think about making the OCaml bindings off by default (default to -DLLVM_ENABLE_BINDINGS=OFF) and putting the bindings in the LLVM peripheral tier to be maintained by interested people on a best effort basis?

IMO the bindings are not a core part of LLVM that everybody should strive to keep working, but I’d like other people’s opinions.
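
The "on by default, but only if the OCaml tools are installed" behavior described above can be modeled roughly as follows. This is a simplified sketch, not LLVM's actual CMake logic: the function name and parameters are hypothetical, and the choice of ocamlfind and ctypes as the prerequisites is an assumption taken from the opam install command quoted later in this thread.

```python
import shutil
from typing import Callable, Optional


def ocaml_bindings_would_build(
    enable_bindings: bool = True,
    find_tool: Callable[[str], Optional[str]] = shutil.which,
    ctypes_installed: bool = False,
) -> bool:
    """Simplified model of the behavior discussed here (hypothetical helper)."""
    # Models -DLLVM_ENABLE_BINDINGS=OFF: the bindings are never built.
    if not enable_bindings:
        return False
    # Assumption: finding ocamlfind on PATH stands in for "OCaml tools installed".
    has_ocamlfind = find_tool("ocamlfind") is not None
    # Assumption: the caller reports whether the OCaml ctypes library is present.
    return has_ocamlfind and ctypes_installed
```

Under this model, a contributor without the OCaml toolchain gets False even with the switch left at its default, which is why local builds rarely exercise the bindings while bots that happen to have the tools installed do.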

This proposal makes sense to me. The OCaml bindings are not a core part of LLVM, and many language bindings exist out of tree. Most people don’t install the OCaml tools or build the bindings, so they don’t notice breakage beforehand. There is also a significant language barrier that keeps the majority of contributors from helping to maintain them. I don’t much mind the OCaml bindings staying in the tree, but the responsibility can be shifted to those who use them.

Yes, they’ve broken a few times in the past and nobody noticed:

@mgorny only realised because we package them for completeness. IIRC there was another problem with them too.

Long story short, a long time ago I was about to split the LLVM package in Gentoo, and the OCaml bindings did not support building out-of-tree back then. I asked our users whether anybody needed them, received a positive response, and went on to make them build out-of-tree.

I didn’t know much about building OCaml stuff then (and I don’t remember much of it now), so I mostly relied on help. From what I gathered back then, the bindings weren’t being installed correctly before. We’ve fixed all that.

I honestly don’t know if anyone is really using these bindings today. However, they are definitely undermaintained, and if they move to a lower support tier, they’re probably going to die. Not that I mind, just pointing that out.

That said, perhaps it’d be worth trying to reach out to people who reviewed OCaml-related fixes in the past.

I’m in favor of moving them to the peripheral tier, but I think if we reach out to people and can’t find a Code Owner, then we should just drop them completely. I don’t think there is much point having them in tree if no one is maintaining them.

I think @alan did some work on OCaml APIs recently.

Thanks for the ping! I’m currently working on a pretty involved patch to make the bindings compatible with OCaml 5: D136400 [llvm-ocaml] Migrate from naked pointers to prepare for OCaml 5

One reason I decided to learn OCaml was that it is good for writing compilers and has official LLVM bindings. I think it would be a shame if the bindings were relegated to a lower status. I would also feel disappointed because I’ve spent a lot of time working on the patch to port the bindings to OCaml 5.

From the git history, @whitequark was the former code owner of the bindings, but resigned. Since my patch required me to touch pretty much every part of the OCaml bindings, I would be willing to be the new code owner. The only caveat is that I’m not familiar enough with the LLVM project to understand all of the functions that are being bound.

Let me ping @jberdine, who has also worked on the bindings, and @kit-ty-kate, who maintains the OCaml package repository (including tweaking the LLVM OCaml bindings for distribution) and also uses them for her own projects.

Peripheral tier in this case still means they’ll be in-tree and buildable, just not on by default. It’s a tradeoff between LLVM contributors with no real OCaml expertise having to understand OCaml and the bindings (and properly install dependencies) to fix breakages, and a set of people who do understand how to build and test the bindings fixing breakages after the fact. I think having just the subcommunity that cares about the OCaml bindings look after them is the correct tradeoff here. As long as there are buildbots that build the OCaml bindings and only notify interested individuals, I don’t think it’ll be too much extra work.

I appreciate that there is overhead in setting things up to build the OCaml bindings, and that that causes some friction and drag. I would like to note though that on the other hand there is genuine value that comes from them being in-tree and on by default. While many patches mainly remove code that does not compile any more, some patches by authors who are not primarily focused on the OCaml bindings e.g. add support for new APIs. For one concrete example, see here. I don’t know if anyone who is focused on the OCaml bindings can spend enough time on LLVM to follow core developments closely. So it is very helpful when people who are more familiar with those changes make such changes to the bindings. How the cost/benefit ratio works out, I can’t say objectively, as I rely on the OCaml bindings, but can only dedicate a small fraction of my time to LLVM.

I expect that if the OCaml bindings are not on by default, then they will probably not receive such patches from authors who are not primarily focused on the bindings. I think they would wither and die pretty quickly after that, leading to potentially multiple unofficial out-of-tree bindings.

I would also like to hear from @vaivaswatha and @arbipher to get their views.

I think we should go ahead and disable the OCaml bindings by default now. With OCaml 5, and after running opam install ocamlfind ctypes, check-llvm-tools-bindings fails with many undefined symbol linker errors.

This is bad for the build bots that happen to have ocamlfind and ctypes installed.

Difficulty finding maintainers and reviewers is a sign of how small the subcommunity is. I don’t know whether it is large enough to justify keeping the bindings enabled by default. Consider this: the Bazel build system has many users, but it is in the peripheral tier and contributors have no obligation to fix it. That said, many contributors help maintain it, so its usability is actually quite good.

LLVM’s OCaml bindings are useful for a lot of compiler practitioners (both in academia and in industry), enabling them to quickly write end-to-end compilers in OCaml. I say this after having benefited from them myself. It’s also almost certain, as @jberdine points out, that if they aren’t on by default, they will just die out.

With @alan offering to be the maintainer for the OCaml bindings, my suggestion would be to see how that goes, with us helping as much as we can. If that doesn’t work out, then we’ll have no option but to disable the bindings by default.

It isn’t clear to me why having them on by default makes a difference: if you don’t have OCaml set up correctly on your system, they won’t be tested / exercised either.

IMO the way to get something maintained in LLVM is:

  1. having a bot that tests the feature.
  2. having a community that can react and help when the bot fails and a contributor needs to fix their patch.

As the process of building LLVM with the OCaml bindings activated is fresh in my memory, I can document how to build them and modify the bindings for people unfamiliar with OCaml. I want to get my patch merged first, however, because OCaml 5 changes how the bindings must be written and the code on main does not support OCaml 5 (my patch fixes it).

For full disclosure, my concerns about becoming a code owner are:

  • I am unfamiliar with Arcanist/Phabricator and sometimes make mistakes.
  • I am not familiar with all the LLVM functions in the C API that OCaml binds. I just go by type signatures and Doxygen docs to write the glue code.
  • I am currently recovering from a major surgery, and my physical health is not currently in a state where taking on a responsibility would be prudent. In three to six months, I should have recovered enough to resume normal activities, if there are no complications.

My 5 cents as a daily OCaml user: keep the bindings in-tree. The peripheral tier looks reasonable to me. Is there a list of the current peripheral components? I agree with @vaivaswatha that there are still users and ongoing research built on LLVM, as far as I know.

I am glad that @alan has offered to maintain the binding code, especially given their current health. I can also offer some help until their full recovery. I have some experience playing around with the OCaml bindings for LLVM and for other libraries (z3), but at the moment I am not clear about the breakages you mentioned. I need to look into them.

OCaml used to be stable with respect to its C API and FFI. OCaml 5 makes unavoidable breaking changes to support algebraic effects and multicore. I also wonder whether there is a better practice for writing and maintaining the bindings; let me look into that.

Off topic: I think the breakage is not limited to the LLVM bindings and may affect other projects too, so OCaml people are likely to gain experience in fixing it.

Alternative build systems (ex. GN, Bazel) and related infrastructure.

The only supported build system for LLVM is CMake. There are build files for two other build systems in-tree. There are build bots that test GN and Bazel.

If the OCaml bindings are demoted to the peripheral tier, they will be off by default. No build bot will test them and they will rot over time. If you want to keep the OCaml bindings in-tree, you have to provide a build bot that tests them.