CLR or C++/CLI interface to IR building API

Hi

Our front end is written in a CLR language, and we're currently
interacting with the middle/back-end by writing out .ll files. This
was convenient to get started with, but they're getting to a "huge and
unwieldy" stage now.

I was wondering if anyone's attempted writing proxy/wrapper C++/CLI
classes so that the IR API can be used directly from managed
languages. Any tips/pointers/code/stories of horrible failure?

thanks,
scott

Take a look at this page. It might give you more information:

  http://vmkit.llvm.org/

-bw

Hi, thanks for the pointer.

From what I can tell, while vmkit targets VMs, the front-end is
written in C++, not in a managed language. Am I missing something? (or
was the "might" as vaguely hopeful as it sounded? :-)

scott

Our front end is written in a CLR language, and we're currently interacting with the middle/back-end by writing out .ll files. This was convenient to get started with, but they're getting to a "huge and unwieldy" stage now.

Yup. This is in the FAQ now: see the Frequently Asked Questions (FAQ) page in the LLVM documentation.

I was wondering if anyone's attempted writing proxy/wrapper C++/CLI classes so that the IR API can be used directly from managed languages.

LLVM has C bindings which you should be able to P/Invoke straightforwardly. A rational managed API could be built atop these. Visit include/llvm-c in the source tree. These were specifically designed for use via FFIs like P/Invoke.

Several bindings have been built atop the C bindings (OCaml, Haskell, D, and Python that I know of), but only the OCaml ones are on trunk. We would welcome additional bindings into mainline if you are inclined to contribute.

The C bindings are not 100% complete, but your use case (building IR) has the best coverage.

Any tips/pointers/code/stories of horrible failure?

I would advise against the Managed C++ route. In my experience, the pointy edges of the unmanaged environment (no GC, no memory safety) compound the pointy edges of the managed environment (finalization, asynchronous exceptions)—the end result is the worst of both worlds.

I wouldn't really advise trying to P/Invoke LLVM's C++ APIs directly, either.

There is a recurring 'first/next/prev/last' pattern in the bindings. It allows efficiently implementing values similar to Module::iterator, Function::iterator, et al. The OCaml bindings include a functional interpretation of this pattern. Each language seems to have a different idiom for handling iteration.
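
As a rough illustration of wrapping the first/next/prev/last pattern in a host language's own iteration idiom, here is a minimal Python sketch. It does not call LLVM: ToyModule, ToyFunction and the get_*_function helpers are hypothetical stand-ins for accessor pairs such as LLVMGetFirstFunction/LLVMGetNextFunction in llvm-c, which are assumed to return a null handle at the end of the list. The same shape should carry over to an IEnumerable wrapper in a CLR language.

```python
from typing import Callable, Iterator, Optional, TypeVar

C = TypeVar("C")  # container handle (e.g. a module)
H = TypeVar("H")  # element handle (e.g. a function)


def iterate(container: C,
            first: Callable[[C], Optional[H]],
            step: Callable[[H], Optional[H]]) -> Iterator[H]:
    """Yield handles starting at first(container), following step until None."""
    handle = first(container)
    while handle is not None:
        yield handle
        handle = step(handle)


# --- Toy stand-ins for the C API (hypothetical, for illustration only) ---

class ToyFunction:
    def __init__(self, name: str) -> None:
        self.name = name
        self.prev: Optional["ToyFunction"] = None
        self.next: Optional["ToyFunction"] = None


class ToyModule:
    """Holds a doubly linked list of functions, like an intrusive list."""

    def __init__(self, names) -> None:
        self.first: Optional[ToyFunction] = None
        self.last: Optional[ToyFunction] = None
        for name in names:
            fn = ToyFunction(name)
            if self.last is None:
                self.first = fn
            else:
                self.last.next = fn
                fn.prev = self.last
            self.last = fn


def get_first_function(module: ToyModule) -> Optional[ToyFunction]:
    return module.first


def get_last_function(module: ToyModule) -> Optional[ToyFunction]:
    return module.last


def get_next_function(fn: ToyFunction) -> Optional[ToyFunction]:
    return fn.next


def get_previous_function(fn: ToyFunction) -> Optional[ToyFunction]:
    return fn.prev


# --- Idiomatic wrappers built on the four accessors ---

def functions(module: ToyModule) -> Iterator[ToyFunction]:
    """Forward iteration: first + next."""
    return iterate(module, get_first_function, get_next_function)


def functions_reversed(module: ToyModule) -> Iterator[ToyFunction]:
    """Backward iteration: last + prev."""
    return iterate(module, get_last_function, get_previous_function)
```

With this in place, a caller can write `for fn in functions(module): ...` and never touch the four raw accessors. Only `iterate` is generic, so the same helper could be reused for basic blocks, instructions, and globals by passing in the matching accessor pair.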

— Gordon

LLVM has C bindings which you should be able to P/Invoke
straightforwardly. A rational managed API could be built atop these.
Visit include/llvm-c in the source tree. These were specifically
designed for use via FFIs like P/Invoke.

Thanks Gordon, I'll see what I can do with llvm-c.

If we decide to go this way (it'll be a relatively large change from
generating text, I expect), I'll try to make sure things are kept
separate so the bindings can be contributed back.

I wouldn't really advise trying to P/Invoke LLVM's C++ APIs directly,
either.

Good to know, I'd started writing proxy "ref class" classes, but it
was starting to get a bit icky with multiple inheritance and abstract
bases. I think the straight C version makes more sense.

thanks,
scott