re-[ANN] LLVM C Backend: still able to generate C code from C

In the decades since CBackend’s well-deserved removal, the code has never quite died, nor has it ever really gotten much better. Until now!

With the benefit of AI (Claude Sonnet 4.5) to do this unnecessary work of fixing one of the biggest long-standing issues, I’ve finally implemented the clang front-end adaptor target machine, or rather its two most common triples: llvm32-vendor-os-env and llvm64-vendor-os-env (little-endian 32-bit and 64-bit, respectively, for any OS). These are needed for C code to round-trip correctly through this backend. Most importantly, the TargetMachine now instructs clang to preserve the signedness and struct coherency of arguments, so that CBackend can reverse that and a later C compiler can get the ABI correct. CBackend could already handle these when present, but clang only chooses to emit them when they are relevant for the selected backend. Thus, the need for a CBackendFrontend was born.
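To make the signedness point concrete, here is a minimal sketch (the function names are made up for illustration and are not part of CBackend). Both parameters lower to the same `i8` type in LLVM IR, so, as far as I understand it, the only thing that tells them apart at the IR level is a parameter attribute such as `signext` or `zeroext`. If the front-end drops those, the regenerated C has no way to know which widening the caller expected.

```c
#include <stdint.h>

/* Hypothetical helpers: each receives an 8-bit value and widens it to int.
   The bits passed in can be identical; only the declared signedness differs. */

/* Sign-extends: the bit pattern 0xFF becomes -1. */
int widen_signed(int8_t x) { return x; }

/* Zero-extends: the bit pattern 0xFF becomes 255. */
int widen_unsigned(uint8_t x) { return x; }
```

Calling `widen_signed((int8_t)-1)` and `widen_unsigned((uint8_t)0xFF)` passes the same byte to both functions but yields different results, which is exactly the information that has to survive the trip through IR and back.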

I don’t actually know if or how I’ll be able to merge that, since clang doesn’t have the ability to support out-of-tree targets directly. Currently that support lives either in a patch file in the PR or in a branch of my local fork of LLVM. In 2012, there was apparently willingness to bring back CBackend as an in-tree target, but in 2026, I don’t actually want to inflict that backend upon the other maintainers. There could also be licensing issues with doing so (the code is licensed under the original LLVM license, but was yanked from the LLVM tree before the relicensing happened). If anyone has better ideas for how to support this out-of-tree triple in clang and wants to make a PR, let me know.

Also, this pass currently still doesn’t handle most struct layout differences, other than the basics of 32-bit vs 64-bit pointer sizes, because the data layout used is not customizable. If anyone has ideas for how to give users the option to encode the entire data layout in their triple (or in another argument to clang), that would help fix many more ABI issues of struct layout and endianness.
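As a small illustration of why the pointer size alone already changes struct layout, consider this sketch (the `struct record` type and the helper names are hypothetical, chosen only for the example). The offset of the pointer member and the total size both depend on the pointer width and alignment that the front-end assumed, so they need to match what the final C compiler uses.

```c
#include <stddef.h>

/* Hypothetical record type, used only to illustrate layout differences. */
struct record {
    char  tag;      /* 1 byte at offset 0 */
    void *payload;  /* placed at the next pointer-aligned offset */
};

/* Offset of the pointer member within the struct. */
size_t payload_offset(void) { return offsetof(struct record, payload); }

/* Total size of the struct, including any padding. */
size_t record_size(void) { return sizeof(struct record); }
```

On a typical 32-bit target these would usually come out as 4 and 8, and on a typical 64-bit target as 8 and 16. Other differences, such as the alignment of 64-bit integers or byte order, are not captured by the pointer size and would need the fuller data layout mentioned above.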

Currently, to use this functionality you do something like this:

$ clang -fplugin ./build/CBackendPlugin.so \
        -target llvm64-linux-gnu \
        -S -emit-llvm input.c -o output.ll
$ llvm-cbe output.ll -o output.cbe.c
$ arch-vendor-os-triple-gcc output.cbe.c -o exe_output -lexisting_c_abi

This PR currently applies only to clang/clang++, but if anyone wants to make a PR against my PR that adds the corresponding target machine definition for other LLVM front-ends (e.g. flang, Rust), I’ll be happy to include that too.
