Some experiences using LLVM C Backend


I'm interested in LLVM as a way to support C++ programming for legacy
MCUs (8051, PIC1x, etc.). Recently, I tried to use the C Backend as a
means to achieve this. As the C Backend was removed in recent LLVM
versions, I started with LLVM 3.0, which was the last version to
include it.

For starters, I played with the MSP430 target, which is supported by
LLVM and so allows roundtrip experiments (comparing the results of
C++ -> MSP430 assembly vs C++ -> C -> MSP430 assembly compilation). One
of the first things I saw in the C Backend output was main() being
declared as "unsigned", which then caused an error when fed to Clang
(main() should return int). The next issue was the handling of inline
asm. It looked like the corresponding code in the C Backend wasn't
tested and had a few thinkos. Don't get me wrong - I'm glad it was
written, and it was easy to fix. There were a few other small issues,
like missing includes (stdint.h).

With that, I was able to achieve a perfect roundtrip with a blink
example that is trivial in functionality but still uses a few layers of
C++ magic: templates and inline functions (this one specifically:

, the repository also has Makefiles for LLVM).

My patches to LLVM 3.0 are available in the commit history of the
pfalcon/llvm repository on GitHub.

My next step was to try to integrate them into the "cbe_revival"
patchset started by Roel Jordans and available as . I found that this
branch doesn't build out of the box (one header changed its location),
and then I was greeted by an "Inline assambler not supported" assertion
(note the typo in the word "assembler"). So, please consider
reinstating inline asm support, because otherwise, at least for the
use case discussed, it's more productive to use LLVM 3.0.

Besides blink.cpp, I have so far quickly looked into a few other (still
pretty simple) examples. Some of them achieve a perfect roundtrip. Some
differ in arithmetic sequences; that definitely has something to do
with C char -> int promotion, though at this time I cannot say whether
the C Backend code was equivalent. Some differ in basic block ordering
(i.e. a different BB order when flattening the CFG into an instruction
stream). Some actually differ in the CFG (one issue I spotted involves
tail duplication of inline asm statements - I hope to post a patch
soon).
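
To make the char -> int promotion point concrete, here is a minimal C
sketch. It is not taken from the C Backend output; the function names
are made up for illustration. It only shows how the same 8-bit
arithmetic gives different results depending on whether the value is
narrowed back to 8 bits after C's integer promotion.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; not C Backend output. */

/* 8-bit add: a + b is computed in int (integer promotion),
 * then explicitly narrowed back to 8 bits. */
uint8_t add_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Overflow test evaluated in int: for a = 200, b = 100 the sum
 * is 300, so the comparison can be true. */
int carry_promoted(uint8_t a, uint8_t b)
{
    return (a + b) > 255;
}

/* Same test after narrowing to 8 bits: the value is at most 255,
 * so the comparison is never true. */
int carry_truncated(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b) > 255;
}
```

A missing or extra narrowing cast in generated C changes the width at
which the arithmetic is done, which is one plausible source of
differing instruction sequences after a roundtrip.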

I hope to do more detailed and formal roundtrip comparisons on a
larger code corpus and report the results later (if someone can suggest a
utility to perform fuzzy graph isomorphism checks, that would be