libclang or libtooling for transpiler

Hello. First time here, no tomatoes please.

Long story short, I want to attempt to make a cpp to rust transpiler. I know it’s a very big task with a lot of work, but let’s ignore that for a moment.

Like any sane person who doesn’t want to parse cpp himself (or any text format for that matter), I’m counting on clang to be able to give me enough information (the whole AST?) so that I’d be able to translate it.

From what I can see my options are the following:

  1. libtooling, which from what I understand can do "everything", but is somewhat unstable;

  2. libclang, which is a C interface to the AST. So basically a stable C API to libtooling.

Under normal circumstances I would just go with libtooling to be safe, but I was really hoping to write the transpiler in Rust. Using libtooling would be problematic in that case, because its API is C++ and I would have to write Rust bindings for everything.

So, from what I can see, my options here are:

  1. libtooling + cpp, not ideal because I was hoping I would be able to do it in rust;

  2. libclang + rust, would be great but I don’t know if it’ll be enough;

  3. libclang + rust + extensions, so basically I’d extend the libclang API myself (somehow?) where I would need it;

  4. libtooling + rust, hopefully not.
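Options 3 and 4 both come down to writing a small C++ shim that exposes `extern "C"` functions over an opaque handle, which Rust can then declare and call. A minimal sketch of that pattern follows. `FunctionInfo` is a hypothetical stand-in for a C++-only AST class (a real shim would wrap an actual Clang class instead), and all the `shim_*` names are made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in for a C++-only AST class; not a real Clang type.
class FunctionInfo {
public:
    FunctionInfo(std::string name, std::size_t num_params)
        : name_(std::move(name)), num_params_(num_params) {}
    const std::string& name() const { return name_; }
    std::size_t num_params() const { return num_params_; }

private:
    std::string name_;
    std::size_t num_params_;
};

// C-ABI surface: only opaque pointers and plain C types cross the boundary.
extern "C" {

struct ShimFunction;  // opaque to the caller, never defined

ShimFunction* shim_function_create(const char* name, std::size_t num_params) {
    return reinterpret_cast<ShimFunction*>(new FunctionInfo(name, num_params));
}

// The returned pointer is owned by the handle and stays valid
// until shim_function_destroy is called on that handle.
const char* shim_function_name(const ShimFunction* f) {
    return reinterpret_cast<const FunctionInfo*>(f)->name().c_str();
}

std::size_t shim_function_num_params(const ShimFunction* f) {
    return reinterpret_cast<const FunctionInfo*>(f)->num_params();
}

void shim_function_destroy(ShimFunction* f) {
    delete reinterpret_cast<FunctionInfo*>(f);
}

}  // extern "C"
```

On the Rust side, these functions would be declared in an `extern "C"` block and wrapped in a type whose `Drop` implementation calls `shim_function_destroy`. The cost is that every C++ method you need gets one such wrapper, which is the "bindings for everything" problem mentioned above.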

So, here are the questions: Will libclang suffice for my needs? Which option would you consider the best?

Thanks,

Andrei.

Hi Andrei,

The C++ API for AST traversal is actually stable in practice (it does not promise a stable ABI, but you don’t need that if you implement a C++ library).

I would strongly advise going with the C++ APIs.

I've created a tool that does basically the same thing. It generates D bindings to C and Objective-C libraries. This tool is written in D and uses libclang [1]. It's been working for me so far, but I haven't got to the point of trying to translate C++ headers yet.

This tool is only focusing on generating bindings. That means only the header files are translated, not full C/Objective-C code with function bodies with statements and expressions.

The current libclang bindings DStep is using are generated with DStep itself. The initial bindings were created by hand. It was very easy to do, just some search and replace. From this point of view, libclang has the most forgiving header files I have ever seen.

[1] http://github.com/jacob-carlborg/dstep

Forgot the summary:

I chose libclang because:

* It's a stable API
* It's what's recommended when using a language other than C++
* I really, really did not want to use C++. After all, I'm using D, not C++ :)

I would say it depends on how much you don't want to use C++. If I were you, I would definitely give it a try in Rust.

We had the exact same question when we started work on C2Rust. We
ended up using the libtooling API with C++ to extract the AST and
serialize it into Rust. We found that the libclang interface was
insufficient to extract all information we needed for the transpiler.
If I recall correctly, bindgen is hitting up against similar issues
since they use libclang. Rather than try to extend the C API, we just
used the libtooling interface to get direct access to all the internal
Clang data structures. We've supported LLVM 6-8 under this approach
without significant compatibility issues or churn.

We originally chose to separate the AST exporter (c2rust-ast-exporter)
from the rest of the transpiler as a standalone tool, so it serializes
the AST into a CBOR format. I would recommend not using this approach,
and we're planning to remove it. In the future, the AST exporter will
interface more directly with Rust and build a Rust representation of
the AST and all necessary metadata, rather than serializing it back
and forth to a format like CBOR.
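
As a rough illustration of the "interface more directly" direction described above, one option is for the C++ exporter to flatten the tree into an array of plain C structs that Rust can read over FFI, with no intermediate byte format. This is only a sketch under assumptions, not how c2rust-ast-exporter works; `Node`, `FlatNode`, `flatten`, and the numeric `kind` values are all hypothetical.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical in-memory tree standing in for whatever the exporter walks.
struct Node {
    std::uint32_t kind;
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
};

// Plain-data record with a C-compatible layout, meant to mirror
// a #[repr(C)] struct declared on the Rust side.
extern "C" {
struct FlatNode {
    std::uint32_t kind;
    std::int32_t parent;  // index of the parent record, -1 for the root
    const char* name;     // borrowed from the source tree, NUL-terminated
};
}

// Pre-order walk: every node is appended after its parent,
// so a record's parent index is always smaller than its own index.
void flatten(const Node& node, std::int32_t parent, std::vector<FlatNode>& out) {
    const auto self = static_cast<std::int32_t>(out.size());
    out.push_back(FlatNode{node.kind, parent, node.name.c_str()});
    for (const auto& child : node.children) {
        flatten(*child, self, out);
    }
}
```

Rust could then view the vector's pointer and length as a slice of the matching struct. The source tree has to stay alive and unmodified while Rust reads the `name` pointers, since the records only borrow them.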

I'd be happy to discuss C(++) to Rust transpiling further if you'd
like, although much of that discussion is probably not relevant to
cfe-dev.

- stephen

> We had the exact same question when we started work on C2Rust. We
> ended up using the libtooling API with C++ to extract the AST and
> serialize it into Rust.

Oh, I can add that I don't think libtooling existed when I first started with DStep.

> We found that the libclang interface was
> insufficient to extract all information we needed for the transpiler.

It would be interesting to know more about this.

Apologies for the extremely late response. These last weeks (and the next few) are just very busy.

From what I can tell, libtooling + C++ is my best bet. So I compiled clang, and after a few tries I got a working example that prints all the function names in a C++ file. I think it’s a good start.

And I would really like to discuss transpiling with you, Stephen Crane. How may I contact you?

IIRC c2rust does something like this. Have you already looked at that project for inspiration?

http://llvm.org/devmtg/2018-10/talk-abstracts.html#lt5

https://github.com/immunant/c2rust