Flag for generating LLVM IR from forward declarations

Hi everyone,

I was playing around with the LLVM toolchain and it occurred to me that, should I decide to target LLVM IR, it would be neat to have Clang generate declarations from C header files for easier FFI. Someone seems to have had the same idea [1] [2]. This is kind of the opposite of what other projects have gone for, where they use libclang to parse header files to achieve the same result [3].

Now, forward declaration generation is doable by patching CodeGenModule.cpp (I kinda have a PoC working locally, except it generates all the forward declarations, instead of just the ones from the header files I care about).

Is this something that you guys would be interested in merging into Clang through a flag? Or what would the proper way to do this be?
I saw there’s a -femit-all-decls flag, but it doesn’t really emit all declarations.

Apologies if this is a dumb idea, kinda new to the whole Clang thing :)!

Cheers,

-yawnt

[1] http://stackoverflow.com/questions/24728901/clang-compiling-a-c-header-to-llvm-ir-bitcode
[2] http://stackoverflow.com/questions/14032496/how-can-i-code-generate-unused-declarations-with-clang?noredirect=1&lq=1
[3] https://github.com/tjfontaine/node-ffi-generate

Sounds like an interesting idea. I have one question though: if you're
generating only the C function declarations in LLVM IR, how will your
frontend that targets LLVM IR call these declarations? Maybe I'm
misunderstanding something, but I don't see how you can generate the calls
to these functions without getting clang involved.

Cheers,
Alex

True,

I was toying with the idea of having a separate step in the pipeline that plugs in after the AST is turned into a Module[*] and generates the frontend code.
But to do so, I would need the forward declarations to actually be emitted, hence my proposal. Otherwise I’d just be iterating over an empty set of instructions :)
My theory is that LLVM IR at that point is going to be way more straightforward to deal with (just basic types, structs and `declare @..`s).
Could this approach make sense?

Cheers,

-yawnt

[*] a Clang plugin? Can they be run over Modules, that is LLVM IR, instead of AST?

If you are dealing with LLVM IR, then it's probably an LLVM pass rather
than a Clang plugin. Or are you thinking of something that understands
both the source (for example in AST form) and LLVM IR?

I see, so are you thinking about basically looking at the LLVM IR with the
C function declarations and generating the calls based on the IR
declarations yourself instead of relying on clang? I suppose that could
work, but I'm not 100% sure that it will be correct in all cases. How will
your front-end deal with functions that take and return C aggregate types?

@mats Yeah, you’re right, an LLVM pass seems to be the correct approach here. However, Alex is bringing up a fair point.

@Alex

I was just checking by compiling the header files for libgit2 (in my clang fork) and you seem to be correct, there are some cases where information is lost. Take the following example:

-------------------------------------------------
struct xyz { int x; int y; int z; } xyz;

struct xyz fill(int x, int y, int z);

void pointer(struct xyz* d);
-------------------------------------------------

I do have the correct information for `pointer` (LLVM returns
`declare void @pointer(%struct.xyz*)`), but not for `fill` (LLVM returns
`declare { i64, i32 } @fill(i32, i32, i32)`). I suppose there's no way to get that
information back, as AST -> IR is a lossy transformation.
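
For anyone following along, here is a rough sketch of where the `{ i64, i32 }` comes from. It assumes the x86-64 System V ABI and a 4-byte `int`, so `struct xyz` is 12 bytes and gets handed back as one 8-byte chunk plus one 4-byte chunk. The `coerced` struct and the `pack_xyz`/`unpack_xyz` helpers are names I made up for illustration; they are not anything Clang exposes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct xyz { int x; int y; int z; };

/* Assumption: 4-byte int, hence a 12-byte struct with no padding. */
static_assert(sizeof(struct xyz) == 12, "sketch assumes a 12-byte struct xyz");

/* Hypothetical stand-in for the { i64, i32 } pair seen in the IR. */
struct coerced { uint64_t lo; uint32_t hi; };

/* Copy the struct's bytes into the two chunks:
 * the first 8 bytes (x and y) go into lo, the last 4 (z) into hi. */
static struct coerced pack_xyz(struct xyz v) {
    struct coerced c;
    memcpy(&c.lo, (const char *)&v, sizeof c.lo);
    memcpy(&c.hi, (const char *)&v + sizeof c.lo, sizeof c.hi);
    return c;
}

/* Reverse step: rebuild the struct from the two chunks. This only works
 * because we already know the original field layout, which is exactly
 * the information the bare `declare` no longer carries. */
static struct xyz unpack_xyz(struct coerced c) {
    struct xyz v;
    memcpy((char *)&v, &c.lo, sizeof c.lo);
    memcpy((char *)&v + sizeof c.lo, &c.hi, sizeof c.hi);
    return v;
}
```

Calling `unpack_xyz(pack_xyz(v))` gives back the original fields, but a frontend that only sees `{ i64, i32 }` cannot tell that it stands for three `int`s rather than, say, a `long` and an `int`.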

If I had some way to link the generated `declare` back to the function AST
node, that would obviously help in generating my frontend stub. But even
so, I would still need to follow the function's signature to get the
parameters' type definition, and so on. I'm beginning to wonder if it's
just more sane to use libclang, after all.

Yeah, the lowering to the IR by clang is lossy, and AFAIK there's no way to
go in the reverse direction. Maybe the debug information could allow you to
recover the layout? In any case, I think that taking this reverse approach
isn't worth it if you can use clang directly.

Btw, there has been discussion recently about making a separate library
that handles this transformation process:
http://lists.llvm.org/pipermail/llvm-dev/2016-October/106660.html.

See also: http://llvm.org/devmtg/2014-10/#talk18

Oh, that’s actually even better than what I was thinking (and would precisely result in my desired outcome)

In the talk someone asks if there are any plans to implement an example to showcase the technique using Kaleidoscope. Was that ever published, by any chance?

Thank you so much for the link!

I don’t know, but in the meantime Swift has been released as open source, so there might be something to learn from there.

CC Jordan and John (this is about your talk “Skip the FFI” http://llvm.org/devmtg/2014-10/#talk18 )

Swift is using information from Clang’s IRGen library to lower to the C ABI, but that process is still more manual than I’d like; it’d be nice if there were a library capable of doing all the work to translate the arguments and generate the call.

John.

FYI, I believe we discussed having a GSOC to start this here: http://lists.llvm.org/pipermail/llvm-dev/2016-October/106579.html

Interesting that you suggest this, as I’m actually a student looking for an MSc thesis :P

I’ll read up on that discussion!