[RFC] Create llvm/lib/Frontend

I was hoping to introduce a new top level library in llvm/lib/Frontend
for code that is (mainly) used by LLVM frontends but not by one
exclusively. At first, I would place the OpenMP-IR-Builder [1] (and related
code [0]) there. This Builder translates "OpenMP directives" to LLVM-IR
and is supposed to be reused in Flang.

First, I tried to place the OpenMP-IR-Builder into llvm/IR, right next
to the llvm::IRBuilder, but it would soon introduce a dependence on
other libraries (first TransformUtils) [2].

There are more things (especially parts of Clang) that could arguably be
shared across frontends and therefore be moved into such a dedicated
location. If it turns out this is a controversial RFC, we will provide
examples and reasons.

I hope this is fairly straightforward as this does not introduce any
drawbacks (I know of).

Please let me know if you have an opinion on this.

Thanks,
  Johannes

[0] https://reviews.llvm.org/D69853
[1] https://reviews.llvm.org/D69785
[2] https://reviews.llvm.org/D70109 ([OpenMP][IR-Builder] Introduce "pragma omp parallel" code generation)

"Doerfert, Johannes via llvm-dev" <llvm-dev@lists.llvm.org> writes:

> I was hoping to introduce a new top level library in llvm/lib/Frontend
> for code that is (mainly) used by LLVM frontends but not by one
> exclusively.

+1

                     -David

I think the idea of a more expanded frontend support library makes sense. The main use case that I’ve heard for such a library is to help frontends generate LLVM IR that interfaces with the local native C ABI.

However, I wonder if OpenMP should be its own (sub?) library, since I can’t imagine that Swift, Rust, or Julia will need this OpenMP logic. It seems specific to C and Fortran. I was thinking something like llvm/lib/Frontend/OpenMP.

> I think the idea of a more expanded frontend support library makes sense. The main use case that I’ve heard for such a library is to help frontends generate LLVM IR that interfaces with the local native C ABI.

I agree it would be good to have a library for shared frontend code, and the C ABI library is what I jumped to immediately too. I’m fine with it being part of LLVM proper, but I do wonder if it would be better as a separate top level project. Now that we have the monorepo it’s significantly less of a barrier for Clang to grow a dependency on another LLVM subproject. In fact I can’t really think of any reason it’s not free, pretty sure it requires zero extra steps while setting up a build.

- Michael Spencer

> I think the idea of a more expanded frontend support library makes sense.
> The main use case that I've heard for such a library is to help frontends
> generate LLVM IR that interfaces with the local native C ABI.
>

> I agree it would be good to have a library for shared frontend code, and
> the C ABI library is what I jumped to immediately too. I'm fine with it
> being part of LLVM proper, but I do wonder if it would be better as a
> separate top level project. Now that we have the monorepo it's
> significantly less of a barrier for Clang to grow a dependency on another
> LLVM subproject. In fact I can't really think of any reason it's not free,
> pretty sure it requires zero extra steps while setting up a build.

Would it be OK to start with a library in LLVM core and determine then
if we need a project? I ask because of time reasons not because I oppose
the idea in any way. New subprojects have way more implications and
requirements than a new library in core would have.

> However, I wonder if OpenMP should be its own (sub?) library, since I
> can't imagine that Swift, Rust, or Julia will need this OpenMP logic. It
> seems specific to C and Fortran. I was thinking something like
> llvm/lib/Frontend/OpenMP.

I could easily do that.

(FWIW, I like the subproject idea even though I was actually hoping that
non-OpenMP languages could eventually piggyback on the OpenMP runtime
etc. at some point, but that is nothing I actively work on right now.)

There is no reasonable way to express this at the LLVM level.
The C ABI relies on C type-system concepts, and it doesn’t make any
sense to slowly clone the entire C type system into some LLVM-level
library when that’s already the purpose of the Clang AST. So this
should really be a Clang-level library.

Now, that Clang-level library could very reasonably be built on top
of an LLVM-level library. I think it would be really nice to have an
LLVM-level library that provides basic abstractions analogous to
what Clang has as CodeGenFunction/CodeGenModule and Swift has as
IRGenFunction/IRGenModule. At the LLVM level, that function-builder
abstraction would basically be an IRBuilder plus some basic support
for control flow, which IRBuilder has never done well. The Clang-level
abstractions would then have functions for things like:

  • lowering a Clang type into an llvm::Type*
  • emitting a function prototype as an llvm::Function*
  • emitting calls given an llvm::FunctionBuilder
  • accessing a field of a Clang struct type
  • etc.

I’ve thought about trying to build some of these abstractions on
top of IRBuilder, but it really isn’t good enough because you need some
more holistic things when you’re building a function from scratch.
For example, it’s really useful to maintain a separate insertion point
in the entry block so that allocas show up in the same order you created
them instead of reverse order. And it’s good to have well-thought-out
conventions for how you deal with reaching the end of a block and so on.
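
To illustrate the entry-block point above, here is a minimal toy sketch.
It does not use the LLVM API: ToyFunctionBuilder, createAlloca, createInst,
and buildExample are hypothetical names, and instructions are modeled as
plain strings. It only shows how keeping a second cursor for allocas makes
them appear in creation order, ahead of the other instructions in the block.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a function under construction. NOT the LLVM API:
// the entry block is just a list of instruction strings.
class ToyFunctionBuilder {
public:
  // Place a stack slot at the alloca cursor, then advance the cursor,
  // so slots end up in the order they were created.
  void createAlloca(const std::string &Name) {
    Entry.insert(Entry.begin() + static_cast<std::ptrdiff_t>(AllocaCursor),
                 "%" + Name + " = alloca");
    ++AllocaCursor;
  }

  // Append an ordinary instruction at the current end of the block.
  void createInst(const std::string &Text) { Entry.push_back(Text); }

  const std::vector<std::string> &entryBlock() const { return Entry; }

private:
  std::vector<std::string> Entry;
  std::size_t AllocaCursor = 0; // index just past the last alloca
};

// Interleave allocas and stores, as a frontend would while walking a body.
std::vector<std::string> buildExample() {
  ToyFunctionBuilder B;
  B.createAlloca("a");
  B.createInst("store 1, %a");
  B.createAlloca("b");
  B.createInst("store 2, %b");
  return B.entryBlock();
}
```

In this sketch, buildExample() yields both allocas first, in creation
order, followed by the two stores. Always inserting at the very start of
the block instead would give the reverse alloca order mentioned above.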

Anyway, if we had that Clang-level library, it would be fairly
straightforward for someone interoperating with the C ABI to just
construct the appropriate Clang types and call those APIs.

John.

The main motivation is to have ABI-emitting code (in particular for
OpenMP) shared between clang and flang. This structure would require
flang to have a dependency on clang, but conceptually they are
independent frontends.

De facto, every platform has its ABI defined in terms of C; therefore,
this would effectively require every frontend to depend on clang. I'd
think of the platform ABI to be independent of the parsing
implementation.

Michael

Fortunately, nothing about this structure would require you to link
the Clang parser, just the AST.

John.

> I was hoping to introduce a new top level library in llvm/lib/Frontend
> for code that is (mainly) used by LLVM frontends but not by one
> exclusively. At first, I would place the OpenMP-IR-Builder [1] (and related
> code [0]) there. This Builder translates "OpenMP directives" to LLVM-IR
> and is supposed to be reused in Flang.
>
> First, I tried to place the OpenMP-IR-Builder into llvm/IR, right next
> to the llvm::IRBuilder, but it would soon introduce a dependence on
> other libraries (first TransformUtils) [2].

Hi Johannes,

I think it is a great idea to share the OpenMP lowering code, but I’m concerned about the name ‘lib/Frontend’. This is a very broad name and there are lots of things that “could be useful for frontends” - and a lot of definitions of what a “frontend” is.

WDYT about naming it something like “lib/OpenMPGen” and generalizing/renaming it later when the scope is more clear?

-Chris

While it isn’t immediately useful given how Clang works today, I feel obligated to point out that MLIR provides a new and simple way to handle these sorts of problems:

MLIR allows us to define an MLIR dialect that reflects these C-level abstractions in terms of the C level type system. The payoff of this is that the Clang ABI lowering logic could be moved to a reusable library that is separate from the rest of the concerns of Clang’s “CodeGen” module.

MLIR allows you to directly use the existing AST clang::Type if you’d like to, but also allows defining standalone types and allows using hybrids as well. This could be particularly useful for C type lowering since the C type system is so much simpler than the C++ type system, and not every client wants to carry around the complexity of C++: this design allows factoring that out to a separate module that one can opt into.

In any case, while such a design is possible, implementing such a thing (and refactoring Clang) would be a tremendous amount of work. I just mention this because many folks on this thread may not be aware of the fact that we have a much larger design space to work with to solve lowering problems and the software engineering problems like code reuse than we once did.

As these things are adopted more across the ecosystem, I think it could bring a lot of increased modularity benefits to LLVM!

-Chris

>
> I was hoping to introduce a new top level library in llvm/lib/Frontend
> for code that is (mainly) used by LLVM frontends but not by one
> exclusively. At first, I would place the OpenMP-IR-Builder [1] (and related
> code [0]) there. This Builder translates "OpenMP directives" to LLVM-IR
> and is supposed to be reused in Flang.
>
> First, I tried to place the OpenMP-IR-Builder into llvm/IR, right next
> to the llvm::IRBuilder, but it would soon introduce a dependence on
> other libraries (first TransformUtils) [2].

> I think it is a great idea to share the OpenMP lowering code, but I’m
> concerned about the name ‘lib/Frontend’. This is a very broad name
> and there are lots of things that “could be useful for frontends” -
> and a lot of definitions of what a “frontend” is.

Fair point. I'm open to suggestions wrt. the name. FWIW, the name was
suggested because it was not only supposed to be OpenMP stuff (soon).

> WDYT about naming it something like “lib/OpenMPGen” and
> generalizing/renaming it later when the scope is more clear?

I can do that. It was also proposed to do "lib/Frontend/OpenMP" or
something similar. Would that help or would you prefer not to have the
top level "Frontend" until we actually move other things?

>>> First, I tried to place the OpenMP-IR-Builder into llvm/IR, right next
>>> to the llvm::IRBuilder, but it would soon introduce a dependence on
>>> other libraries (first TransformUtils) [2].
>>
>> I think it is a great idea to share the OpenMP lowering code, but I’m
>> concerned about the name ‘lib/Frontend’. This is a very broad name
>> and there are lots of things that “could be useful for frontends” -
>> and a lot of definitions of what a “frontend” is.
>
> Fair point. I’m open to suggestions wrt. the name. FWIW, the name was
> suggested because it was not only supposed to be OpenMP stuff (soon).

What are the other things? Naming is often helped by having examples from the broader pool that will shape this. It will also be helpful to understand what would be “in” vs “out” of scope for this over time.

>> WDYT about naming it something like “lib/OpenMPGen” and
>> generalizing/renaming it later when the scope is more clear?
>
> I can do that. It was also proposed to do “lib/Frontend/OpenMP” or
> something similar. Would that help or would you prefer not to have the
> top level “Frontend” until we actually move other things?

Sure,

-Chris

Sorry for my delay.

>
>>>
>>> First, I tried to place the OpenMP-IR-Builder into llvm/IR, right next
>>> to the llvm::IRBuilder, but it would soon introduce a dependence on
>>> other libraries (first TransformUtils) [2].
>>
>> I think it is a great idea to share the OpenMP lowering code, but I’m
>> concerned about the name ‘lib/Frontend’. This is a very broad name
>> and there are lots of things that “could be useful for frontends” -
>> and a lot of definitions of what a “frontend” is.
>
> Fair point. I'm open to suggestions wrt. the name. FWIW, the name was
> suggested because it was not only supposed to be OpenMP stuff (soon).

> What are the other things? Naming is often helped by having examples
> from the broader pool that will shape this. It will also be helpful
> to understand what would be “in” vs “out” of scope for this over time.

One is toolchain handling, which should be available to the linker
plugin so we can do fat binary linking without going through the clang
driver.

There is a desire to share driver code in here as well, Peter and Mark
can probably elaborate on that.

I predict we will find more use cases over time.

>> WDYT about naming it something like “lib/OpenMPGen” and
>> generalizing/renaming it later when the scope is more clear?
>
> I can do that. It was also proposed to do "lib/Frontend/OpenMP" or
> something similar. Would that help or would you prefer not to have the
> top level "Frontend" until we actually move other things?

> Sure,

So `lib/Frontend/OpenMP` for the OpenMP specific part seems to be a
popular option. I'd like to go ahead with this soon, please let me know
if there is disagreement. FWIW, it should be easy to rename/move the
code later anyway.

>> What are the other things? Naming is often helped by having examples
>> from the broader pool that will shape this. It will also be helpful
>> to understand what would be “in” vs “out” of scope for this over time.
>
> One is toolchain handling, which should be available to the linker
> plugin so we can do fat binary linking without going through the clang
> driver.
>
> There is a desire to share driver code in here as well, Peter and Mark
> can probably elaborate on that.
>
> I predict we will find more use cases over time.

Got it - the thing is, each of these use cases will have different dependencies and other implications. I think that splitting them into separate subdirectories under lib/Frontend/ is a good way to go.

>>>> WDYT about naming it something like “lib/OpenMPGen” and
>>>> generalizing/renaming it later when the scope is more clear?
>>>
>>> I can do that. It was also proposed to do "lib/Frontend/OpenMP" or
>>> something similar. Would that help or would you prefer not to have the
>>> top level "Frontend" until we actually move other things?
>>
>> Sure,
>
> So `lib/Frontend/OpenMP` for the OpenMP specific part seems to be a
> popular option. I'd like to go ahead with this soon, please let me know
> if there is disagreement. FWIW, it should be easy to rename/move the
> code later anyway.

SGTM, thanks!

-Chris