MLIR for clang

Starting in May-June, we at “Compiler Tree” will begin porting the clang compiler to use MLIR as a middle-end target. If someone has already started a similar effort, we would love to collaborate with them. If someone would like to work with us, we are ready to form a group and collaborate. If there are sharing opportunities from the Fortran side, we would like to consider those as well.

We are in the early phase of design for the “C” part of the work. Based on our experience with the (FC+MLIR) compiler, we estimate that we will have an early cut of the compiler working on non-trivial workloads within a quarter of starting the work.

Please ping me for any queries or concerns.

Regards,
-Prashanth

+cfe-dev

Hi Prashanth,

  Starting in May-June, we at "Compiler Tree" will begin porting the clang compiler to use MLIR as a middle-end target. If someone has already started a similar effort, we would love to collaborate with them. If someone would like to work with us, we are ready to form a group and collaborate. If there are sharing opportunities from the Fortran side, we would like to consider those as well.

That's a rather vague statement, considering the flexibility of MLIR.
Could you explain your plans in more detail, and what specifically you
hope to achieve with them?

Cheers,
Nicolai

Can you elaborate on what your approach is? Do you intend to fork clang
for MLIR at a specific version, keep up to date with master, and/or try
to upstream this?

Do you think MLIR has all the semantics required, such as for
representing exceptions?

Michael

Hi Michael-

  1. We intend to fork clang for MLIR at a particular release and develop there. We will merge with master as soon as we reach good milestones. Most of the development is expected to happen on GitHub or a similar version-control hosting platform.
  2. MLIR is extensible, and we are hoping that constructs like exceptions can be represented in it. As we dive deeper into the design, we should be able to answer this question in more detail.

thanks,
-Prashanth

Speaking 1st hand here.

Note: There are a number of internal lowering steps between clang and LLVM, so I’ll use general terms to describe them for simplicity.

When we (PathScale) made clang emit High WHIRL instead of “whatever,” it really wasn’t as bad as some people around here may think. I’d guess it took us only about two years to go from zero to production quality and self-hosting. This was multiple engineers working concurrently and dealing with a lot of legacy. I could easily see it taking less time if you don’t have to bring up advanced loop optimizations or care too much about EH.

I’m not sure how much use or value it would be to anyone, but I have and control all of that code. Briefly, we hook into clang directly after AST and then swap out the IR “codegen” for what we coined WhirlGen.

I must admit that I feel a bit smug that design choices I made 10 years ago are finally being taken seriously around here.

Currently LLVM uses a low-level IR for representing programs. Memory disambiguation does not happen accurately for constructs like multi-dimensional arrays. One of the ways we currently alleviate this in LLVM is by multiversioning the code. By supporting a mid-level IR like MLIR, we intend to keep the access indices of multidimensional arrays and do better disambiguation.

thanks,
-Prashanth

Fred Chow is a well known name in compiler community. He was the architect of Open64 compiler.
His comment on LLVM IR from open64 mailing list can be seen at : https://sourceforge.net/p/open64/mailman/message/23829398/
“From their name, LLVM roughly corresponds to Low WHIRL. I wonder how LLVM tackles the compilation problems Open64 has tackled. People with exposure to LLVM are welcome to chime in.”


When you actually start to solve and implement this, you’ll find that “LLVM IR” is actually Mid WHIRL on an almost 1:1 basis (super close). However, we don’t exactly do IR-to-IR translation and instead hook in at an API level. So it’s more a matter of matching constructs and trying to “get in where you fit in.” Since we had overlapping Mid WHIRL optimizations, we had to figure out which to turn on/off on each side.

We also skipped VH WHIRL for a number of reasons and just kinda cut it out. I don’t think Fred is subscribed to the list, but he isn’t the only smart person who worked on the compiler. There were a few managers, brilliant people, and unsung heroes who worked at SGI.

Dror Maydan and Sun Chan were more intimately involved with the actual implementation of loop optimizations iirc. Fred’s best known for his low level codegen work and overall vision of compiler architecture.

Hi, Prashanth,

I definitely recommend that we have a discussion first on design goals for this. You’ve mentioned modeling of multidimensional arrays, and I know you’ve also been thinking about OpenMP, and it would be good to lay out the desired end state.

Part of the reason I say this is because there are significant design decisions that I suspect will appear up front. Handling of multidimensional arrays is a good example. C/C++ certainly do have multidimensional arrays of static extent, but these are largely irrelevant for a significant fraction of production C++ use cases. This is because, in many cases, the array bounds are not known statically, or at least they’re not all known statically, and so programmers make use of C++ wrapper libraries which provide the interface of multidimensional arrays implemented on top of one-dimensional heap-allocated data. If we create an infrastructure that works well for static multidimensional arrays but does not contain any provision for recognizing appropriate loop nests and also treating them using the multidimensional-array optimization infrastructure, we won’t really improve the compiler in practice for many, if not most, relevant production users.

It’s also going to be important how we optimize loops that only look like loops after coroutines are analyzed and inlined. Regardless, there certainly are areas in which we could do a better job optimizing constructs (e.g., more devirtualization, optimization of exception handling and uses of RTTI), and it would be good to put everything out on the table so that decisions can be made based on use cases rather than being driven by the desire to use a particular tool.

Thanks again,

Hal

Hi, Prashanth,

I definitely recommend that we have a discussion first on design goals for this. You've mentioned modeling of multidimensional arrays, and I know you've also been thinking about OpenMP, and it would be good to lay out the desired end state.

Is the goal for this to be an out-of-tree proof of concept, or is the goal to eventually integrate this into LLVM and have Clang compile by emitting MLIR as an intermediate stage? The latter would be a huge project with a lot of uncertain trade-offs, but I think it would be very interesting; whereas I’m afraid the former is not something I can spare any time to think about.

John.

Once you've done the conversion, I think it becomes infeasible to merge
in LLVM master on a regular basis. For instance, you will have changed
every occurrence of llvm::Instruction to mlir::Operation and its API,
affecting about every line of code starting with clangCodeGen. Even if
you introduce a compatibility layer, upstream LLVM will continue to use
llvm::Instruction.

Another case is MLIR's use of basic block arguments instead of PHINode.
This is a change we would potentially want in LLVM as well, but it
would require updating every line that assumes PHI nodes at once, so it
has not been done yet.
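For readers unfamiliar with the difference, a small side-by-side sketch (syntax approximate; the `cf.br` spelling is the modern MLIR form, which postdates this thread):

```
; LLVM IR: the value merged at a join point is expressed with a PHI node.
merge:
  %x = phi i32 [ %a, %then ], [ %b, %else ]
  ret i32 %x

// MLIR: the join block declares an argument, and each predecessor
// passes the value explicitly when branching to it.
^then:
  cf.br ^merge(%a : i32)
^else:
  cf.br ^merge(%b : i32)
^merge(%x: i32):
  return %x : i32
```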

Michael

I don’t know how they are doing it, but if they hook in after the AST like we do and make a whole new SomethingGen/, it should be possible that a regular rebase will work. That internal API isn’t stable, but I don’t remember a ton of code churn on either side of the API that we connected with. Of course this is anecdotal, but it’s exactly the kind of feedback that may be helpful to others trying to achieve what I think they are aiming for.

[ Dropping llvm-dev, because this seems to be entirely a discussion about clang ]

   Starting in May-June, we at "Compiler Tree" will begin porting the clang compiler to use MLIR as a middle-end target. If someone has already started a similar effort, we would love to collaborate with them. If someone would like to work with us, we are ready to form a group and collaborate. If there are sharing opportunities from the Fortran side, we would like to consider those as well.

That's a rather vague statement, considering the flexibility of MLIR.
Could you explain your plans in more detail, and what specifically you
hope to achieve with them?

I agree. There are several possible goals here, and the approach that makes sense will vary. For example, if you just want to unify OpenMP support between clang and f18, this could be accomplished by mechanically translating the LLVM builder calls into builder calls for the LLVM IR embedding in MLIR and adding the other OpenMP-specific parts. There are a few more complex things that I can imagine being of value:

1. Embedding enough [Objective-]C[++] high-level semantic information in a new MLIR dialect that the static analyser can operate on this representation and it can then be lowered to LLVM IR.

2. Embedding the full C (or even C++) type system in MLIR such that other front ends can target a C ABI and reuse clang's MLIR -> LLVM IR lowering pass to incorporate ABI-specific details.

3. Deferring C++ and Objective-C dynamic dispatch lowering until later to retain more source-level type information for devirtualization or more precise CFI implementations.

These are the first things that pop into my head and each one imposes a different set of requirements on the MLIR dialect that you'll be lowering to (though it's also possible to imagine a superset that supports all of these use cases).

David

Hal-

Thanks for the critical issues to ponder. We will get back to you once we have more clarity on the task.

thanks,
-Prashanth

I think we are talking about two different things here. Your approach
is to create a new SomethingGen next to (or replacing) clangCodeGen to
emit a different IR.
I was assuming Compiler Tree wants to change the entire mid-end
pipeline to MLIR's LLVM dialect; but to begin with, only clangCodeGen
could be converted, with MLIR's LLVM-IR conversion used to pass the
result to the existing mid-end pipeline. One would not want to fork
clangCodeGen, since the conversion is mostly a mechanical change and
merge conflicts are still the easiest way to reflect upstream changes
directly.

Michael

Personally, I’m very interested in your second item; I just have too little bandwidth to look into this. But it is something that has been in the back of my mind ever since I started working on MLIR (actually even long before).

Thanks, David, for the good design pointers. We are thinking about the design issues and will look into them soon.

-Prashanth

Hi Prashanth,

I’m presenting a talk next Wednesday at CGO’20 about MLIR, and will be talking about some “how and why could clang and llvm use mlir” concepts. I already hope to cover:

  • Better separation of concerns in general
  • Making ABI lowering clang-independent
  • Sharing OpenMP lowering across frontends, enabling better OpenMP optimization (e.g., constant folding and hoisting across parallel loops is trivial)
  • Enabling high-level optimizations (e.g., inserting std::vector::reserve calls based on data-flow analysis)
  • Merging the clang CFG representation into the main flow

I’ll also mention some of the benefits of moving LLVM IR to MLIR - including things like multithreaded compilation, better location tracking, better modeling of invoke and other terminators, etc.

If anyone has any other specific things you’d like me to mention, please let me know! I’ll be happy to share the slides with this list after the talk. Thanks!

-Chris

Hi Prashanth,

I’d love to see this.

In terms of staging this in over time, have you considered starting by tackling the Clang “CFG” representation first? It is used for source-level analysis (-Wunreachable, the clang static analyzer) and would be much better as a “CIL” implemented in MLIR. From there, you could port Clang’s CodeGen/IRGen to be based on that IR instead of the AST, and then factor other parts of IRGen out into their own independent MLIR lowering phases (e.g., ABI lowering).

The advantage of starting with the CFG representation is that the bar for getting it accepted into the tree is lower (the Clang CFG isn’t complete), and swapping out one IR for another should not create a compile-time regression, whereas adding another phase could.

-Chris