[RFC] Upstreaming ClangIR

Hi Bruno,

Yes, there will be two different paths from the AST to LLVMIR for quite some time. As mentioned in the RFC, there are no expectations for any maintenance or support from the community until ClangIR is proven to be worth it. That’s not a component of this proposal.

Then the question may be: what is the plan once ClangIR is proven? I ask just out of curiosity; these are not blocking questions.

For instance, take CIRGenFunction::buildCXXConstructExpr in clang/lib/CIR/CodeGen/CIRGenExprCXX.cpp (llvm/clangir at commit 4e069c6269dd51606a58773b0fb90089c90cc645).

The equivalent one in CodeGen is CodeGenFunction::EmitCXXConstructExpr in clang/lib/CodeGen/CGExprCXX.cpp (llvm/clangir at commit 4e069c6269dd51606a58773b0fb90089c90cc645).

Yeah, they are really similar. But in my mind, near-duplicate code like this is the enemy of good software engineering.

Great question. At some point we’d like to be able to serialize the AST to improve round trip testing with things that require the AST. So far we haven’t done any work in this direction though. My rough plan here would be to serialize the whole TU in a PCH-like approach, reusing the existing clang infra to do this job. I currently don’t think we’d need any extra work (besides plumbing the pieces and perhaps fixing bugs) to make it happen.

I feel the current framework for serializing the AST may not be a good fit to reuse for ClangIR (or at least I don’t see how it could be reused). In my mind, serialization for ClangIR should look more like the serialization of LLVM IR. Or, if MLIR has its own serialization framework, can we reuse that?

CIR is lower level than the AST and although it keeps the references around, it’s probably not a good fit for retaining the level of information needed for Modules.

It’s possible that CIR would be useful for doing some of the reachability/visibility analysis, but since a lot of the Modules logic is needed at Sema time (e.g. merging definitions), we’d need to have CIR being created during Sema, and I’m not really sure how that would play out. People also have asked in the past if CIR could be used for template instantiation (given MLIR handy tools for playing with types), and the answer is similar to this one: it’s possible, but so far we’re lower level and we know there are challenges if we’d start at Sema instead.

Good insight. Maybe CIR needs some additional work to handle modules; otherwise the analysis may not be efficient. I mean, we probably don’t want CIR to analyze code imported from other TUs. But these should be minor points.
