Last year, when I wanted to share a prototype dialect I was creating, I had two choices: share my entire project or share the entire LLVM monorepo + my dialect. I’ve done both, really, and neither was good.
My project was banking on the idea that MLIR is highly composable, taking advantage of the existing dialects. However, either sharing strategy meant I could only reuse the standard dialects already upstream. If I wanted to reuse two other dialects, I’d have to somehow clone both their repositories and pull the files into some place to build them “correctly”, probably writing my own CMake kludge on the way.
Now, writing our own front-end, I’d like to maybe reuse things from FIR or CIL, and well, I’d love it if they could reuse from each other, too! FIR would be slightly simpler, as it’s in (will be in?) the monorepo (and we already clone that), but CIL isn’t. Sure, “once CIL gets into the monorepo”, but if that’s the only way to easily reuse, then this model seems broken.
Looking at most dialects, we put all logic (TD, headers, code, CMake) into a single directory and then include that from the parent’s CMake. This seems like a perfect encapsulation strategy for letting dialects live in their own projects (and repos), which other projects (like front-ends, optimisers) can use and reuse by just cloning the repo or adding it as a submodule on their own.
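To make that concrete, here is a rough sketch of what the consuming project's top-level CMake could look like. It is only an illustration under assumptions, not a tested recipe: the project name, the `third_party/dialect-a` and `third_party/dialect-b` paths, and the `MLIRDialectA`/`MLIRDialectB` library targets are all hypothetical, and it assumes each dialect repo ships a self-contained `CMakeLists.txt` that builds against an already built or installed MLIR (much like the out-of-tree standalone example in the MLIR sources).

```cmake
# Top-level CMakeLists.txt of a hypothetical front-end project.
cmake_minimum_required(VERSION 3.20)
project(my-frontend LANGUAGES C CXX)

# Locate an already built or installed MLIR
# (configure with -DMLIR_DIR=<prefix>/lib/cmake/mlir).
find_package(MLIR REQUIRED CONFIG)

# Make the LLVM/MLIR CMake helpers visible to this project and,
# through add_subdirectory, to the dialects below.
list(APPEND CMAKE_MODULE_PATH "${MLIR_CMAKE_DIR}" "${LLVM_CMAKE_DIR}")
include(TableGen)
include(AddLLVM)
include(AddMLIR)

include_directories(${LLVM_INCLUDE_DIRS} ${MLIR_INCLUDE_DIRS})

# Hypothetical: each dialect lives in its own repository and is
# checked out here as a git submodule.
add_subdirectory(third_party/dialect-a)
add_subdirectory(third_party/dialect-b)

# The front-end links against whatever library targets the
# dialects define (names assumed for illustration).
add_executable(my-frontend main.cpp)
target_link_libraries(my-frontend PRIVATE MLIRDialectA MLIRDialectB)
```

The point of the sketch is that the parent only needs `add_subdirectory` per dialect; whether two independently written dialect repos would actually coexist this cleanly (include paths, generated headers, target name clashes) is exactly the open question.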
I’m not very proficient with CMake (by choice!), but wouldn’t some strategy like that be a more sensible “default dialect sharing” strategy? This could be prominent in the documentation, so that all developers out there know how to simply share their dialects, and “facilitate synergy between projects” (spoken like a true marketeer :).
Any thoughts on this? Am I missing something obvious?