Unfortunately, a lot of the principles behind the design and development of the project around this kind of thing are tribal knowledge held by the people who spent hours debating them. Some of it is in chatrooms (hard to search, a needle in a haystack), some of it on Discourse (less of a haystack, but not always easy to pinpoint the right discussion), some of it in code review (another haystack), some in the MLIR open meetings (recordings are online, but of course not searchable; maybe with an LLM?), and finally a lot of it in face-to-face meetings and whiteboard discussions from when a majority of the MLIR team was sitting in the same open space for the first few years of the project. While we tried to write rationale documents on important aspects, not everything can be documented at this level of detail in practice.
So it’s not great, but it’s the unfortunate reality. And I don’t think I have ever worked on a project (OSS or proprietary) where that isn’t the case (MLIR is far from the worst in my experience!).
A large part of the questions on this forum actually call on the expertise of people who have been involved with the project for a long time (for example, I recently tried to answer Clarifying the semantics of LTO post-link optimization: my knowledge of the project is somewhat outdated, but it’s another example where documentation is lacking and tribal knowledge shared on the forum is the only way to figure some things out). Maybe an LLM processing all the history of the code, and all the code review done from the beginning, could help? (Even then it would miss some design docs lost online somewhere or never shared publicly, as well as live discussions and whiteboarding sessions.)
I don’t follow your reasoning here: canonicalization comes with some strong considerations that have nothing to do with “bloating” that pass (this blog post is often cited as a good intro). Rather, fundamentally, not every transformation is eligible to be a canonicalization in the first place.
Absolutely, that may smell: in general such patterns are borderline for canonicalization, and when adding them in a pass, we haven’t done a good enough job (often during code review) of pushing for deeper reflection on the naming.
Just like with the 5 whys, it sometimes just requires digging a bit to get to the essence of what we’re trying to achieve and converge on a better description/grouping that provides something more accurate than “simplify” (which is just a vague term in isolation).
Note that folding constant tensors has been split out of canonicalization because of adverse effects: greedy constant folding in the absence of heuristics is appropriate for “register-like” data (we consider integers to be “free” to materialize), but it’s just not suitable for larger objects (a tensor constant can be hundreds of KB or MB), and we need some more global analysis and a cost model to perform constant folding there.
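To make the size argument concrete, here is a toy sketch of the kind of check a cost model could start from. This is not MLIR code: the `Constant` class, the `should_fold_greedily` helper, and the 64-byte threshold are all made up for illustration, and a real cost model would need to look at far more than the size of the result.

```python
from dataclasses import dataclass
from math import prod
from typing import Tuple


@dataclass(frozen=True)
class Constant:
    """Hypothetical stand-in for a constant value; not an MLIR type."""
    shape: Tuple[int, ...]  # () means a scalar
    element_bytes: int      # size of one element in bytes

    @property
    def size_bytes(self) -> int:
        # prod(()) == 1, so a scalar occupies exactly one element.
        return prod(self.shape) * self.element_bytes


def should_fold_greedily(result: Constant, max_bytes: int = 64) -> bool:
    """Fold without further analysis only if the result is "register-like".

    The 64-byte default is an arbitrary illustrative threshold.
    """
    return result.size_bytes <= max_bytes


scalar_i32 = Constant(shape=(), element_bytes=4)
tensor_f32 = Constant(shape=(1024, 1024), element_bytes=4)

print(should_fold_greedily(scalar_i32))  # 4 bytes: cheap to materialize
print(should_fold_greedily(tensor_f32))  # 4 MiB: needs a real cost model
```

The point is only that an integer result is effectively free, while folding, say, a broadcast of a scalar into a 1024x1024 tensor would replace a 4-byte constant with a 4 MiB one, which a greedy driver has no way to judge.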
I don’t think we have seen a systemic need for <dialect>-fold-<some name for a related set of ops> passes so far? I should review the 3 or 4 passes we have with “fold” in the name to double-check why they exist and whether their documentation accurately describes that.