[RFC] separating MLIR 'core' from everything else

Hi all,

I’d like to propose separating out several pieces of the MLIR core into a separate directory structure (name TBD). The goal of this would be to structurally separate two parts of the system which I believe are also logically separate, namely:

  1. the ‘core’, which is independent of any dialect.
  2. Dialects, and other things which depend on dialects (analyses, etc.).

This is motivated by a desire to reflect the logical structure in the directory structure. It’s also motivated by some fundamental issues in CMake’s Makefile generator. See https://reviews.llvm.org/D78771 and https://reviews.llvm.org/D78773. TL;DR: the Makefile generator in CMake generates recursive makefiles, which means that dependencies between subdirectories must (at least in some cases) be well-ordered. This order is not respected today with respect to the core, particularly by the linalg ODS generator, which lives in tools/ but depends on the include subdirectory and generates files in the include subdirectory.

The ‘core’ would consist of (at least):
mlir/lib/Interfaces (not sure how much of this the verifier depends on)

I propose to move these to
include/mlir/core/… (for corresponding include files)

Alternatively, we could attempt to continue a ‘hack’ which exists today to support mlir-tblgen. mlir/CMakeLists.txt includes “tools/mlir-tblgen”, ensuring (among other things) that its variables are visible for other directories. We could also refer to all the above files in mlir/CMakeLists.txt, but this would result in a significant disconnect between the logical and structural hierarchies.

I’m wary of motivating such a change by the immediate Makefile issue: this same problem could repeat between dialects or other things. Your proposed layering is already somewhat imperfect: mlir-linalg-ods-gen is not obviously a “core” tool at the moment (the name gives it away…).

I agree with Mehdi, this seems weird and the layering doesn’t make sense. This seems to be motivated by the weird dependency chain for Verifier.cpp. IMO we should just move the verifier and DominanceInfo to IR/ like LLVM did. After that, it seems like there wouldn’t be any weird dependency issues.

The dependency issue still exists in some form if Verifier.cpp is moved to IR/. Also, mlir-tblgen has the same problem today; it is just handled with a ‘small’ workaround.
Another option would be to keep mlir-linalg-ods-gen with the ‘rest’ but simplify the hierarchy-breaking hack, so that it essentially looks like what mlir-tblgen does today in mlir/CMakeLists.txt:

> # Adding tools/mlir-tblgen here as calling add_tablegen sets some variables like
> # MLIR_TABLEGEN_EXE in PARENT_SCOPE which gets lost if that folder is included
> # from another directory like tools
> add_subdirectory(tools/mlir-tblgen)
> add_subdirectory(include/mlir)
> add_subdirectory(lib)
> add_subdirectory(unittests)
> add_subdirectory(test)
> # Tools needs to come late to ensure that MLIR_ALL_LIBS is populated.
> # Generally things after this point may depend on MLIR_ALL_LIBS or libMLIR.so.
> add_subdirectory(tools)
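
To make the PARENT_SCOPE comment above concrete, here is a minimal sketch of the scoping behaviour it describes. It uses nested function scopes as a stand-in for nested directory scopes (both create a new variable scope in CMake), and the names `inner`, `outer`, and `DEMO_TABLEGEN_EXE` are hypothetical, not anything defined in the MLIR build. It is an analogy for why the variable is lost when the tool directory is included from tools/ rather than from mlir/, not a reproduction of what add_tablegen does.

```cmake
# scope_demo.cmake: run with `cmake -P scope_demo.cmake`
cmake_minimum_required(VERSION 3.10)

# Stand-in for tools/mlir-tblgen: publishes a variable one scope up only.
function(inner)
  set(DEMO_TABLEGEN_EXE "demo-tblgen" PARENT_SCOPE)
endfunction()

# Stand-in for tools/: it is the direct parent, so it sees the variable.
function(outer)
  inner()
  message(STATUS "inside outer: '${DEMO_TABLEGEN_EXE}'")
endfunction()

outer()
# Stand-in for mlir/: two scopes up, so the variable is not set here
# and the expansion is empty.
message(STATUS "at top level: '${DEMO_TABLEGEN_EXE}'")
```

Calling `inner()` directly from the top level instead of through `outer()` makes the variable visible there, which mirrors what the add_subdirectory(tools/mlir-tblgen) line in mlir/CMakeLists.txt achieves.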

Slightly off-topic, but I would push even further and remove lib/Analysis altogether. Most of the things there, except for the verifier and dominance info, are either dialect-specific (e.g., loop analysis) and can live with the dialect, or interface-based (e.g., call graph) and can live with the interface. The “core” part is then lib/IR and tools/mlir-tblgen.

What about an analysis that uses two interfaces? Ideally we’d like our interfaces to easily mix-and-match and work together for interesting new uses.