Just a note, I brought this up in the IREE Discord but for the sake of having all the information centralized, I’ll repeat some of that here.
I’m interested in adding support for Hexagon in IREE/MLIR. I’ve been poking around both code repos to try to figure out the best way to do that, but I’d love some advice.
Since LLVM itself can target Hexagon, and IREE uses LLVM for codegen, I figured the best way to do this would be to add the flags LLVM needs to codegen for Hexagon. From the brief Discord discussion, it seems the work should be done at the MLIR codegen level, since the MLIR transformations produce the best code for specific hardware targets.
What are people’s thoughts on this extension, and what would be the preferred way forward to get this working? I think the first step would be getting the Hexagon target working for a simple example (such as matmul), which to me seems like just propagating some flags down to LLVM codegen.
@nicolasvasilache, @aartbik, @ftynse, @asaadaldien – do we have any publicly available documentation/advice for what a person needs to know to bring up a new arch? (Or general diagrams of the layers and path involved)
I’m personally very interested in seeing something get established for hexagon but am definitely not the right person to be providing meaningful guidance at the lowest levels.
I’d suggest the first target should be adding two numbers, just to focus on the plumbing. So take the mlir-cpu-runner test that adds two scalars, specify a target, and get that working (well, that assumes you are compiling on a machine where you can run Hexagon code; maybe codegen alone is sufficient for the plumbing).
But I agree with the former part and would love to see how it would intersect/make use of some of the vector and other abstractions to target Hexagon better.
do we have any publicly available documentation/advice for what a person needs to know to bring up a new arch?
I am not aware of any detailed doc for this. If LLVM itself can already target the new arch, the initial MLIR work should be relatively simple, assuming some basic runtime support library exists. Just start by lowering to the LLVM IR dialect (perhaps using the Vector dialect as an intermediate step, since we are most actively working on lowering that to good LLVM IR) and then
(1) JIT (if you can run on new arch itself): mlir-opt + mlir-cpu-runner
(2) AOT: mlir-opt + mlir-translate + opt / llc
with all the plumbing required to get the march flags right. Then, over time, you want to introduce a target-idiomatic lowering, probably using an arch-specific dialect and e.g. intrinsics to convey idioms to the backend.
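To make the AOT path in (2) concrete, here is a minimal Python sketch that only assembles the three command lines and runs nothing. The helper name `aot_commands`, the `LLC_FLAGS` table, the file names, and the lowering pass name are all assumptions for illustration; pass names in particular differ between MLIR versions, and the llc flags should be checked against your own LLVM build.

```python
import shlex

# Hypothetical per-arch llc flags; the values are illustrative assumptions.
# Check them against your LLVM build (e.g. `llc -march=hexagon -mcpu=help`).
LLC_FLAGS = {
    "avx512": ["-mcpu=skylake-avx512"],
    "hexagon": ["-march=hexagon", "-mcpu=hexagonv66"],
}


def aot_commands(mlir_file, arch, lowering_passes):
    """Return the mlir-opt, mlir-translate and llc stages as argument lists.

    Nothing is executed here; the caller decides how to run the commands.
    """
    return [
        # Lower the input to the LLVM dialect with the caller-supplied passes.
        ["mlir-opt", mlir_file, *lowering_passes, "-o", "lowered.mlir"],
        # Translate the LLVM dialect to LLVM IR.
        ["mlir-translate", "--mlir-to-llvmir", "lowered.mlir", "-o", "out.ll"],
        # Emit assembly for the selected arch.
        ["llc", "out.ll", *LLC_FLAGS[arch], "-filetype=asm", "-o", "out.s"],
    ]


if __name__ == "__main__":
    # The pass name below is an assumption and varies across MLIR versions.
    for cmd in aot_commands("add.mlir", "hexagon", ["--convert-func-to-llvm"]):
        print(shlex.join(cmd))
```

Keeping the arch-specific part in one table means that bringing up a new target is, at least for the initial plumbing, a matter of adding one entry; anything target-idiomatic would still have to happen in the MLIR lowering itself.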
So are you envisioning these steps would be triggered by an arch flag? Is this what the other supported architectures do and if so, could you point me to where that’s at in the code?
Here is an E2E example of how to use the LLVMAOT backend and run on Android aarch64: https://google.github.io/iree/GetStarted/AndroidCMake
You would follow the same steps, but you will also need the Hexagon linker, which I guess is distributed with the toolchain.
Although of course over time you will want to bring in Hexagon specifics.
Then the next steps bring in the arch specifics. For example, for AVX512 I use something like this to get assembly (similar approach to link to binary)
So here you would use Hexagon-specific LLVM backend flags.
For JIT, you could perhaps initially use the mlir-cpu-runner (which really just calls mlir::JitRunnerMain()), but over time, I would expect this to evolve into something like mlir-hexagon-runner to deal with arch specifics.
Aart and others have already given sufficiently detailed answers. If you can get to the LLVM dialect from IREE, you can pretty much just use the LLVM backend. Note that the dialect does not yet model the entire LLVM IR, so some extensions might be needed for specific use cases.
If you need Hexagon-specific intrinsics, it should be simple to add them as MLIR ops to the LLVM dialect using the generator (see this test for sample use).