A quick response: the upstream MLIR vector dialect supports a range of targets as broad as Google Highway's, including x86 AVX2/AVX512, ARM NEON, ARM SVE, and RVV.
Thanks to @banach-space for mentioning my previous work. We have had a series of discussions and efforts in recent years on architecture-specific support and on general performance-portability solutions. @Time0o, you can refer to some of the following topics:
- RVV-specific design: [RFC] Add RISC-V Vector Extension (RVV) Dialect
- Scalable vector type for VLA platform (SVE and RVV): [RFC] Add built-in support for scalable vector types
- Dynamic vector representation design: [RFC] Dynamic Vector Semantics for the MLIR Vector Dialect
I am currently doing research on this. Because it has not been published yet, I cannot disclose the actual experimental data at this point, but I can broadly say that kernels implemented with the MLIR vector dialect perform on par with Google Highway, and even better on the RVV platform when the fixed/scalable vector type feature is used correctly.
I think the motivation for emitting Google Highway kernels from an MLIR-based compiler is quite limited. From a performance perspective, using the vector dialect directly is sufficient. From an implementation standpoint, using the vector dialect is easier to maintain than emitting Google Highway code.