[RFC] Upstreaming a proper SPIR-V backend

Hi all,

We would like to propose this RFC for upstreaming a proper SPIR-V backend to LLVM:

Abstract

Hi,

Perhaps a parallel question: how does that integrate with MLIR’s SPIRV back-end?

If this proposal goes through and we have a production-quality SPIRV back-end in LLVM, do we remove MLIR’s own version and lower to LLVM, then to SPIRV? Or do we still need the MLIR version?

In a perfect world, translating to LLVM IR then to SPIRV shouldn’t make a difference, but could there be some impedance mismatch in the MLIR->LLVM lowering that isn’t compatible with SPIRV?

But as a final goal, if SPIRV becomes an official LLVM target, it would be better if we could iron out the impedance problems and keep only one SPIRV backend.

cheers,
–renato

Hi,

A very good question. I was actually expecting it :blush:

So, at the moment, it does not integrate with the MLIR SPIR-V backend and we have not thought about it. I guess you are referring to having a SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If the SPIR-V LLVM backend reaches production quality, it means it could actually consume any LLVM IR (provided it conforms to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by clang, an MLIR-to-LLVM-IR lowering, or just some other front-end that produces LLVM IR.

The biggest 'impedance mismatch' that I currently see is that the SPV MLIR dialect is now targeted mostly at Vulkan, while the LLVM SPIR-V backend targets compute. Besides the instruction set, the fundamental difference is the memory model.
So if we want to unify the two, we should actually make the SPIR-V LLVM backend able to produce the Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with your proposal: we should work towards having one solution, and the LLVM SPIR-V backend seems like the more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR -> LLVM-IR translated code in the testing, so as to keep this final goal in mind.

PS: one more thought: SPIR-V does come with a set of builtin/intrinsic functions that expose the full capabilities of the target architecture (mostly a GPU). This set of intrinsics is actually a dialect in its own right. So it is LLVM IR + SPIR-V-specific intrinsics and their semantics that fully define the SPIR-V dialect at the LLVM IR level. I believe this idea could be used in the MLIR path: MLIR -> LLVM IR with SPIR-V intrinsics (let's call it an LLVM IR SPIR-V dialect) -> SPIR-V binary (generated by a backend). So the idea of a 'SPIR-V dialect' still exists; it is just now expressed at the LLVM IR level.
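To make this idea concrete, here is a rough sketch of what such "LLVM IR + SPIR-V intrinsics" could look like. The intrinsic name, triple, and overall shape are illustrative assumptions only, not an agreed-upon naming scheme:

```llvm
; Hypothetical "LLVM IR SPIR-V dialect": plain LLVM IR plus SPIR-V-specific
; intrinsics. All names here are illustrative, not real upstream definitions.
target triple = "spirv64-unknown-unknown"

; Assumed intrinsic exposing the SPIR-V GlobalInvocationId builtin.
declare i64 @llvm.spv.global.invocation.id(i32)

define spir_kernel void @vec_add(float addrspace(1)* %a,
                                 float addrspace(1)* %b,
                                 float addrspace(1)* %c) {
entry:
  ; Each work item processes the element at its global invocation id.
  %gid = call i64 @llvm.spv.global.invocation.id(i32 0)
  %pa = getelementptr inbounds float, float addrspace(1)* %a, i64 %gid
  %pb = getelementptr inbounds float, float addrspace(1)* %b, i64 %gid
  %pc = getelementptr inbounds float, float addrspace(1)* %c, i64 %gid
  %va = load float, float addrspace(1)* %pa, align 4
  %vb = load float, float addrspace(1)* %pb, align 4
  %sum = fadd float %va, %vb
  store float %sum, float addrspace(1)* %pc, align 4
  ret void
}
```

A backend would then lower the intrinsic call to the corresponding SPIR-V builtin and deduce the required capabilities from the instructions actually used.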

regards,
konrad

Hi,

Perhaps an obvious addition to Konrad’s answer: a proper LLVM backend for SPIR-V can make it much easier for people who are already using LLVM for codegen (targeting e.g. AArch64 or x86 CPUs) to retarget their flow with (ideally) a single command-line option changed.

Thanks,
Alex

Tue, Mar 2, 2021 at 14:07, Trifunovic, Konrad via llvm-dev <llvm-dev@lists.llvm.org>:

I agree that developing two backends in parallel is a bit redundant. If the SPIR-V LLVM backend reaches production quality, it means it could actually consume any LLVM IR (provided it conforms to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by clang, an MLIR-to-LLVM-IR lowering, or just some other front-end that produces LLVM IR.

Excellent.

The biggest ‘impedance mismatch’ that I currently see is that the SPV MLIR dialect is now targeted mostly at Vulkan, while the LLVM SPIR-V backend targets compute. Besides the instruction set, the fundamental difference is the memory model.
So if we want to unify the two, we should actually make the SPIR-V LLVM backend able to produce the Vulkan dialect of SPIR-V as well.

I see. It is unfortunate that we have a shader-focused backend on one side and a compute-focused one on the other. As you say, it means we can only move the SPV MLIR lowering to LLVM once the LLVM side also supports shaders.

I’m guessing that support is not a trivial addition on top of compute, and that it will probably take place after the current proposal is mostly done.

It is unfortunate, but not altogether bad. I think it would be fine for them to co-exist until the time we unify.

So the idea of ‘SPIR-V dialect’ still exists, it is just now expressed at the LLVM IR level.

Indeed, that’s what I meant.

Thanks!
–renato

Little expertise to help but looking forward to it happening.

~ Johannes

Chiming in mostly from the perspective of MLIR SPIR-V support. More comments inlined, but first some general comments. :slight_smile:

As I understand it, SPIR-V is actually a mix of multiple things. It is first and foremost 1) a binary format for encoding GPU executables that cross the toolchain and hardware driver boundaries. Then it’s 2) an intermediate-level language for expressing such GPU executables. It is also 3) a flexible and extensible spec with all sorts of capability and extension mechanisms in order to support the needs of multiple APIs and hardware features. It’s unclear to me what a production-quality SPIR-V LLVM backend would entail; but to actually support the various use cases SPIR-V can support (OpenCL, OpenGL, Vulkan; shader/kernel; various levels of extensions; etc.), it looks to me like we need a story for all the above points, where the IR aspect (2) is actually just one facet. My understanding of LLVM is that it mostly focuses on 2): we have a very coherent single IR threading through the majority of the layers of the compiler stack, and the IR is designed very much as a means for compiler transformations (i.e., no instruction versioning, etc.). There isn’t much native modelling for most of the points in 1) and 3) (which makes sense, as LLVM IR is a compiler IR). So to make it work, one would need to shoehorn them through existing LLVM mechanisms (e.g., using intrinsics for various GPU-related builtins, using metadata for SPIR-V decorations?, etc.), unless we want to evolve the LLVM infrastructure to have native support for the missing SPIR-V mechanisms, which I think might be too much to take on. And this is just the general mechanisms, not mentioning the different semantics between different SPIR-V consumers (e.g., shader vs. kernel and what that means for the memory/execution model, etc.) that need to be sorted out too… Just supporting a certain use case of what SPIR-V supports is certainly simpler, though, as we can bake in assumptions and avoid some infrastructure needs for the full generality.

That’s why I think using MLIR as the infrastructure to build general support for SPIR-V is preferable, as we control everything there and are free to model all SPIR-V concepts in the most native way. For example, we can define all SPIR-V ops natively, including all ops introduced by SPIR-V extensions and extended instruction sets. We can support versions/extensions/capabilities natively and integrate them with the target environment to automatically filter out CodeGen patterns generating ops not available on the target, etc. To me, MLIR’s open dialect/op/type/etc. system is a perfect fit for the open SPIR-V spec with its many capabilities/extensions/etc. For example, we can even make the SPIR-V dialect itself open to allow out-of-tree extensions, development, and such.
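As a rough illustration of that native modelling, here is what a tiny module in the MLIR SPIR-V dialect looks like. The op spellings below are from memory and may not match the current in-tree syntax exactly:

```mlir
// Illustrative sketch of the MLIR SPIR-V dialect. The module itself carries
// its addressing model, memory model, and version/extension/capability
// requirements, so availability is modelled natively rather than bolted on.
spv.module Logical GLSL450 requires #spv.vce<v1.0, [Shader], []> {
  spv.func @add(%a: i32, %b: i32) -> i32 "None" {
    %0 = spv.IAdd %a, %b : i32
    spv.ReturnValue %0 : i32
  }
}
```

Because the `#spv.vce<...>` triple is part of the IR, conversion patterns can be filtered automatically against the target environment before any op is generated.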

With that said, I understand that software development has many practical concerns (like existing codebases, familiarity with different components, etc.) and we have many different use cases, which may mean that different paths make sense. So please don’t take this as negative feedback in general. It’s just that, to me, it’s unclear how we can unify here right now. Even when the time for unification arrives, I’d believe going through MLIR is the better way to get general SPIR-V support. :slight_smile:

Hi,

A very good question. I was actually expecting it :blush:

So, at the moment, it does not integrate with the MLIR SPIR-V backend and we have not thought about it. I guess you are referring to having a SPV dialect in MLIR and using a ‘serialize’ option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If the SPIR-V LLVM backend reaches production quality, it means it could actually consume any LLVM IR (provided it conforms to some SPIR-V restrictions, e.g. in Vulkan we have the Logical addressing mode).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by clang, an MLIR-to-LLVM-IR lowering, or just some other front-end that produces LLVM IR.

I think we have an assumption here: LLVM itself should support all the mechanisms and use cases SPIR-V supports, if LLVM is to be the layer before SPIR-V. I think there is a huge gap here.

The biggest ‘impedance mismatch’ that I currently see is that the SPV MLIR dialect is now targeted mostly at Vulkan, while the LLVM SPIR-V backend targets compute.

To be specific, Vulkan compute is the best-supported use case right now. But there is interest from the community in pushing on Vulkan graphics and OpenCL.

Besides the instruction set, the fundamental difference is the memory model.
So if we want to unify the two, we should actually make the SPIR-V LLVM backend able to produce the Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with your proposal: we should work towards having one solution, and the LLVM SPIR-V backend seems like the more universal one (since it sits lower in the compiler stack).

I actually believe the opposite, because of the reasons I listed at the very beginning. To me SPIR-V also stays at a higher level than LLVM. (But again, depending on what subset we are talking about.)

My proposal would be to include some MLIR → LLVM-IR translated code in the testing, so as to keep this final goal in mind.

PS: one more thought: SPIR-V does come with a set of builtin/intrinsic functions that expose the full capabilities of the target architecture (mostly a GPU). This set of intrinsics is actually a dialect in its own right. So it is LLVM IR + SPIR-V-specific intrinsics and their semantics that fully define the SPIR-V dialect at the LLVM IR level. I believe this idea could be used in the MLIR path: MLIR → LLVM IR with SPIR-V intrinsics (let’s call it an LLVM IR SPIR-V dialect) → SPIR-V binary (generated by a backend). So the idea of a ‘SPIR-V dialect’ still exists; it is just now expressed at the LLVM IR level.

Not sure this is the preferred way, given that we can define SPIR-V ops easily in MLIR in their own dialect with native support for the various aspects.

Thank you for such a detailed response!

Honestly, I don’t know much about SPIRV, so my comments were without context. If there are reasons to keep the back-end on both sides, I’m not against it.

I just proposed unifying things in case they’re duplicated. If we can make that case, then it should definitely be part of the plan.

cheers,
–renato

Hi,

A very good question. I was actually expecting it :blush:

So, at the moment, it does not integrate with the MLIR SPIR-V backend and we have not thought about it. I guess you are referring to having a SPV dialect in MLIR and using a ‘serialize’ option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If the SPIR-V LLVM backend reaches production quality, it means it could actually consume any LLVM IR (provided it conforms to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by clang, an MLIR-to-LLVM-IR lowering, or just some other front-end that produces LLVM IR.

The biggest ‘impedance mismatch’ that I currently see is that the SPV MLIR dialect is now targeted mostly at Vulkan, while the LLVM SPIR-V backend targets compute. Besides the instruction set, the fundamental difference is the memory model.
So if we want to unify the two, we should actually make the SPIR-V LLVM backend able to produce the Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with your proposal: we should work towards having one solution, and the LLVM SPIR-V backend seems like the more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR → LLVM-IR translated code in the testing, so as to keep this final goal in mind.

Something you’re missing here, and maybe Lei clarified but I’ll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.
I can’t figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here…

It would be really great to find a common path here before duplicating a lot of the same thing in the llvm-project monorepo. For example, being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right afterwards, would be an interesting thing to explore.
I haven’t seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

Hi,

A very good question. I was actually expecting it :blush:

So, at the moment, it does not integrate with the MLIR SPIR-V backend and we have not thought about it. I guess you are referring to having a SPV dialect in MLIR and using a ‘serialize’ option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If the SPIR-V LLVM backend reaches production quality, it means it could actually consume any LLVM IR (provided it conforms to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by clang, an MLIR-to-LLVM-IR lowering, or just some other front-end that produces LLVM IR.

The biggest ‘impedance mismatch’ that I currently see is that the SPV MLIR dialect is now targeted mostly at Vulkan, while the LLVM SPIR-V backend targets compute. Besides the instruction set, the fundamental difference is the memory model.
So if we want to unify the two, we should actually make the SPIR-V LLVM backend able to produce the Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with your proposal: we should work towards having one solution, and the LLVM SPIR-V backend seems like the more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR → LLVM-IR translated code in the testing, so as to keep this final goal in mind.

Something you’re missing here, and maybe Lei clarified but I’ll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.
I can’t figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here…

It would be really great to find a common path here before duplicating a lot of the same thing in the llvm-project monorepo. For example, being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right afterwards, would be an interesting thing to explore.
I haven’t seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

This sort of problem seems like just one of those unfortunate consequences of MLIR being effectively an “LLVM IR 2.0 – Generic Edition”, but not yet actually layered underneath LLVM where it really wants to be. I think it doesn’t really make sense to tie this project to those long-term goals of layering MLIR under LLVM IR, given the extremely long timescale over which that is likely to occur. The “proper” solution probably won’t be possible any time soon.

So, in the meantime, we could implement a special-case hack just for SPIRV, to enable lowering it to MLIR-SPIRV dialect. But, what’s the purpose? It wouldn’t really help move towards the longer term goal, I don’t think? And if someone does need that at the moment, they can just feed the SPIRV binary format back into the existing MLIR SPIRV dialect, right?

Please see some comments inlined.

Hi,

A very good question. I was actually expecting it :blush:

So, at the moment, it does not integrate with the MLIR SPIR-V backend and we have not thought about it. I guess you are referring to having a SPV dialect in MLIR and using a ‘serialize’ option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If the SPIR-V LLVM backend reaches production quality, it means it could actually consume any LLVM IR (provided it conforms to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by clang, an MLIR-to-LLVM-IR lowering, or just some other front-end that produces LLVM IR.

The biggest ‘impedance mismatch’ that I currently see is that the SPV MLIR dialect is now targeted mostly at Vulkan, while the LLVM SPIR-V backend targets compute. Besides the instruction set, the fundamental difference is the memory model.
So if we want to unify the two, we should actually make the SPIR-V LLVM backend able to produce the Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with your proposal: we should work towards having one solution, and the LLVM SPIR-V backend seems like the more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR → LLVM-IR translated code in the testing, so as to keep this final goal in mind.

Something you’re missing here, and maybe Lei clarified but I’ll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.

What do you mean by lower? I’m not that familiar with the way MLIR deals with SPIR-V binaries, but isn’t it still necessary to convert SPIR-V dialect to LLVM and then use some hardware-tied codegen to be able to run a SPIR-V binary?

I can’t figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here…

It would be really great to find a common path here before duplicating a lot of the same thing in the llvm-project monorepo. For example, being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right afterwards, would be an interesting thing to explore.
I haven’t seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

Oh, this sounds interesting actually. Would be nice if someone has any materials or code to share on the topic.

This sort of problem seems like just one of those unfortunate consequences of MLIR being effectively an “LLVM IR 2.0 – Generic Edition”, but not yet actually layered underneath LLVM where it really wants to be. I think it doesn’t really make sense to tie this project to those long-term goals of layering MLIR under LLVM IR, given the extremely long timescale over which that is likely to occur. The “proper” solution probably won’t be possible any time soon.

There’s definitely some consensus, or even a roadmap/timeline, missing on this transition IMO :slight_smile: And please forgive my possibly stupid question, but is there any way now to conveniently incorporate an MLIR flow into projects which are based on the good old clang->llvm->mir->machinecode path? I understand we have the ‘llvm’ dialect, and I recall last year there was a talk about a common C/C++ dialect, but it isn’t public yet, is it?

So, in the meantime, we could implement a special-case hack just for SPIRV, to enable lowering it to MLIR-SPIRV dialect. But, what’s the purpose? It wouldn’t really help move towards the longer term goal, I don’t think? And if someone does need that at the moment, they can just feed the SPIRV binary format back into the existing MLIR SPIRV dialect, right?

PS: one more thought: SPIR-V does come with a set of builtin/intrinsic functions that expose the full capabilities of the target architecture (mostly a GPU). This set of intrinsics is actually a dialect in its own right. So it is LLVM IR + SPIR-V-specific intrinsics and their semantics that fully define the SPIR-V dialect at the LLVM IR level. I believe this idea could be used in the MLIR path: MLIR → LLVM IR with SPIR-V intrinsics (let’s call it an LLVM IR SPIR-V dialect) → SPIR-V binary (generated by a backend). So the idea of a ‘SPIR-V dialect’ still exists; it is just now expressed at the LLVM IR level.

regards,
konrad

From: Renato Golin <rengolin@gmail.com>
Sent: Tuesday, March 2, 2021 11:12 AM
To: Trifunovic, Konrad <konrad.trifunovic@intel.com>
Cc: llvm-dev@lists.llvm.org; Paszkowski, Michal <michal.paszkowski@intel.com>; Bezzubikov, Aleksandr <aleksandr.bezzubikov@intel.com>; Tretyakov, Andrey1 <andrey1.tretyakov@intel.com>
Subject: Re: [llvm-dev] [RFC] Upstreaming a proper SPIR-V backend

Hi all,

We would like to propose this RFC for upstreaming a proper SPIR-V backend to LLVM:

Hi,

Perhaps a parallel question: how does that integrate with MLIR’s SPIRV back-end?

If this proposal goes through and we have a production-quality SPIRV back-end in LLVM, do we remove MLIR’s own version and lower to LLVM, then to SPIRV? Or do we still need the MLIR version?

In a perfect world, translating to LLVM IR then to SPIRV shouldn’t make a difference, but could there be some impedance mismatch in the MLIR->LLVM lowering that isn’t compatible with SPIRV?

But as a final goal, if SPIRV becomes an official LLVM target, it would be better if we could iron out the impedance problems and keep only one SPIRV backend.

cheers,
–renato


LLVM Developers mailing list
llvm-dev@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev



Thanks,
Alex

Hi,

A very good question. I was actually expecting it :blush:

So, at the moment, it does not integrate with the MLIR SPIR-V backend and we have not thought about it. I guess you are referring to having a SPV dialect in MLIR and using a ‘serialize’ option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If the SPIR-V LLVM backend reaches production quality, it means it could actually consume any LLVM IR (provided it conforms to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by clang, an MLIR-to-LLVM-IR lowering, or just some other front-end that produces LLVM IR.

The biggest ‘impedance mismatch’ that I currently see is that the SPV MLIR dialect is now targeted mostly at Vulkan, while the LLVM SPIR-V backend targets compute. Besides the instruction set, the fundamental difference is the memory model.
So if we want to unify the two, we should actually make the SPIR-V LLVM backend able to produce the Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with your proposal: we should work towards having one solution, and the LLVM SPIR-V backend seems like the more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR → LLVM-IR translated code in the testing, so as to keep this final goal in mind.

Something you’re missing here, and maybe Lei clarified but I’ll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.
I can’t figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here…

It would be really great to find a common path here before duplicating a lot of the same thing in the llvm-project monorepo. For example, being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right afterwards, would be an interesting thing to explore.
I haven’t seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

This sort of problem seems like just one of those unfortunate consequences of MLIR being effectively an “LLVM IR 2.0 – Generic Edition”, but not yet actually layered underneath LLVM where it really wants to be.

I don’t understand what you mean here with “layered underneath LLVM”? Can you elaborate on this?

I think it doesn’t really make sense to tie this project to those long-term goals of layering MLIR under LLVM IR, given the extremely long timescale over which that is likely to occur. The “proper” solution probably won’t be possible any time soon.

I’m not sure we’re talking about the same thing here: there is nothing I suggest that would operate at the level of LLVM IR. And nothing that requires a “long timescale”; it seems quite easily within scope to me here.

So, in the meantime, we could implement a special-case hack just for SPIRV, to enable lowering it to MLIR-SPIRV dialect. But, what’s the purpose? It wouldn’t really help move towards the longer term goal, I don’t think? And if someone does need that at the moment, they can just feed the SPIRV binary format back into the existing MLIR SPIRV dialect, right?

Do we want to maintain, in the LLVM monorepo, two different implementations of a SPIRV IR and the associated serialization (and potential deserialization)? All the associated tools to manipulate it? I assume the backend may even want to implement optimization passes; are we going to duplicate these as well?
(Note that this isn’t at the LLVM IR level, but post-instruction selection, so very ad hoc to the backend anyway.)

Please see some comments inlined.

Hi,

A very good question. I was actually expecting it :blush:

So, at the moment, it does not integrate with the MLIR SPIR-V backend and we have not thought about it. I guess you are referring to having a SPV dialect in MLIR and using a ‘serialize’ option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If the SPIR-V LLVM backend reaches production quality, it means it could actually consume any LLVM IR (provided it conforms to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by clang, an MLIR-to-LLVM-IR lowering, or just some other front-end that produces LLVM IR.

The biggest ‘impedance mismatch’ that I currently see is that the SPV MLIR dialect is now targeted mostly at Vulkan, while the LLVM SPIR-V backend targets compute. Besides the instruction set, the fundamental difference is the memory model.
So if we want to unify the two, we should actually make the SPIR-V LLVM backend able to produce the Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with your proposal: we should work towards having one solution, and the LLVM SPIR-V backend seems like the more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR → LLVM-IR translated code in the testing, so as to keep this final goal in mind.

Something you’re missing here, and maybe Lei clarified but I’ll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.

What do you mean by lower? I’m not that familiar with the way MLIR deals with SPIR-V binaries, but isn’t it still necessary to convert SPIR-V dialect to LLVM and then use some hardware-tied codegen to be able to run a SPIR-V binary?

What you’re describing seems a bit orthogonal to the SPIRV backend: you’re asking “how would someone run a SPIRV binary?”. That is up to the SPIRV runtime implementation (it may or may not use LLVM to JIT the SPIRV to the native platform).
From what I understand, the proposal about a backend here is exclusively about a “LLVM → SPIRV” flow, i.e. SPIRV is the abstract ISA (like NVPTX) and the final target of the workflow.

I can’t figure out a situation where it would make sense to go from MLIR SPIRV dialect to LLVM to use this new backend, but I may miss something here…

It would be really great to find a common path here before duplicating a lot of the same thing in the llvm-project monorepo. For example, being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right afterwards, would be an interesting thing to explore.
I haven’t seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

Oh, this sounds interesting actually. Would be nice if someone has any materials or code to share on the topic.

This sort of problem seems like just one of those unfortunate consequences of MLIR being effectively an “LLVM IR 2.0 – Generic Edition”, but not yet actually layered underneath LLVM where it really wants to be. I think it doesn’t really make sense to tie this project to those long-term goals of layering MLIR under LLVM IR, given the extremely long timescale over which that is likely to occur. The “proper” solution probably won’t be possible any time soon.

There’s definitely some consensus, or even a roadmap/timeline, missing on this transition IMO :slight_smile: And please forgive my possibly stupid question, but is there any way now to conveniently incorporate an MLIR flow into projects which are based on the good old clang->llvm->mir->machinecode path? I understand we have the ‘llvm’ dialect, and I recall last year there was a talk about a common C/C++ dialect, but it isn’t public yet, is it?

Not that I am aware of, but I haven’t followed the most recent developments either! We’re very interested in looking into this, though :slight_smile:

+1

It would be nice to see this become real after 6+ years of various projects
flying around and diluting the efforts among various SPIR-V consumers
and producers...

As I understand it, SPIR-V is actually a mix of multiple things. It is first and foremost 1) a binary format for encoding GPU executables that cross the toolchain and hardware driver boundaries. Then it's 2) an intermediate-level language for expressing such GPU executables. It is also 3) a flexible and extensible spec with all sorts of capability and extension mechanisms in order to support the needs of multiple APIs and hardware features. It's unclear to me what a production-quality SPIR-V LLVM backend would entail; but to actually support the various use cases SPIR-V can support (OpenCL, OpenGL, Vulkan; shader/kernel; various levels of extensions; etc.), it looks to me like we need a story for all the above points, where the IR aspect (2) is actually just one facet.

Agreed. Indeed, a 'production-quality SPIR-V backend' is vaguely defined here, and we proposed one discussion point on this. For this proposal's needs, we should focus on one subset of SPIR-V and one use case (OpenCL). By production quality I mean that we can correctly produce code for that subset of SPIR-V. I totally agree that having 'full SPIR-V' coverage is something very broad and probably not achievable at all - but we are not aiming at that. I take the perspective of a classical 'CPU backend' here: we have to generate ISA code for the input LLVM-IR code. Now, besides instructions, our backend needs to deduce the proper capabilities and extensions, based on which subset of instructions is selected. As you pointed out later, plain LLVM-IR is not capable of describing the full SPIR-V. Some of the decorations/extensions/capabilities might be deduced by the backend, while some need to be declared using various LLVM-IR concepts, such as metadata, attributes, and intrinsics - and that needs a clear definition.
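The capability-deduction step described above can be sketched as a simple pass over the selected instructions. The opcode-to-capability table below is a toy illustration (the real mapping lives in the SPIR-V specification's per-instruction "Capability" clauses, and the actual backend would of course do this in C++ over MIR):

```python
# Toy sketch: deduce the SPIR-V capabilities/extensions a module must
# declare, from the set of instructions the backend selected.
# The tables are illustrative, not the authoritative SPIR-V requirements.
REQUIRED_CAPS = {
    "OpTypeFloat64": {"Float64"},
    "OpGroupNonUniformShuffle": {"GroupNonUniform", "GroupNonUniformShuffle"},
    "OpAtomicFAddEXT": {"AtomicFloat32AddEXT"},
}

REQUIRED_EXTS = {
    "OpAtomicFAddEXT": {"SPV_EXT_shader_atomic_float_add"},
}

def deduce_requirements(opcodes):
    """Union the capabilities/extensions implied by each selected opcode."""
    caps, exts = set(), set()
    for op in opcodes:
        caps |= REQUIRED_CAPS.get(op, set())
        exts |= REQUIRED_EXTS.get(op, set())
    return sorted(caps), sorted(exts)

caps, exts = deduce_requirements(["OpIAdd", "OpTypeFloat64", "OpAtomicFAddEXT"])
print(caps)  # capabilities the emitted module must declare
print(exts)  # extensions the emitted module must declare
```

Declarations that cannot be deduced this way (e.g. explicit decorations) are exactly the cases that need the metadata/attribute/intrinsic definitions mentioned above.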

My understanding of LLVM is that it mostly focuses on 2): we have a very coherent single IR threading through the majority of the layers of the compiler stack, and the IR is designed very much as a means for compiler transformations (i.e., no instruction versioning etc.). There isn't much native modelling for most of the points in 1) and 3) (which makes sense, as LLVM IR is a compiler IR). So to make it work, one would need to shoehorn them through existing LLVM mechanisms (e.g., using intrinsics for various GPU-related builtins, using metadata for SPIR-V decorations?, etc.), unless we want to evolve the LLVM infrastructure to have native support for the missing SPIR-V mechanisms, which I think might be too much to take on.

Also agreed. I do believe, though, that LLVM-IR is still worth the effort, and we can take an incremental approach to adopting some GPU concepts into LLVM-IR (e.g. the 'convergent' attribute has been added mainly for GPU-like targets). The first step is defining the metadata and intrinsics that are target-specific for SPIR-V, but most of them could be generalized as GPU concepts and even introduced into the core LLVM-IR spec. I agree this is a great effort - and not really the main focus of this proposal - yet we should give LLVM-IR its own standing in the world of GPUs.

These are just the general mechanisms, not to mention the different semantics between different SPIR-V consumers (e.g., shader vs. kernel and what that means for the memory/execution model, etc.) that need to be sorted out too. Just supporting a certain use case of what SPIR-V supports is certainly simpler, though, as we can bake in assumptions and avoid some of the infrastructure needed for full generality.

I would focus on just a subset and clearly define what the input LLVM-IR GPU dialect would look like for that subset.

That's why I think using MLIR as the infrastructure to build general support for SPIR-V is preferable: we control everything there and can model all SPIR-V concepts in the most native way. For example, we can define all SPIR-V ops natively, including all ops introduced by SPIR-V extensions and extended instruction sets. We can support versions/extensions/capabilities natively and integrate them with the target environment to automatically filter out CodeGen patterns generating ops not available on the target, etc. To me, MLIR's open dialect/op/type/etc. system is a perfect fit for the open SPIR-V spec with its many capabilities/extensions/etc. We can even make the SPIR-V dialect itself open to allow out-of-tree extensions and development.
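For illustration, a hedged sketch of that native modelling in MLIR's SPIR-V dialect (syntax approximated from the dialect of that era; details may differ):

```mlir
// Sketch only: the module carries its (version, capability, extension)
// requirements natively, so the target environment can verify that the
// generated ops are actually available on the target.
spv.module Logical GLSL450 requires #spv.vce<v1.0, [Shader], []> {
  spv.func @main() -> () "None" {
    spv.Return
  }
  spv.EntryPoint "GLCompute" @main
}
```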

Right. For 'general SPIR-V' support, MLIR is the right abstraction level to use, and I would keep it that way. For the 'specific'/legacy uses, a backend is the way to fill that gap.

With that said, I understand that software development has many real-world concerns (like existing codebases, familiarity with different components, etc.) and we have many different use cases, which may mean that different paths make sense. So please don't take this as negative feedback in general. It's just that to me it's unclear how we can unify here right now. Even when the time for unification arrives, I'd believe going through MLIR is the better way to get general SPIR-V support. :slight_smile:

A very good discussion! It seems I was overly optimistic at first about unifying those two approaches. Now I believe that we actually should have two paths, for the reasons you have just explained and to support 'legacy' paths/compilers that rely on the classical, years-old approach: Front-End -> LLVM-IR (opt) -> backend (llc). For that legacy path, a plain old 'backend' approach is still (in my view) the way to go. On the other hand, when MLIR evolves and gets wider adoption, it will be the way to go. From the semantic point of view, MLIR is much better suited to representing the structured and extensible nature of SPIR-V. But for the MLIR approach to be adopted, new languages/front-ends need to be aware of that structure, so as to take full advantage of it. If Clang C/C++ starts to use MLIR as its native generation format - that would be a big case for the MLIR approach - but until that happens, we need some intermediate solution.

The most important use case for the backend is systems/languages that have been targeting x86, ARM, NVPTX or AMDGPU. Suppose you want to replace that particular backend with some other GPU backend (e.g. an Intel GPU :slight_smile: ). The solution is to use SPIR-V as the backend target, and then consume the SPIR-V with a proprietary/open-source GPU finalizer that eventually produces GPU assembly. (We did not want to come up with yet another GPU intermediate language like PTX, HSAIL, etc., and want to use the Khronos standard SPIR-V for that purpose.) In this use case, the stress is on points 1) and 2) and less on point 3), as you stated earlier.

So my proposal is to keep two paths. They are complementary. I am aware of the maintenance-cost concerns - but while there are use cases for both, it is still worth it. When the world stops using LLVM-IR, it will die silently, and so will the SPIR-V backend - but that is a natural software lifecycle :wink:

As for unifying, there seems to be little implementation that could be shared (unless we make GlobalISel produce the MLIR SPV dialect😊). Nevertheless, we could collaborate on a conceptual level, especially on defining the subset of SPIR-V that we want to support and which use cases (OpenCL, Vulkan compute, OpenGL?) are relevant for the LLVM community, so as to have a common vision there.

BTW: Intel is also interested in MLIR path and there is a group actively contributing in that direction too.
...

My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).

I actually believe the opposite, because of the reasons I listed at the very beginning. To me SPIR-V also stays at a higher level than LLVM. (But again, depending on what subset we are talking about.)

Here, by 'lower in the compiler stack', I did not mean the semantic level, but the place in the compiler pipeline (front-end -> optimizer -> back-end), where I assumed MLIR sits at the front-end level, LLVM-IR is the optimizer, and the back-end comes last.
I agree that semantically SPIR-V is higher level than LLVM-IR, especially when it comes to the metadata that LLVM-IR does not support natively (extensions/extended instruction sets/capabilities/execution model, etc.)

My proposal would be to include some MLIR -> LLVM-IR translated code in the testing, so as to keep this final goal in mind.

PS: one more thought: SPIR-V comes with a set of builtin/intrinsic functions that expose the full capabilities of the target architecture (mostly a GPU). This set of intrinsics is actually a dialect in its own right. So it is LLVM IR plus the SPIR-V-specific intrinsics and their semantics that fully define the SPIR-V dialect at the LLVM IR level. I believe this idea could be used on the MLIR path: MLIR -> LLVM-IR with SPIR-V intrinsics (let's call it an LLVM IR SPIR-V dialect) -> SPIR-V binary (generated by a backend). So the idea of a 'SPIR-V dialect' still exists, it is just now expressed at the LLVM IR level.
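A hedged sketch of that 'LLVM IR SPIR-V dialect' idea; the intrinsic name below is hypothetical and only meant to show the shape:

```llvm
; Hypothetical: a SPIR-V builtin (here, the workgroup id) exposed as a
; target-specific intrinsic rather than as native IR.  The backend would
; pattern-match the call into the corresponding SPIR-V builtin variable
; and deduce any required capability from its presence.
declare i64 @llvm.spv.workgroup.id(i32)

define spir_func i64 @get_group_id_x() {
entry:
  %id = call i64 @llvm.spv.workgroup.id(i32 0)
  ret i64 %id
}
```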

I am not sure this is the preferred way, given that we can define SPIR-V ops easily in MLIR in its own dialect, with native support for the various aspects.

Agreed. Keeping in mind that we should actually keep both paths, I believe that going MLIR -> LLVM-IR -> SPIR-V then does not make sense, as it might lose some information.

regards,
konrad

I think there are two points here:

  1. How many SPIRV end-points we have

This is mostly about software engineering concerns of duplication, maintenance, etc. But it’s also about IR support, with MLIR having an upper hand here because of the existing implementation and its inherent flexibility with dialects.

It’s perfectly fine to have two back-ends for a while, but since we moved MLIR to the monorepo, we need to treat it as part of the LLVM family, not a side project.

LLVM IR has some “flexibility” through intrinsics, which we could use to translate MLIR concepts that can’t be represented in LLVM IR for the purpose of lowering only. Optimisations on these intrinsics would bring the usual problems.

  2. Where do the optimisations happen in code lowering to SPIRV

I think Ronan’s points are a good basis for keeping that in MLIR, at least for the current code. Now, if that precludes optimising in LLVM IR, then this could be a conflict with this proposal.

Whether the code passes through MLIR or not will be a decision of the toolchain, which will pick the best path for each workload. This allows us to have concurrent approaches in tree, but it also makes testing harder and creates corner cases.

So, while I appreciate this is a large proposal that will likely take a year or more to get into shape, I think the ultimate goal (after the current proposal) should be that we end up with one back-end.

I’m a big fan of MLIR, and I think we should keep developing the SPIRV dialect and possibly this could be the entry point of all SPIRV toolchains.

While Clang will take a long time (if ever) to generate MLIR for C/C++, it could very well generate MLIR for non-C++ (OpenCL, OpenMP, SYCL, etc) which is then optimised, compiled into LLVM IR and linked to the main module (or not, for multi-targets) after high-level optimisations.

This would answer both questions above and create a pipeline that is consistent, easier to test and with lower overall maintenance costs.

cheers,
–renato

So, at the moment, it does not integrate into MLIR SPIRV backend and we have not thought about it. I guess You are referring to having a SPV dialect in MLIR and using a 'serialize' option to produce a SPIR-V binary?

I agree that developing two backends in parallel is a bit redundant. If SPIR-V LLVM backend becomes a production quality it means actually it could consume any LLVM IR (provided it does conform to some SPIR-V restrictions).
By any LLVM IR input I mean: it should be irrelevant whether it is produced by a clang, MLIR to LLVM IR lowering or just some other front-end that produces LLVM IR.
The biggest 'impedance mismatch' that I currently see is that SPV MLIR dialect is now targeted mostly at Vulkan, while LLVM SPIR-V backend targets compute. Besides instruction set, the fundamental difference is a memory model.
So if we want to unify those, we should actually make SPIR-V LLVM backend able to produce Vulkan dialect of SPIR-V as well.

My answer is a bit elusive, but I totally agree with Your proposal: we should work towards having a one solution, and, LLVM SPIR-V backend seems like a more universal one (since it sits lower in the compiler stack).
My proposal would be to include some MLIR -> LLVM-IR translated code in the testing so to have this final goal in mind.

Something you're missing here, and maybe Lei clarified but I'll reiterate: the SPIRV dialect in MLIR is equivalent to what your GlobalISel pass will produce. It can actually round-trip to/from the SPIRV binary format. So it is sitting lower than your backend in my view.
I can't figure out a situation where it would make sense to go from the MLIR SPIRV dialect to LLVM to use this new backend, but I may be missing something here...

By 'lower' I was referring to the place of the backend in a typical compiler flow that I could imagine: MLIR -> LLVM-IR (opt) -> Backend (llc).
And yes, I agree: if we treat the MLIR SPV dialect as the final result of what this backend would produce, then MLIR SPV could be the lowest-level representation (before streaming into the SPIR-V binary).

It would be really great to find a common path here before duplicating a lot of the same thing in the llvm-project monorepo; for example, being able to target the MLIR dialect from GlobalISel, or alternatively converting the MIR to it right after, would be an interesting thing to explore.
I haven't seen it, but there was a talk last Sunday on this topic: https://llvm.org/devmtg/2021-02-28/#vm1

We should investigate that. I believe, though, that GlobalISel is not really flexible enough to produce MLIR (or dialects) - but that is something we might want to change :blush: That path would open the door to a great deal of unification:
We can support two 'entry points':
1) Directly through MLIR: it gets translated to the SPV dialect, and then streamed to a SPIR-V binary (without even going into LLVM-IR).
2) Start with LLVM-IR (with some augmented metadata and intrinsics) and feed that into the proposed SPIR-V backend. The backend will produce the MLIR SPV dialect and make use of whatever legalization/binary emission/etc. it provides.
This way, the SPIR-V LLVM backend would be a (probably tiny) wrapper around MLIR SPV. Then the majority of the work would focus on MLIR SPV (e.g. adding support for the OpenCL environment in addition to the existing Vulkan compute).
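Under that assumption, the two entry points would share the final stages:

```
(1) MLIR path:    front-end dialects --(dialect conversion)--> MLIR SPV dialect
(2) LLVM-IR path: LLVM-IR + SPIR-V intrinsics/metadata --(SPIR-V backend)--> MLIR SPV dialect

both then:        MLIR SPV dialect --(serialize)--> SPIR-V binary
```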

From the implementation point of view, that would bring us huge re-use. Still, from the design point of view, we would need to maintain two 'GPU-centric' representations: LLVM-IR with SPIR-V intrinsics/metadata/attributes, plus the MLIR SPV dialect.
Still, that would be a much better situation from the community point of view.

This sort of problem seems like just one of those unfortunate consequences of MLIR being effectively an “LLVM IR 2.0 – Generic Edition”, but not yet actually layered underneath LLVM where it really wants to be.

I don’t understand what you mean here with “layered underneath LLVM”? Can you elaborate on this?

That ultimately the goal should be for LLVM IR to be a dialect of MLIR, and for much of the optimization and codegen processes in LLVM to be implemented as MLIR dialect lowerings. Then MLIR would be foundational - "layered" underneath LLVM's core - and LLVM would have a hard dependency on MLIR.

At that point, SPIR-V as an MLIR dialect, and the SPIR-V backend doing MLIR dialect lowering would be effectively no different from how every target works – just with a different output dialect.

I think it doesn't really make sense to tie this project to those long-term goals of layering MLIR under LLVM-IR, given the extremely long timescale on which that is likely to happen. The "proper" solution probably won't be possible any time soon.

I’m not sure if we’re talking about the same thing here: there is nothing that I suggest that would operate at the level of LLVM IR. And nothing that requires a “long timescale”, it seems quite easily in scope to me here.

So, in the meantime, we could implement a special-case hack just for SPIRV, to enable lowering it to MLIR-SPIRV dialect. But, what’s the purpose? It wouldn’t really help move towards the longer term goal, I don’t think? And if someone does need that at the moment, they can just feed the SPIRV binary format back into the existing MLIR SPIRV dialect, right?

Do we want to maintain, in the LLVM monorepo, two different implementations of a SPIRV IR and the associated serialization (and potential deserialization)? All the tools to manipulate it? I assume the backend may even want to implement optimization passes; are we going to duplicate these as well?
(Note that this isn't at the LLVM IR level, but post-instruction selection, so it is very ad-hoc to the backend anyway.)

Quite possibly yes. It’s unfortunate to have duplication, but given the current state of things, I think it should not be ruled out.

My inclination is that the following factors are likely to be true:

  • The amount of code for SPIRV binary format serialization is not particularly large or tricky.

  • The work to emit SPIR-V MLIR dialect from the LLVM SPIR-V backend will not be simpler than serializing to SPIR-V directly.

  • Writing this custom code to emit the SPIR-V MLIR dialect from the SPIR-V backend will not noticeably further the longer-term goal of having LLVM core be implemented as MLIR dialect lowering.

It seems to me that the choice here is either writing new code in LLVM to emit the SPIR-V MLIR dialect from the GlobalISel SPIR-V backend, or new code in LLVM to emit SPIR-V directly. And while I find the long-term prospects of MLIR integration into LLVM extremely promising, using MLIR just as a stepping stone to MLIR SPIR-V serialization does not seem particularly interesting.

So, to me the interesting question is whether we'd expect to do something interesting after converting to the SPIR-V MLIR dialect form, besides simply serializing to the SPIR-V binary format - something that would make the added complexity of serializing through MLIR seem more worthwhile. I guess I'm not immediately seeing this as likely to be the case, but it seems well worth further discussion.

A possibility you’ve mentioned is post-instruction-selection optimizations. Do you have something in particular in mind there?

So my proposal is to keep two paths. They are complementary. I am aware of the maintenance-cost concerns - but while there are use cases for both, it is still worth it. When the world stops using LLVM-IR, it will die silently, and so will the SPIR-V backend - but that is a natural software lifecycle :wink:

I agree with this viewpoint. Having a backend like this (I hesitate to just call it a "SPIR-V backend" because it is a more specific thing than that) would bring a level of utility and parity between GPU targets that seems to have a lot of value. There may be a future convergence between the LLVM-IR and MLIR approaches, but I think we need to see and do more to get there. As long as there is enough community support to build and maintain this backend, it seems like a good thing to have in the repo, and having it developed there will help us see more options for unification over time.

I'm also personally supportive of "sweating it out" on this RFC and the subsequent discussions, exploring technical options for better unification that we may have missed earlier on, because these two parts of the community have been fairly isolated. Those are good conversations that we should have and continue. Unless an obvious simplification emerges as part of that, though, by default I would support moving forward with this proposal once the details are converged.

It’d be quite a good outcome, imo, to exit these discussions with a sketch of a plan for how these things could converge in the future, but I think we will find that future to be a ways out and require more practical first steps.

There's definitely some consensus, or even a roadmap/timeline, missing on this transition IMO :slight_smile: And please forgive my possibly naive question, but is there any way now to conveniently incorporate the MLIR flow into projects that are based on the good old clang -> llvm -> mir -> machine-code way? I understand we have the 'llvm' dialect, and I recall last year there was a talk about a common C/C++ dialect, but it isn't public yet, is it?

Not that I am aware of, but I haven't followed the most recent developments either! We're very interested in looking into this, though :slight_smile:

FYI, there was this thread about CIL last month:
https://lists.llvm.org/pipermail/cfe-dev/2021-February/067654.html
It didn't get a lot of traction yet, but I think this topic could become more interesting to the community in the future. I am not sure the IR generation in Clang can be completely replaced, but CIL could certainly start as an experimental alternative format that Clang could emit. I do believe it will take a while until it catches up with the quality and functionality that the current IR generation provides.

Speaking on behalf of the OpenCL community, which will greatly benefit from SPIR-V generation for OpenCL kernel languages in the LLVM project: having an LLVM backend would improve the user experience and facilitate many other improvements in Clang and LLVM for OpenCL devices that are now blocked on the unavailability of a common target that vendors without an in-tree LLVM backend can contribute to. The backend will also be the easiest way to integrate with the Clang frontend, because it is the conventional route.

I do acknowledge that MLIR can provide many benefits to the OpenCL community; however, on the Clang side a different approach than what is available right now would probably make more sense. It would be better to go via CIL, i.e. Clang would emit the CIL flavor of MLIR, which would then get converted to the SPIR-V flavor of MLIR, bypassing LLVM IR. This would be a preferable route, but in order to support CIL for OpenCL we would need support for C/C++ first, as OpenCL is just a thin layer on top of the core languages that Clang supports. So this is not something we can target at the moment, because we will likely depend on the will and the time investment of the C/C++ community for that.

Cheers,
Anastasia