[RFC] Upstreaming LLVM/SPIR-V converter

The Khronos Group SPIR WG is working on a bidirectional converter between LLVM bitcode and the SPIR-V binary format (https://www.khronos.org/registry/spir-v/specs/1.0/SPIRV.pdf) and is willing to upstream it to the LLVM project.

The converter uses a common in-memory representation of SPIR-V. It works by breaking a module down into instructions, constructing the translated module in memory, and then writing it out. It currently supports the common SPIR-V instructions and the OpenCL-specific instructions. Support for other languages is under consideration.
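
As a rough illustration of "construct the module in memory, then output it", here is a minimal Python sketch. The `Module` and `Instruction` classes are hypothetical and are not the converter's actual data structures. The magic number, the five-word header, the word-count/opcode packing, and the numeric values for OpCapability, OpMemoryModel, Kernel, Physical64 and OpenCL are taken from the SPIR-V 1.0 specification as I understand it.

```python
import struct

SPIRV_MAGIC = 0x07230203        # magic number from the SPIR-V specification
SPIRV_VERSION_1_0 = 0x00010000  # version 1.0

# Opcode values assumed from the SPIR-V 1.0 specification.
OP_MEMORY_MODEL = 14
OP_CAPABILITY = 17


class Instruction:
    """One instruction held in memory as an opcode plus operand words."""

    def __init__(self, opcode, operands):
        self.opcode = opcode
        self.operands = list(operands)

    def words(self):
        # First word: total word count in the high 16 bits, opcode in the low 16.
        word_count = 1 + len(self.operands)
        return [(word_count << 16) | self.opcode] + self.operands


class Module:
    """Hypothetical in-memory module: a header plus a flat instruction list."""

    def __init__(self, bound):
        self.bound = bound  # every id used in the module must be below this
        self.instructions = []

    def add(self, opcode, *operands):
        self.instructions.append(Instruction(opcode, operands))

    def to_words(self):
        # Header: magic, version, generator id (0 here), id bound, reserved 0.
        words = [SPIRV_MAGIC, SPIRV_VERSION_1_0, 0, self.bound, 0]
        for inst in self.instructions:
            words.extend(inst.words())
        return words

    def to_bytes(self):
        words = self.to_words()
        return struct.pack("<%dI" % len(words), *words)


module = Module(bound=1)
module.add(OP_CAPABILITY, 6)       # Capability Kernel
module.add(OP_MEMORY_MODEL, 2, 2)  # Physical64 addressing, OpenCL memory model
binary = module.to_bytes()
```

The reader direction would walk the same word stream, using each instruction's word count to find the next one.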

We plan to refactor the LLVM to SPIR-V converter as a backend at llvm/lib/Target/SPIRV to allow Clang to target SPIR-V. Since this will result in an unconventional backend that does not use SelectionDAG/MC, we would like to know whether that is acceptable. We are open to the SelectionDAG/MC approach if the community recommends it.

For the SPIR-V to LLVM converter, we are seeking suggestions on its proper location in the LLVM project.

Any comments are welcome. Thanks.

Yaxun Liu
AMD

We plan to refactor the LLVM to SPIR-V converter as a backend at llvm/lib/Target/SPIRV to allow Clang to target SPIR-V. Since this will result in an unconventional backend that does not use SelectionDAG/MC, we would like to know whether that is acceptable. We are open to the SelectionDAG/MC approach if the community recommends it.

I believe that the ‘how to write a backend’ documentation recommends against using SelectionDAG for generating code for other virtual instruction sets. I don’t think that there’s any benefit to forcing a back end to use the generic infrastructure, unless it makes sense for that back end to do so.

For the SPIR-V to LLVM converter, we are seeking suggestions on its proper location in the LLVM project.

To me, this is no different from any other front end, so should probably live in a separate repository (though ideally an LLVM-hosted one that is integrated with buildbots and kept up to date).

David

Honestly, SPIR-V seems a little bit more like a quirky program
serialization format. It's not a source language or an ISA. It might be
better to treat it like the bitcode reader/writer and have both in one
place. Something like lib/Target/SPIRV/(Writer|Reader)?

Hey Reid,

(Donning my Khronos hat here) - it would make sense to keep the code together within a SPIRV backend, in that there are many helper constructs shared by both the reader and the writer. We realised, though, that this would be a non-standard thing to do in terms of LLVM as it stands (e.g. we would have a backend that also has code that can consume SPIR-V and spit out LLVM IR!), so I am happy that you have suggested it :)

-Neil.

+1 to lib/Target/SPIRV/(Reader|Writer)

I really like this idea. I’ve talked with some people on both the LLVM and Khronos sides and I really think adding SPIR-V support to LLVM as an optional program serialization format would be fantastic. I think it would make it even easier for LLVM-based tools to be integrated into GPU authoring and execution pipelines.

I’m really excited to see this moving forward!

-Chris

Agreed. Unlike HSAIL, SPIR-V looks like a very sane design and having an in-tree serialisation format that’s stable across LLVM versions is likely to make SPIR-V useful for a number of things outside of its initial scope.

David

I want to point out that being stable across LLVM versions means that LLVM
will be able to express things that SPIRV cannot. Any new IR feature we add
to LLVM may not be expressible in SPIRV. EH landingpads, for example, are
probably not supported by SPIRV and must be stripped or rejected.

So, the process of writing SPIRV will strip, lower, or reject some LLVM IR
bits. However, roundtripping IR through SPIRV should probably be
idempotent. The first translation will change the program, but repeating
the roundtrip should produce the same IR and SPIRV from then on. If that's
achievable, then I agree, this is definitely a serialization format and not
something lower-level (x86) or higher-level (C).
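
The idempotence property described above can be stated as a small check. The following Python sketch uses a toy model in which a "module" is just a list of instruction names. The `write_spirv` and `read_spirv` functions are hypothetical stand-ins, not the converter's API, and "landingpad" stands in for any IR feature the serialized form cannot express.

```python
# Toy set of IR features the serialized format cannot express (an assumption).
UNSUPPORTED = {"landingpad"}


def write_spirv(ir):
    """Toy writer: strip unsupported instructions, encode the rest as records."""
    return ["Op:" + op for op in ir if op not in UNSUPPORTED]


def read_spirv(spirv):
    """Toy reader: decode each record back into an IR instruction name."""
    return [record[len("Op:"):] for record in spirv]


def roundtrip(ir):
    return read_spirv(write_spirv(ir))


original = ["add", "landingpad", "ret"]
once = roundtrip(original)   # the first trip changes the program
twice = roundtrip(once)      # later trips should change nothing
```

Here `once` differs from `original` because the landingpad was stripped, while `twice == once` and `write_spirv(once) == write_spirv(twice)`. That is the fixed point one would want a real round-trip test to assert.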

Sounds useful. :)

I’m not that familiar with SPIRV, but if it really is a serialization format, then why isn’t it a parallel to llvm/lib/Bitcode?

-Chris

+1 to llvm/lib/SPIRV, parallel to llvm/lib/Bitcode, and to adding LLVM
intrinsics for SPIR-V.

Thanks all for the very helpful suggestions.

In a sense, SPIR-V is like an alternative binary format for LLVM, since LLVM bitcode can be converted to SPIR-V and vice versa.

On the other hand, SPIR-V is like a target, since it can be consumed by OpenCL and Vulkan platforms.

I am thinking that the bidirectional conversion functionality could be kept at llvm/lib/Bitcode/SPIRV, which would make it easier for OpenCL vendors to convert between LLVM and SPIR-V. Separately, we would create llvm/Target/SPIR-V, which uses llvm/lib/Bitcode/SPIRV to generate SPIR-V. The SPIR-V target would allow Clang and other LLVM front ends to target generic OpenCL/Vulkan platforms.

Sam

It looks like Clang and other LLVM front ends would generate LLVM IR for
SPIR-V, and the llvm/Target/SPIR-V backend would consume it. I wonder whether
SPIR-V needs its own optimizations or something like that? If SPIR-V
needs them, +1 for llvm/Target/SPIR-V. But if not, why not directly use
llvm/lib/Bitcode/SPIRV to write the SPIR-V binary from front ends?

I don’t think Chris was suggesting lib/Bitcode/SPIRV. That wouldn’t make a lot of sense, since Bitcode != SPIR-V.

FWIW, I agree with Chris that it makes sense as a parallel to lib/Bitcode. SPIR-V is (almost) an alternate encoding of LLVM IR, with a few things added/removed, and it makes sense to treat it in a similar manner to our normal serialization format.

—Owen

I agree :)

I also think it should be something like lib/SPIRV.
And like Chris B., I’m also very interested in seeing it go in! :)

Marcello

I'd like to add that SPIR-V also defines data types and operations
which have no native representation in LLVM IR. For instance, Clang
generates an opaque structure type to represent OpenCL C image types. It
might be defined differently by a Vulkan compiler, an OpenCL C++ compiler,
or some other LLVM-based compiler front end.

The same is true for the extended instruction set: e.g. SPIR-V has a sin
instruction intended to represent the 'sin' built-in function, which is
implemented in LLVM IR as a function call:

call float _Z3sinf (float ) ; OpenCL C or GLSL - 'sin' must be overloadable.
call float _ZN2cl3sinEf (float ) ; OpenCL C++ - 'sin' must be in the cl namespace.
call float sinf (float ) ; some C-like language that does not require function overloading

So we will either need some adapters that will prepare LLVM IR for
SPIR-V serialization or define 'standard' LLVM IR constructs to
represent SPIR-V types and operations, which compiler front-ends will
generate.
Does it make sense to utilize LLVM intrinsics for SPIR-V extended instructions?
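
To make the "adapter" option concrete, here is a small Python sketch that maps the three callee spellings from the example above onto one builtin name. It is only an illustration. `builtin_name` and `EXTENDED_INSTRUCTIONS` are hypothetical, the parser covers only these simple Itanium-style manglings (`_Z<len><name>...` and `_ZN<len><ns><len><name>E...`), and it ignores the namespace and the parameter types.

```python
import re

# Hypothetical table of extended-instruction names the adapter recognizes.
EXTENDED_INSTRUCTIONS = {"sin", "cos"}


def builtin_name(symbol):
    """Return the unqualified builtin name for a callee symbol, or None."""
    if symbol.startswith("_ZN"):
        # Nested name: a run of <length><identifier> parts closed by 'E'.
        parts, rest = [], symbol[3:]
        while rest and rest[0] != "E":
            m = re.match(r"(\d+)", rest)
            if not m:
                return None
            n = int(m.group(1))
            start = m.end()
            parts.append(rest[start:start + n])
            rest = rest[start + n:]
        name = parts[-1] if parts else None
    elif symbol.startswith("_Z"):
        m = re.match(r"_Z(\d+)", symbol)
        if not m:
            return None
        n = int(m.group(1))
        name = symbol[m.end():m.end() + n]
    else:
        # Unmangled C name such as sinf: strip the float-variant suffix.
        name = symbol[:-1] if symbol.endswith("f") else symbol
    return name if name in EXTENDED_INSTRUCTIONS else None
```

With this sketch, `builtin_name("_Z3sinf")`, `builtin_name("_ZN2cl3sinEf")` and `builtin_name("sinf")` all yield `"sin"`, while an unrelated symbol such as `printf` yields `None`. The alternative raised above, a standard set of intrinsics emitted by every front end, would make this kind of name matching unnecessary.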

I feel SPIR-V is not really a backend, because it can be
translated to different targets (such as NVPTX, R600, or other GPUs).
But SPIR-V is not LLVM IR either, because SPIR-V has some things LLVM IR
doesn't have and removes some things from LLVM IR.

So I think SPIR-V is another IR that sits between the Clang front end
and LLVM IR,
like C# IR or Java bytecode does.
So: clang -> SPIR-V -> LLVM-IR -> Backends.
That is:

              C# MSIL
Front end ->  Java bytecode    -> LLVM-IR -> Backends
              SPIR-V
              Other bytecodes

I have no objection to llvm/lib/SPIRV if the community recommends it.

For OpenCL types and builtin functions, the converter currently assumes the LLVM bitcode follows the SPIR 1.2/2.0 metadata format and the Itanium C++ ABI (IA64) name mangling scheme. I see the benefits of using intrinsics to represent OpenCL builtin functions, but we need to sort out some implementation details and the potential impact on the OpenCL builtin library.

Sam

I don’t think Chris was suggesting lib/Bitcode/SPIRV. That wouldn’t make
a lot of sense, since Bitcode != SPIR-V.

FWIW, I agree with Chris that it makes sense as a parallel to
lib/Bitcode. SPIR-V is (almost) an alternate encoding of LLVM IR, with a
few things added/removed, and it makes sense to treat it in a similar
manner to our normal serialization format.

From an end-user's perspective, though, it sounds like the use case for SPIR-V
is a lot more similar to a target. E.g. the user is
notionally telling clang "target SPIR-V" (including doing any IR
optimizations, any special "codegenprepare" passes, etc.), rather
than "act like you're targeting X, but -emit-llvm/-emit-spirv instead"
(which is what I imagine from a component purely in lib/SPIRV).

So it sounds like having some "target-like" component for SPIR-V will be
necessary to ease integration with regular clang flow. It might be that we
have generic SPIR-V support factored out into lib/SPIRV, and then
lib/Target/SPIRV for the logic that integrates with clang.

-- Sean Silva

I agree. I haven’t looked into the SPIRV spec, but if it’s only intended to support GPU graphics/compute workloads then I doubt it supports everything the IR bitcode/ll code supports. That makes it feel more like a target to me.

That is, is it appropriate to add lib/SPIRV as a serialization format when it doesn’t support all of LLVM IR? Targets can crash if they are fed code they don’t support, but I don’t like the sound of the file emitter crashing because the IR contained a function pointer.
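
One way to avoid the crash scenario is for the writer to check the module first and fail with a diagnostic. The following Python sketch only illustrates that idea: the construct names, `check_serializable`, and `write_module` are all hypothetical, and a real implementation would walk actual LLVM IR.

```python
class UnsupportedConstruct(Exception):
    """Raised when the IR uses something the serialized format cannot express."""


# Hypothetical names for constructs the toy writer cannot express.
UNSUPPORTED = {"function_pointer", "landingpad"}


def check_serializable(constructs):
    """Return one diagnostic string per unsupported construct found."""
    return ["unsupported construct: " + c for c in constructs if c in UNSUPPORTED]


def write_module(constructs):
    """Serialize only if the check passes; otherwise fail with a clear error."""
    problems = check_serializable(constructs)
    if problems:
        raise UnsupportedConstruct("; ".join(problems))
    return list(constructs)  # stand-in for the real encoded output
```

A caller can run `check_serializable` up front to report every problem at once, or simply call `write_module` and handle `UnsupportedConstruct`.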

BTW, I’m not arguing against SPIRV in LLVM. I think it would be a great addition; we should just be careful to make sure that everyone is OK with it not supporting all IR constructs while still being a first-class serialization format.

Cheers,
Pete

    罗勇刚(Yonggang> I feel SPIR-V is not really a backend, because
    罗勇刚(Yonggang> it can be translated to different targets (such as
    罗勇刚(Yonggang> NVPTX, R600, or other GPUs). But SPIR-V is not LLVM
    罗勇刚(Yonggang> IR either, because SPIR-V has some things LLVM IR
    罗勇刚(Yonggang> doesn't have and removes some things from LLVM IR.

SPIR-V is also a target because its goal is to have a portable .o in the
Khronos realm (OpenCL, OpenGL...).

The NVPTX <-> LLVM IR path is interesting too. Combined with LLVM IR <-> SPIR-V,
it opens some easier gateways between OpenCL and CUDA, for example... :)

    罗勇刚(Yonggang> So I think SPIR-V is another IR that sits between
    罗勇刚(Yonggang> the Clang front end and LLVM IR, like C# IR or Java
    罗勇刚(Yonggang> bytecode does. So: clang -> SPIR-V -> LLVM-IR ->
    罗勇刚(Yonggang> Backends.

Even if we could imagine implementing it this way, it is unlikely to be
done this way, because it would mean duplicating a lot of code in Clang.

So it is more likely Clang -> LLVM-IR <-> SPIR-V

SPIR-V is a serialization format between the user’s frontend and the vendor’s backend. From the user’s perspective, it looks like a target. From the vendor’s perspective, it looks like a frontend. In this sense, it is very comparable to LLVM bitcode itself.

—Owen