[RFC][HIPSPV] Emitting HIP device code as SPIR-V

Hi all,

HIP is a C++ Runtime API and kernel language that allows developers to
create portable applications for AMD and NVIDIA GPUs from a single
source code [0]. There are also projects for running HIP code on Intel
GPU platforms via the Intel Level Zero API [1] called HIPLZ [3] and
HIPCL [2], which runs HIP programs in OpenCL devices with certain
advanced features supported. Both of these backends consume SPIR-V
binaries.

We are proposing a patch set to be upstreamed that enables SPIR-V
emission through the HIP code path. The end goal of the patches to be
submitted is to emit SPIR-V binaries from HIP device code so it can be
embedded into executables for OpenCL-like environments (at least for
starters). Our current focus is on the two above-mentioned projects,
HIPCL and HIPLZ which are both work-in-progress HIP
implementations. They itself do not consume SPIR-V, but the device
binaries are handed over to the OpenCL and Intel Level Zero APIs,
respectively.

Coarsely, the current process of translating the HIP code to SPIR-V in
LLVM/Clang involves:

* Retargeting HIP device code generation to the SPIR-V target.
* Mapping address spaces in HIP to corresponding ones in SPIR-V.
* Expanding HIP features, which can not be directly modeled in SPIR-V
  (e.g. dynamic shared memory).

The HIPSPIRV experimental branch is available at [4]. Note that it is
not yet in a state we intend to propose for upstreaming, but shaping
up the patches is a work in progress. Before proceeding to shape up
and submit the patches, we would like to get feedback for the plans we
have for upstreaming. In the following sections, we open up the above
points further and sketch our plans for changes to LLVM (mostly to the
Clang tool) to achieve the goal.

Retargeting device codegen

Hi Henry,

The HIPSPIRV experimental branch is available at [4]. Note that it is not yet in a state we intend to propose for upstreaming, but shaping up the patches is a work in progress. Before proceeding to shape up and submit the patches, we would like to get feedback for the plans we have for upstreaming. In the following sections, we open up the above points further and sketch our plans for changes to LLVM (mostly to the Clang tool) to achieve the goal.

Unfortunately I get 404 when I try access https://github.com/parmance/llvm-project/tree/hip2spirv-v5. Your llvm-project fork seems to be private.

Anyway, I see a lot of items listed in the email overlap with the SYCL implementation, which also targets SPIR-V execution environments. Even if we can't re-use the code as-is the approach for compiling GPU code to SPIR-V target can be common.

* Implement a new SPIRVTargetInfo and fill it with necessary
  information. For HIPCL/-LZ we are planning to adjust the address
  space mapping in a way which is discussed later in the ‘address
  space mapping’ section.

I'm not sure if new TargetInfo class is needed for that or it can achieved by adding HIP specific mapping in `adjust` method overload similar to how it's done for SYCL mode (https://github.com/llvm/llvm-project/blob/main/clang/lib/Basic/Targets/SPIR.h#L138).

* Introduce ’hipspv’ as an OS or environment type in Triple. The
primary and the current use of the type is to select device offload
toolchain for HIPCL/-LZ.

Can't we use HIPCL toolchain if `spir` or `spirv64` target is set? We have a new environment type for SYCL, but now I’m trying to remove it as there are no cases where it can't be avoided. More details are in this issue [1] and pull request if you are interested [2].

Since the SPIR-V BE might not land in LLVM soon, we will set up the compilation flow
to produce the SPIR-V binary by using the LLVM-SPIR-V translator [7]
which is used in our experimental branch.

+1. SYCL uses the same compilation flow. We have a patch adding declarations of SPIR-V built-ins compatible with LLVM-SPIRV-Translator, which follows the approach taken OpenCL built-ins and AFAIK is used by SPIR-V backend as well. I can upload it to Phabricator for review if needed.

HIP code expansion

Downstream SYCL implementation [3] supports all the features listed in this section except DSM allocations + quite a few other feature specific to SYCL and we are also doing post-link processing before SPIR-V translation. It would be very interesting to see how we can merge our implementations.

Thanks,
Alexey

[1] https://github.com/intel/llvm/issues/3534
[2] https://github.com/intel/llvm/pull/3929
[3] https://github.com/intel/llvm/tree/sycl

Hi Bader,

Hi Henry,

> The HIPSPIRV experimental branch is available at [4]. Note that it is not yet in a state we intend to propose for upstreaming, but shaping up the patches is a work in progress. Before proceeding to shape up and submit the patches, we would like to get feedback for the plans we have for upstreaming. In the following sections, we open up the above points further and sketch our plans for changes to LLVM (mostly to the Clang tool) to achieve the goal.

Unfortunately I get 404 when I try access https://github.com/parmance/llvm-project/tree/hip2spirv-v5. Your llvm-project fork seems to be private.

Ah, sorry - the repo is now public. Also, we will soon start to
sending upstream-reshaped patches to the phabricator for review.

Anyway, I see a lot of items listed in the email overlap with the SYCL implementation, which also targets SPIR-V execution environments. Even if we can't re-use the code as-is the approach for compiling GPU code to SPIR-V target can be common.

> * Implement a new SPIRVTargetInfo and fill it with necessary
> information. For HIPCL/-LZ we are planning to adjust the address
> space mapping in a way which is discussed later in the ‘address
> space mapping’ section.

I'm not sure if new TargetInfo class is needed for that or it can achieved by adding HIP specific mapping in `adjust` method overload similar to how it's done for SYCL mode (https://github.com/llvm/llvm-project/blob/main/clang/lib/Basic/Targets/SPIR.h#L138).

Yeah, It seems there is no need to have separate TargetInfo for SPIR-V
because there is not going to be much difference between SPIR and
SPIR-V target info. We also alter the mapping by overloading the
adjust method.

> * Introduce ’hipspv’ as an OS or environment type in Triple. The
> primary and the current use of the type is to select device offload
> toolchain for HIPCL/-LZ.

Can't we use HIPCL toolchain if `spir` or `spirv64` target is set? We have a new environment type for SYCL, but now I’m trying to remove it as there are no cases where it can't be avoided. More details are in this issue [1] and pull request if you are interested [2].

We can drop the ‘hipspv’ component for starters. The initial thought
was we need it to select the HIPSPVToolChain but there is another
approach in sight already that does not need the component.

However, we might need the component in the future: I’m thinking of
the upstreaming of HIP expander passes which may need to be run only
if we are targeting the HIPCL/-LZ environment. But perhaps we can
leave another kind of mark in the bitcode for the passes.

Also, we will soon start to sending upstream-reshaped patches to the phabricator for review.

Please, add me (@bader) to those review requests.

Will do.

[AMD Public Use]

+ Artem for awareness and comments.

Overall I support this proposal.

My comments are below.

Thanks.

Sam

[AMD Public Use]

+ Artem for awareness and comments.

Overall I support this proposal.

Same here. Since I want to reuse generic parts for OpenMP offloading we should keep
the non-HIP parts generic :slight_smile:

~ Johannes

Hi Henry,

Since the SPIR-V BE might not land in LLVM soon, we will set up the compilation flow
to produce the SPIR-V binary by using the LLVM-SPIR-V translator [7]
which is used in our experimental branch.

Can you provide more details regarding this? Do you plan to integrate the
translator as an external tool?

Overall, there seem to be a huge overlap with what we need for OpenCL so it would
be good to make sure we are aligned and the new functionality is reusable for OpenCL
too.

Cheers,
Anastasia

Hi Anastasia,

Hi Henry,

> Since the SPIR-V BE might not land in LLVM soon, we will set up the compilation flow
> to produce the SPIR-V binary by using the LLVM-SPIR-V translator [7]
> which is used in our experimental branch.

Can you provide more details regarding this? Do you plan to integrate the
translator as an external tool?

The intention is to use the SPIRV-LLVM translator as a tool outside
LLVM: either the tool is found in PATH or an error is emitted instead.
Since I’m assuming that the new SPIR-V BE will eventually land on LLVM
and supersede the translator, I don’t see much motivation for cleaner
integration of it to the LLVM project.

Overall, there seem to be a huge overlap with what we need for OpenCL so it would
be good to make sure we are aligned and the new functionality is reusable for OpenCL
too.

Sure. I’m not so familiar with the OpenCL infra in the LLVM currently,
so can you elaborate on any major overlap that OpenCL and HIPSPV have?
We are planning to start submitting patches for review, perhaps it’s
easier to point the overlaps on per-patch basis in the review system
then?

Hi Henry,

Just to provide some background - we had a discussion about the integration
of SPIRV-LLVM Translator some years back [1] and here is the design of our
user interface that has been proposed at that time [2]. I appreciate we might
not be able to unify the interfaces completely but it would make sense to
provide common mechanisms for different languages to use SPIR-V even if it
might not be achievable straight away we should aim for it as an end goal.

Considering that there seems to be a lot of interest in this from different
language communities, can we clarify the full plan? My understanding is that
you are proposing to add the translator temporarily and it will be replaced by
the backend in the future. How do you envision this transition? Do you plan to
provide command-line options for the translator to be used that would be
deprecated at some point later or would they be added as temporary from the
start?

Another consideration is that we have invested quite a lot of effort in the
alternative approach i.e using the SPIR-V backend because this was highlighted
as the best viable approach for SPIR-V support in Clang and LLVM when we
had our discussion some years back. The situation is likely different now and
your proposal isn�t identical, also we haven�t made a lot of progress with the
backend yet. However, the integration of alternative SPIR-V translation might
negatively impact the adoption of the backend. It might also result in either
suboptimal design flow or code duplication in Clang. For example, we might need
to redesign the OpenCL builtins representation and mapping to SPIR-V
instructions.

Considering that we might not be too far from integrating the backend into the
LLVM, would it be reasonable to synchronize with the backend developers and
see if the backend could be used straight away? I am looping in Konrad here
who has been discussing the backend integration earlier this year [3]. Perhaps
he can provide some insights about the backend work and the timeline for it.
Maybe you could start working on some parts that are not related to IR
consumption first and then add the SPIR-V emission later on and hopefully, the
timing can align with the backend work too.

[1] [llvm-dev] [RfC] A proposal of adding SPIR-V Toolchain in Clang
[2] https://github.com/KhronosGroup/SPIRV-LLVM-Translator/wiki/SPIRV-Toolchain-for-Clang
[3] https://lists.llvm.org/pipermail/llvm-dev/2021-March/148905.html

Cheers,
Anastasia

Hi Anastasia,

Hi Henry,

Just to provide some background - we had a discussion about the integration
of SPIRV-LLVM Translator some years back [1] and here is the design of our
user interface that has been proposed at that time [2]. I appreciate we might
not be able to unify the interfaces completely but it would make sense to
provide common mechanisms for different languages to use SPIR-V even if it
might not be achievable straight away we should aim for it as an end goal.

Considering that there seems to be a lot of interest in this from different
language communities, can we clarify the full plan? My understanding is that
you are proposing to add the translator temporarily and it will be replaced by
the backend in the future. How do you envision this transition? Do you plan to
provide command-line options for the translator to be used that would be
deprecated at some point later or would they be added as temporary from the
start?

Our planned HIP-SPIR-V tool chain calls the command-line tool
‘llvm-spirv’ for translating LLVM IR to SPIR-V, which is then embedded
in the HIP binary as a byte array. The call to the llvm-spirv tool is
used as a temporary solution until the SPIR-V backend lands in the
LLVM code base in the future. When the LLVM SPIR-V backend is usable
in the upstream repo, we plan to simply switch to calling the LLVM’s
internal ‘llc’ code generator tool for generating the SPIR-V instead
of llvm-spirv. Can you immediately spot problems with that approach?

We believe this is the best solution to integrate with the toolchain
infrastructure. Another would be to call the code generator at LLVM
API level, but it seems out of place for the toolchain framework: The
SPIR-V code generation path is not exposed so that clang frontends
could use it to emit SPIR-V for themselves.

Another consideration is that we have invested quite a lot of effort in the
alternative approach i.e using the SPIR-V backend because this was highlighted
as the best viable approach for SPIR-V support in Clang and LLVM when we
had our discussion some years back. The situation is likely different now and
your proposal isn�t identical, also we haven�t made a lot of progress with the
backend yet. However, the integration of alternative SPIR-V translation might
negatively impact the adoption of the backend. It might also result in either
suboptimal design flow or code duplication in Clang. For example, we might need
to redesign the OpenCL builtins representation and mapping to SPIR-V
instructions.

Considering that we might not be too far from integrating the backend into the
LLVM, would it be reasonable to synchronize with the backend developers and
see if the backend could be used straight away? I am looping in Konrad here
who has been discussing the backend integration earlier this year [3]. Perhaps
he can provide some insights about the backend work and the timeline for it.
Maybe you could start working on some parts that are not related to IR
consumption first and then add the SPIR-V emission later on and hopefully, the
timing can align with the backend work too.

[1] [llvm-dev] [RfC] A proposal of adding SPIR-V Toolchain in Clang
[2] https://github.com/KhronosGroup/SPIRV-LLVM-Translator/wiki/SPIRV-Toolchain-for-Clang
[3] [llvm-dev] [RFC] Upstreaming a proper SPIR-V backend

Cheers,
Anastasia
________________________________
From: Henry Linjam�ki <henry.linjamaki@parmance.com>
Sent: 17 August 2021 17:16
To: Anastasia Stulova <Anastasia.Stulova@arm.com>
Cc: cfe-dev@lists.llvm.org <cfe-dev@lists.llvm.org>; llvm-dev@lists.llvm.org <llvm-dev@lists.llvm.org>; yaxun.liu@amd.com <yaxun.liu@amd.com>
Subject: Re: [llvm-dev] [RFC][HIPSPV] Emitting HIP device code as SPIR-V

Hi Anastasia,

>
> Hi Henry,
>
> > Since the SPIR-V BE might not land in LLVM soon, we will set up the compilation flow
> > to produce the SPIR-V binary by using the LLVM-SPIR-V translator [7]
> > which is used in our experimental branch.
>
> Can you provide more details regarding this? Do you plan to integrate the
> translator as an external tool?
>
The intention is to use the SPIRV-LLVM translator as a tool outside
LLVM: either the tool is found in PATH or an error is emitted instead.
Since I�m assuming that the new SPIR-V BE will eventually land on LLVM
and supersede the translator, I don�t see much motivation for cleaner
integration of it to the LLVM project.

> Overall, there seem to be a huge overlap with what we need for OpenCL so it would
> be good to make sure we are aligned and the new functionality is reusable for OpenCL
> too.
>
Sure. I�m not so familiar with the OpenCL infra in the LLVM currently,
so can you elaborate on any major overlap that OpenCL and HIPSPV have?
We are planning to start submitting patches for review, perhaps it�s
easier to point the overlaps on per-patch basis in the review system
then?

> Cheers,
> Anastasia
>
>
> ________________________________
> From: llvm-dev <llvm-dev-bounces@lists.llvm.org> on behalf of Henry Linjam�ki via llvm-dev <llvm-dev@lists.llvm.org>
> Sent: 09 August 2021 07:57
> To: cfe-dev@lists.llvm.org <cfe-dev@lists.llvm.org>
> Cc: llvm-dev@lists.llvm.org <llvm-dev@lists.llvm.org>; yaxun.liu@amd.com <yaxun.liu@amd.com>
> Subject: [llvm-dev] [RFC][HIPSPV] Emitting HIP device code as SPIR-V
>
> Hi all,
>
> HIP is a C++ Runtime API and kernel language that allows developers to
> create portable applications for AMD and NVIDIA GPUs from a single
> source code [0]. There are also projects for running HIP code on Intel
> GPU platforms via the Intel Level Zero API [1] called HIPLZ [3] and
> HIPCL [2], which runs HIP programs in OpenCL devices with certain
> advanced features supported. Both of these backends consume SPIR-V
> binaries.
>
> We are proposing a patch set to be upstreamed that enables SPIR-V
> emission through the HIP code path. The end goal of the patches to be
> submitted is to emit SPIR-V binaries from HIP device code so it can be
> embedded into executables for OpenCL-like environments (at least for
> starters). Our current focus is on the two above-mentioned projects,
> HIPCL and HIPLZ which are both work-in-progress HIP
> implementations. They itself do not consume SPIR-V, but the device
> binaries are handed over to the OpenCL and Intel Level Zero APIs,
> respectively.
>
> Coarsely, the current process of translating the HIP code to SPIR-V in
> LLVM/Clang involves:
>
> * Retargeting HIP device code generation to the SPIR-V target.
> * Mapping address spaces in HIP to corresponding ones in SPIR-V.
> * Expanding HIP features, which can not be directly modeled in SPIR-V
> (e.g. dynamic shared memory).
>
> The HIPSPIRV experimental branch is available at [4]. Note that it is
> not yet in a state we intend to propose for upstreaming, but shaping
> up the patches is a work in progress. Before proceeding to shape up
> and submit the patches, we would like to get feedback for the plans we
> have for upstreaming. In the following sections, we open up the above
> points further and sketch our plans for changes to LLVM (mostly to the
> Clang tool) to achieve the goal.
>
> Retargeting device codegen
> ==========================
>
> For making the HIP toolchain to emit and embed SPIR-V we are
> tentatively planning the following changes to the LLVM/Clang:
>
> * Introduce, at minimum, a 'spirv64' architecture type in Triple. This
> is what the SPIR-V backend [5] (SPIR-V BE) effort is planning to
> upstream. We would like to upstream this change in advance to
> specify the HIP SPIR-V device code target, potentially before the
> SPIR-V BE work lands.
>
> * Implement a new SPIRVTargetInfo and fill it with necessary
> information. For HIPCL/-LZ we are planning to adjust the address
> space mapping in a way which is discussed later in the �address
> space mapping� section.
>
> * Introduce a clang option to override the HIP device code target. We
> are interested in the option �--offload=<target>� discussed in the
> 'Unified offload option for CUDA/HIP/OpenMP'-thread [6]. This option
> would suit this use case well. As far as we know, the subject has
> not advanced further from the discussion - is anyone working on it?
>
> * Compilation driver:
>
> HIP offload builder is changed to retrieve the offload device target
> from the --offload option. If it is not present, it can fall back to
> AMD's default target for avoiding changing the current default HIP
> compilation behavior.
>
> Temporarily change Driver to force clang to emit LLVM bitcode for
> SPIR-V targets in the backend compilation phase. Otherwise, the
> compilation will fail due to the lack of the real SPIR-V BE in many
> parts of the code. Reworked HIPToolChain takes care of translating
> the bitcode to SPIR-V during the linking phase. When the SPIR-V BE
> lands in LLVM, we can revert this change.
>
> * Introduce �hipspv� as an OS or environment type in Triple. The
> primary and the current use of the type is to select device offload
> toolchain for HIPCL/-LZ.
>
> * Implement a new toolchain class 'HIPSPVToolChain' in clang which is
> selected when the HIP device target is specified to be
> �spirv64-unknown-hipspv� with the --offload option. Since the SPIR-V
> BE might not land in LLVM soon, we will set up the compilation flow
> to produce the SPIR-V binary by using the LLVM-SPIR-V translator [7]
> which is used in our experimental branch.
>
> One important thing the toolchain does is to run one or several LLVM
> IR passes, which are needed by the HIPCL/LZ runtime, on the final
> fully linked device bitcode. The passes are required to be run
> during link time - all user specified device code and HIPCL/LZ
> device library routines have to be visible when the passes are
> run. The reason for the requirement is explained in the 'HIP code
> expansion' section. HIPSPVToolChain will use the opt tool for
> running the passes at link time.
>
> * Currently, HIPToolChain is derived from ROCmToolchain and its long
> chain of super classes (AMDGPUToolChain, Generic_ELF and
> Generic_GCC). The new upstreamed target would not logically belong
> under the AMDGPU/ROCm family so it does not make sense to derive the
> HIPCL toolchain from the HIP toolchain. Therefore, we propose to:
>
> - Create a new base HIP tool chain, 'BaseHIPToolChain' or just
> 'HIPToolChain', derived directly from ToolChain and put any
> HIP-related code that is common or that can be reused in the
> derived toolchains there.
>
> - Derive a new HIPSPVToolChain from HIPToolChain.
>
> - Rebase the HIPToolChain under the HIPToolChain and rename it to
> HIPAMDToolChain. Since the current HIPToolChain depends on methods
> in the super classes (e.g. AMDGPUToolChain�s getParsedTargetID)
> the rebased class is planned to be a proxy class to avoid code
> duplication and to reduce the amount of changes. Another option to
> refactor the current HIPToolChain would be to use multiple
> heritance but that leads to dreaded diamond class structure which
> probably is not a great choice.
>
> With the current plan, HIPToolChain is not going to have much code
> to be shared with the derived classes - so far only a bit of the
> �fat binary� construction code is in sight for sharing, so the
> immediate gains for the effort seems small. However, The TC�s layout
> is more logical and it may spark more HIP implementations, as well
> as help refactoring when going forward.
>
>
> Address space mapping
> =====================
>
> Translating HIP device code to valid SPIR-V binary requires tweaks on
> pointers:
>
> Pointers without address space (AS) qualification in HIP programs are
> considered �flat� pointers - they can point to function local,
> __device__, __shared__ and __constant__ memory space dynamically,
> which matches the idea of �generic� pointers introduced in OpenCL
> 2.0. Therefore, the logical choice for the flat pointers is to map
> them to generic pointers of SPIR-V�s OpenCL environment. HIPCL�s and
> HIPLZ�s SPIR-V environment mandates that the kernel pointer parameters
> must point to __global, __local or __constant memory (these are named
> differently in SPIR-V; using OpenCL names as they are more
> familiar). So HIP pointer parameters in the HIP kernel (__global__)
> functions would be mapped to global pointers. Otherwise, HIP pointers
> with AS qualifiers are mapped to SPIR-V equivalent, if suitable.
>
> Now, there are significant differences between HIP�s __constant__ and
> SPIR-V/OpenCL�s constant address space:
>
> * In HIP, __constant__ globals can be altered on the host side with
> the hipMemcpyToSymbol() API function. In the OpenCL�s host API you
> cannot do this.
>
> (Side-note: OpenCL host API does not have an equivalent method for
> hipMemcpyToSymbol but HIPCL currently supports hipMemcpyToSymbol for
> the global __global variables via Intel�s
> clGetDeviceGlobalVariablePointerINTEL API extension, but we are
> planning to inject shadow kernel commands that access the global
> variables instead for portability.)
>
> * In HIP flat pointers can point to __constant__ memory. In OpenCL
> this is not the case with __generic pointers, which means __constant
> pointers cannot be casted to __generic pointers and vice versa.
>
> There are a couple ways to deal with constants:
>
> * Map __constant__ to __global space in SPIR-V. That way we can
> generate code that works and is simple to implement. Of course, we
> lose the optimization/placing benefits of constant memory.
>
> * Transform the code after clang codegen (by an LLVM pass) by
> converting the __constant objects to kernel arguments. This covers
> the hipMemcpyToSymbol() case. There is still the constant-to-generic
> cast issue, so we would have to use the previous point as the
> fallback.
>
> We plan to start by upstreaming the first option, and time permitting,
> improve by implementing the second option.
>
> The planned changes to Clang to achieve the aforementioned AS mapping
> are as follows:
>
> * Define address space mapping in the new, aforementioned
> SPIRVTargetInfo to map CUDA address spaces (which the HIP reuses) to
> do the mapping mentioned earlier. Default AS (0) used for the flat
> pointers are mapped to the SPIR-V�s �generic�. We intend this
> mapping being enabled when the language mode is HIP.
>
> * Change SPIRABIInfo to coerce kernel AS-unqualified pointer arguments
> to __global ones. Pointer arguments in regular device functions
> receive the __generic AS qualifier via the address space mapping
> defined in SPIRVTargetInfo in the above point.
>
>
> HIP code expansion
> ==================
>
> There are features in HIP language which do not have direct
> counterparts in SPIR-V�s OpenCL environment and those features need to
> be rewritten before translation to SPIR-V (in the future, lowering to
> SPIR-V machine code through the new BE). The non-exhaustive list of
> features that need to be expanded includes:
>
> * Dynamic shared memory allocation (DSM): It is an array which is
> declared globally in LLVM IR and its actual size determined at
> kernel launch. OpTypeRuntimeArray in SPIR-V is the closest thing to
> model this object, alas, it requires shader capability.
>
> * abort() builtin: No counterpart in SPIR-V/OpenCL.
> (Note: the behavior is not well specified in the HIP spec
> either. Assuming it terminates the whole grid if any work item
> reaches it. AMD�s abort definition calls __builtin_trap).
>
> * printf(): OpenCL�s printf takes the format string as �__constant__
> char*� while in HIP the format string does not have to reside in
> constant memory.
>
> * Texture objects. These roughly correspond to image and sampler
> objects of OpenCL combined. Also, texture objects carry more
> information for the texture functions than image+sampler objects do.
>
> * Texture references. Same as above but these are program global
> objects. In OpenCL, image objects cannot reside in the program
> global space.
>
> HIPCL/-LZ�s solution to the DSM allocation case is that the runtime
> allocates a shared buffer and passes it to the kernel as an additional
> argument (which is hidden from the user). The device code is modified
> so that the DSM object is replaced with the new kernel
> argument. Various other cases listed will be handled similarly:
>
> * For the printf case we tentatively replace the printf calls with a
> function that packs their arguments to an additional buffer passed
> as additional kernel argument and do the printing on the host side.
>
> * Texture objects will be tentatively split to image and sampler
> objects and possibly auxiliary struct to carry texture
> settings. This means at least that the kernel parameter listing
> needs to be rewritten for the Texture objects.
>
> * For the texture reference we tentatively planned replacing the
> global texture objects also with a number of additional kernel
> arguments.
>
> For this and other HIP features we need to apply LLVM IR passes to
> perform modifications on the device code. In many cases the passes
> should be run when the device code (as LLVM bitcode) is fully
> linked. This is simply achieved as the HIP offload mechanism already
> emits device code as LLVM bitcode in RDC mode (-fgpu-rdc), so during
> linking we do receive the device code as LLVM bitcode where to apply
> these expansions with full view of the device code.
>
> The current plan for implementing this is to make the HIPSPVToolChain
> to build a linker that uses llvm-link for linking device code, opt for
> running the IR passes needed and the external llvm-spirv tool (llc in
> the future when the SPIR-V BE lands) for emitting the SPIR-V
> binary. We load the passes from a path the user provides
> via --hip-link-pass-path (name pending) or automatically from HIP
> runtime�s installation location by using the search logic provided by
> ROCmInstallationDetector.
>
> There is interest in upstreaming the HIPCL/-LZ passes from the
> HIPCL/-LZ repositories in the future for reduced maintenance
> burden. However, we are not attempting to upstream them initially, as
> they are not yet completed and are subject to rapid changes. Question
> is: Where should the passes eventually be put in within the LLVM
> project tree? Could it be OK to add a new directory under Clang for
> tool chain passes?
>
>
> Testing
> =======
>
> We will provide llvm-lit tests for our toolchain in the upstream. We
> also want to add tests to make sure clang who will run the HIPCL/-LZ
> runtime passes get run at device code link time. For this we need a
> dummy pass plugin that the clang loads during the test.
>
> When the new LLVM SPIR-V BE work lands on LLVM, we will add SPIR-V
> assembly checks that are relevant for HIPSPV.
>
>
> References
> ==========
>
> [0]: HIP Programming Guide v4.5 — ROCm 4.5.0 documentation
> [1]: oneAPI Level Zero: 1.4.8 — Level Zero Specification documentation
> [2]: https://github.com/cpc/hipcl
> [3]: https://github.com/jz10/anl-gt-gpu
> [4]: https://github.com/parmance/llvm-project/tree/hip2spirv-v5
> [5]: https://github.com/KhronosGroup/LLVM-SPIRV-Backend
> [6]: [cfe-dev] [RFC] Unified offloading option for CUDA/HIP/OpenMP
> [7]: https://github.com/KhronosGroup/SPIRV-LLVM-Translator
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev@lists.llvm.org
> llvm-dev Info Page

--
BR,
Henry Linjam�ki

BR,
Henry and Pekka

Our planned HIP-SPIR-V tool chain calls the command-line tool
‘llvm-spirv’ for translating LLVM IR to SPIR-V, which is then embedded
in the HIP binary as a byte array. The call to the llvm-spirv tool is
used as a temporary solution until the SPIR-V backend lands in the?
LLVM code base in the future. When the LLVM SPIR-V backend is usable
in the upstream repo, we plan to simply switch to calling the LLVM’s
internal ‘llc’ code generator tool for generating the SPIR-V instead of llvm-spirv.
Can you immediately spot problems with that approach?

Since the translator is not part of the LLVM project do you plan to add any
command-line options to set its location or any other interface for this and
if so would this be advertised as temporary functionality? I imagine once

you switch to the backend such functionality would not be needed any
longer so I am wondering how do you see exactly the transition path…

And another aspect to clarify is the migration path i.e. what do you see as
criteria for changing to the backend and how soon could this happen?

Thanks,
Anastasia

> Our planned HIP-SPIR-V tool chain calls the command-line tool
> ‘llvm-spirv’ for translating LLVM IR to SPIR-V, which is then embedded
> in the HIP binary as a byte array. The call to the llvm-spirv tool is
> used as a temporary solution until the SPIR-V backend lands in the?
> LLVM code base in the future. When the LLVM SPIR-V backend is usable
> in the upstream repo, we plan to simply switch to calling the LLVM’s
> internal ‘llc’ code generator tool for generating the SPIR-V instead of llvm-spirv.
> Can you immediately spot problems with that approach?

Since the translator is not part of the LLVM project do you plan to add any
command-line options to set its location or any other interface for this and
if so would this be advertised as temporary functionality? I imagine once
you switch to the backend such functionality would not be needed any
longer so I am wondering how do you see exactly the transition path...

We believe it might be enough to locate the tool (llvm-spirv) in PATH
for now: A CLI option would be a nice addition, but perhaps not worth
it for a temporary solution - we expect the SPIR-V backend (and thus
llc) become usable sooner than later.

And another aspect to clarify is the migration path i.e. what do you see as
criteria for changing to the backend and how soon could this happen?

Calling the llvm-spirv in the PATH is not ideal: The tool’s version
might be too old or new with respect to the calling LLVM’s version
which might cause issues if there are incompatibilities with the LLVM
IR. So, we don’t want to keep using the tool longer than necessary,
and prefer to switch to the backend shipped with the LLVM installation
as soon as it lands upstream. I think the criterion for switching to
the BE instead of the llvm-spriv tool is when our internal HIP test
suite (which will be published soon) passes on the upstream SPIR-V BE.

Thanks,
Anastasia

________________________________
From: Henry Linjamäki <henry.linjamaki@parmance.com>
Sent: 23 August 2021 07:32
To: Anastasia Stulova <Anastasia.Stulova@arm.com>
Cc: cfe-dev@lists.llvm.org <cfe-dev@lists.llvm.org>; llvm-dev@lists.llvm.org <llvm-dev@lists.llvm.org>; yaxun.liu@amd.com <yaxun.liu@amd.com>; Trifunovic, Konrad <konrad.trifunovic@intel.com>
Subject: Re: [llvm-dev] [RFC][HIPSPV] Emitting HIP device code as SPIR-V

Hi Anastasia,

>
> Hi Henry,
>
> Just to provide some background - we had a discussion about the integration
> of SPIRV-LLVM Translator some years back [1] and here is the design of our
> user interface that has been proposed at that time [2]. I appreciate we might
> not be able to unify the interfaces completely but it would make sense to
> provide common mechanisms for different languages to use SPIR-V even if it
> might not be achievable straight away we should aim for it as an end goal.
>
> Considering that there seems to be a lot of interest in this from different
> language communities, can we clarify the full plan? My understanding is that
> you are proposing to add the translator temporarily and it will be replaced by
> the backend in the future. How do you envision this transition? Do you plan to
> provide command-line options for the translator to be used that would be
> deprecated at some point later or would they be added as temporary from the
> start?

Our planned HIP-SPIR-V tool chain calls the command-line tool
‘llvm-spirv’ for translating LLVM IR to SPIR-V, which is then embedded
in the HIP binary as a byte array. The call to the llvm-spirv tool is
used as a temporary solution until the SPIR-V backend lands in the
LLVM code base in the future. When the LLVM SPIR-V backend is usable
in the upstream repo, we plan to simply switch to calling the LLVM’s
internal ‘llc’ code generator tool for generating the SPIR-V instead
of llvm-spirv. Can you immediately spot problems with that approach?

We believe this is the best solution to integrate with the toolchain
infrastructure. Another would be to call the code generator at LLVM
API level, but it seems out of place for the toolchain framework: The
SPIR-V code generation path is not exposed so that clang frontends
could use it to emit SPIR-V for themselves.

>
> Another consideration is that we have invested quite a lot of effort in the
> alternative approach i.e using the SPIR-V backend because this was highlighted
> as the best viable approach for SPIR-V support in Clang and LLVM when we
> had our discussion some years back. The situation is likely different now and
> your proposal isn�t identical, also we haven�t made a lot of progress with the
> backend yet. However, the integration of alternative SPIR-V translation might
> negatively impact the adoption of the backend. It might also result in either
> suboptimal design flow or code duplication in Clang. For example, we might need
> to redesign the OpenCL builtins representation and mapping to SPIR-V
> instructions.
>
> Considering that we might not be too far from integrating the backend into the
> LLVM, would it be reasonable to synchronize with the backend developers and
> see if the backend could be used straight away? I am looping in Konrad here
> who has been discussing the backend integration earlier this year [3]. Perhaps
> he can provide some insights about the backend work and the timeline for it.
> Maybe you could start working on some parts that are not related to IR
> consumption first and then add the SPIR-V emission later on and hopefully, the
> timing can align with the backend work too.
>
> [1] [llvm-dev] [RfC] A proposal of adding SPIR-V Toolchain in Clang
> [2] https://github.com/KhronosGroup/SPIRV-LLVM-Translator/wiki/SPIRV-Toolchain-for-Clang
> [3] [llvm-dev] [RFC] Upstreaming a proper SPIR-V backend
>
> Cheers,
> Anastasia
> ________________________________
> From: Henry Linjam�ki <henry.linjamaki@parmance.com>
> Sent: 17 August 2021 17:16
> To: Anastasia Stulova <Anastasia.Stulova@arm.com>
> Cc: cfe-dev@lists.llvm.org <cfe-dev@lists.llvm.org>; llvm-dev@lists.llvm.org <llvm-dev@lists.llvm.org>; yaxun.liu@amd.com <yaxun.liu@amd.com>
> Subject: Re: [llvm-dev] [RFC][HIPSPV] Emitting HIP device code as SPIR-V
>
>
>
> Hi Anastasia,
>
> >
> > Hi Henry,
> >
> > > Since the SPIR-V BE might not land in LLVM soon, we will set up the compilation flow
> > > to produce the SPIR-V binary by using the LLVM-SPIR-V translator [7]
> > > which is used in our experimental branch.
> >
> > Can you provide more details regarding this? Do you plan to integrate the
> > translator as an external tool?
> >
> The intention is to use the SPIRV-LLVM translator as a tool outside
> LLVM: either the tool is found in PATH or an error is emitted instead.
> Since I�m assuming that the new SPIR-V BE will eventually land on LLVM
> and supersede the translator, I don�t see much motivation for cleaner
> integration of it to the LLVM project.
>
> > Overall, there seem to be a huge overlap with what we need for OpenCL so it would
> > be good to make sure we are aligned and the new functionality is reusable for OpenCL
> > too.
> >
> Sure. I�m not so familiar with the OpenCL infra in the LLVM currently,
> so can you elaborate on any major overlap that OpenCL and HIPSPV have?
> We are planning to start submitting patches for review, perhaps it�s
> easier to point the overlaps on per-patch basis in the review system
> then?
>
> > Cheers,
> > Anastasia
> >
> >
> > ________________________________
> > From: llvm-dev <llvm-dev-bounces@lists.llvm.org> on behalf of Henry Linjam�ki via llvm-dev <llvm-dev@lists.llvm.org>
> > Sent: 09 August 2021 07:57
> > To: cfe-dev@lists.llvm.org <cfe-dev@lists.llvm.org>
> > Cc: llvm-dev@lists.llvm.org <llvm-dev@lists.llvm.org>; yaxun.liu@amd.com <yaxun.liu@amd.com>
> > Subject: [llvm-dev] [RFC][HIPSPV] Emitting HIP device code as SPIR-V
> >
> > Hi all,
> >
> > HIP is a C++ Runtime API and kernel language that allows developers to
> > create portable applications for AMD and NVIDIA GPUs from a single
> > source code [0]. There are also projects for running HIP code on Intel
> > GPU platforms via the Intel Level Zero API [1] called HIPLZ [3] and
> > HIPCL [2], which runs HIP programs in OpenCL devices with certain
> > advanced features supported. Both of these backends consume SPIR-V
> > binaries.
> >
> > We are proposing a patch set to be upstreamed that enables SPIR-V
> > emission through the HIP code path. The end goal of the patches to be
> > submitted is to emit SPIR-V binaries from HIP device code so it can be
> > embedded into executables for OpenCL-like environments (at least for
> > starters). Our current focus is on the two above-mentioned projects,
> > HIPCL and HIPLZ which are both work-in-progress HIP
> > implementations. They itself do not consume SPIR-V, but the device
> > binaries are handed over to the OpenCL and Intel Level Zero APIs,
> > respectively.
> >
> > Coarsely, the current process of translating the HIP code to SPIR-V in
> > LLVM/Clang involves:
> >
> > * Retargeting HIP device code generation to the SPIR-V target.
> > * Mapping address spaces in HIP to corresponding ones in SPIR-V.
> > * Expanding HIP features, which can not be directly modeled in SPIR-V
> > (e.g. dynamic shared memory).
> >
> > The HIPSPIRV experimental branch is available at [4]. Note that it is
> > not yet in a state we intend to propose for upstreaming, but shaping
> > up the patches is a work in progress. Before proceeding to shape up
> > and submit the patches, we would like to get feedback for the plans we
> > have for upstreaming. In the following sections, we open up the above
> > points further and sketch our plans for changes to LLVM (mostly to the
> > Clang tool) to achieve the goal.
> >
> > Retargeting device codegen
> > ==========================
> >
> > For making the HIP toolchain to emit and embed SPIR-V we are
> > tentatively planning the following changes to the LLVM/Clang:
> >
> > * Introduce, at minimum, a 'spirv64' architecture type in Triple. This
> > is what the SPIR-V backend [5] (SPIR-V BE) effort is planning to
> > upstream. We would like to upstream this change in advance to
> > specify the HIP SPIR-V device code target, potentially before the
> > SPIR-V BE work lands.
> >
> > * Implement a new SPIRVTargetInfo and fill it with necessary
> > information. For HIPCL/-LZ we are planning to adjust the address
> > space mapping in a way which is discussed later in the �address
> > space mapping� section.
> >
> > * Introduce a clang option to override the HIP device code target. We
> > are interested in the option �--offload=<target>� discussed in the
> > 'Unified offload option for CUDA/HIP/OpenMP'-thread [6]. This option
> > would suit this use case well. As far as we know, the subject has
> > not advanced further from the discussion - is anyone working on it?
> >
> > * Compilation driver:
> >
> > HIP offload builder is changed to retrieve the offload device target
> > from the --offload option. If it is not present, it can fall back to
> > AMD's default target for avoiding changing the current default HIP
> > compilation behavior.
> >
> > Temporarily change Driver to force clang to emit LLVM bitcode for
> > SPIR-V targets in the backend compilation phase. Otherwise, the
> > compilation will fail due to the lack of the real SPIR-V BE in many
> > parts of the code. Reworked HIPToolChain takes care of translating
> > the bitcode to SPIR-V during the linking phase. When the SPIR-V BE
> > lands in LLVM, we can revert this change.
> >
> > * Introduce �hipspv� as an OS or environment type in Triple. The
> > primary and the current use of the type is to select device offload
> > toolchain for HIPCL/-LZ.
> >
> > * Implement a new toolchain class 'HIPSPVToolChain' in clang which is
> > selected when the HIP device target is specified to be
> > �spirv64-unknown-hipspv� with the --offload option. Since the SPIR-V
> > BE might not land in LLVM soon, we will set up the compilation flow
> > to produce the SPIR-V binary by using the LLVM-SPIR-V translator [7]
> > which is used in our experimental branch.
> >
> > One important thing the toolchain does is to run one or several LLVM
> > IR passes, which are needed by the HIPCL/LZ runtime, on the final
> > fully linked device bitcode. The passes are required to be run
> > during link time - all user specified device code and HIPCL/LZ
> > device library routines have to be visible when the passes are
> > run. The reason for the requirement is explained in the 'HIP code
> > expansion' section. HIPSPVToolChain will use the opt tool for
> > running the passes at link time.
> >
> > * Currently, HIPToolChain is derived from ROCmToolchain and its long
> > chain of super classes (AMDGPUToolChain, Generic_ELF and
> > Generic_GCC). The new upstreamed target would not logically belong
> > under the AMDGPU/ROCm family so it does not make sense to derive the
> > HIPCL toolchain from the HIP toolchain. Therefore, we propose to:
> >
> > - Create a new base HIP tool chain, 'BaseHIPToolChain' or just
> > 'HIPToolChain', derived directly from ToolChain and put any
> > HIP-related code that is common or that can be reused in the
> > derived toolchains there.
> >
> > - Derive a new HIPSPVToolChain from HIPToolChain.
> >
> > - Rebase the HIPToolChain under the HIPToolChain and rename it to
> > HIPAMDToolChain. Since the current HIPToolChain depends on methods
> > in the super classes (e.g. AMDGPUToolChain�s getParsedTargetID)
> > the rebased class is planned to be a proxy class to avoid code
> > duplication and to reduce the amount of changes. Another option to
> > refactor the current HIPToolChain would be to use multiple
> > heritance but that leads to dreaded diamond class structure which
> > probably is not a great choice.
> >
> > With the current plan, HIPToolChain is not going to have much code
> > to be shared with the derived classes - so far only a bit of the
> > �fat binary� construction code is in sight for sharing, so the
> > immediate gains for the effort seems small. However, The TC�s layout
> > is more logical and it may spark more HIP implementations, as well
> > as help refactoring when going forward.
> >
> >
> > Address space mapping
> > =====================
> >
> > Translating HIP device code to valid SPIR-V binary requires tweaks on
> > pointers:
> >
> > Pointers without address space (AS) qualification in HIP programs are
> > considered �flat� pointers - they can point to function local,
> > __device__, __shared__ and __constant__ memory space dynamically,
> > which matches the idea of �generic� pointers introduced in OpenCL
> > 2.0. Therefore, the logical choice for the flat pointers is to map
> > them to generic pointers of SPIR-V�s OpenCL environment. HIPCL�s and
> > HIPLZ�s SPIR-V environment mandates that the kernel pointer parameters
> > must point to __global, __local or __constant memory (these are named
> > differently in SPIR-V; using OpenCL names as they are more
> > familiar). So HIP pointer parameters in the HIP kernel (__global__)
> > functions would be mapped to global pointers. Otherwise, HIP pointers
> > with AS qualifiers are mapped to SPIR-V equivalent, if suitable.
> >
> > Now, there are significant differences between HIP�s __constant__ and
> > SPIR-V/OpenCL�s constant address space:
> >
> > * In HIP, __constant__ globals can be altered on the host side with
> > the hipMemcpyToSymbol() API function. In the OpenCL�s host API you
> > cannot do this.
> >
> > (Side-note: OpenCL host API does not have an equivalent method for
> > hipMemcpyToSymbol but HIPCL currently supports hipMemcpyToSymbol for
> > the global __global variables via Intel�s
> > clGetDeviceGlobalVariablePointerINTEL API extension, but we are
> > planning to inject shadow kernel commands that access the global
> > variables instead for portability.)
> >
> > * In HIP flat pointers can point to __constant__ memory. In OpenCL
> > this is not the case with __generic pointers, which means __constant
> > pointers cannot be casted to __generic pointers and vice versa.
> >
> > There are a couple ways to deal with constants:
> >
> > * Map __constant__ to __global space in SPIR-V. That way we can
> > generate code that works and is simple to implement. Of course, we
> > lose the optimization/placing benefits of constant memory.
> >
> > * Transform the code after clang codegen (by an LLVM pass) by
> > converting the __constant objects to kernel arguments. This covers
> > the hipMemcpyToSymbol() case. There is still the constant-to-generic
> > cast issue, so we would have to use the previous point as the
> > fallback.
> >
> > We plan to start by upstreaming the first option, and time permitting,
> > improve by implementing the second option.
> >
> > The planned changes to Clang to achieve the aforementioned AS mapping
> > are as follows:
> >
> > * Define address space mapping in the new, aforementioned
> > SPIRVTargetInfo to map CUDA address spaces (which the HIP reuses) to
> > do the mapping mentioned earlier. Default AS (0) used for the flat
> > pointers are mapped to the SPIR-V�s �generic�. We intend this
> > mapping being enabled when the language mode is HIP.
> >
> > * Change SPIRABIInfo to coerce kernel AS-unqualified pointer arguments
> > to __global ones. Pointer arguments in regular device functions
> > receive the __generic AS qualifier via the address space mapping
> > defined in SPIRVTargetInfo in the above point.
> >
> >
> > HIP code expansion
> > ==================
> >
> > There are features in HIP language which do not have direct
> > counterparts in SPIR-V�s OpenCL environment and those features need to
> > be rewritten before translation to SPIR-V (in the future, lowering to
> > SPIR-V machine code through the new BE). The non-exhaustive list of
> > features that need to be expanded includes:
> >
> > * Dynamic shared memory allocation (DSM): It is an array which is
> > declared globally in LLVM IR and its actual size determined at
> > kernel launch. OpTypeRuntimeArray in SPIR-V is the closest thing to
> > model this object, alas, it requires shader capability.
> >
> > * abort() builtin: No counterpart in SPIR-V/OpenCL.
> > (Note: the behavior is not well specified in the HIP spec
> > either. Assuming it terminates the whole grid if any work item
> > reaches it. AMD�s abort definition calls __builtin_trap).
> >
> > * printf(): OpenCL�s printf takes the format string as �__constant__
> > char*� while in HIP the format string does not have to reside in
> > constant memory.
> >
> > * Texture objects. These roughly correspond to image and sampler
> > objects of OpenCL combined. Also, texture objects carry more
> > information for the texture functions than image+sampler objects do.
> >
> > * Texture references. Same as above but these are program global
> > objects. In OpenCL, image objects cannot reside in the program
> > global space.
> >
> > HIPCL/-LZ�s solution to the DSM allocation case is that the runtime
> > allocates a shared buffer and passes it to the kernel as an additional
> > argument (which is hidden from the user). The device code is modified
> > so that the DSM object is replaced with the new kernel
> > argument. Various other cases listed will be handled similarly:
> >
> > * For the printf case we tentatively replace the printf calls with a
> > function that packs their arguments to an additional buffer passed
> > as additional kernel argument and do the printing on the host side.
> >
> > * Texture objects will be tentatively split to image and sampler
> > objects and possibly auxiliary struct to carry texture
> > settings. This means at least that the kernel parameter listing
> > needs to be rewritten for the Texture objects.
> >
> > * For the texture reference we tentatively planned replacing the
> > global texture objects also with a number of additional kernel
> > arguments.
> >
> > For this and other HIP features we need to apply LLVM IR passes to
> > perform modifications on the device code. In many cases the passes
> > should be run when the device code (as LLVM bitcode) is fully
> > linked. This is simply achieved as the HIP offload mechanism already
> > emits device code as LLVM bitcode in RDC mode (-fgpu-rdc), so during
> > linking we do receive the device code as LLVM bitcode where to apply
> > these expansions with full view of the device code.
> >
> > The current plan for implementing this is to make the HIPSPVToolChain
> > to build a linker that uses llvm-link for linking device code, opt for
> > running the IR passes needed and the external llvm-spirv tool (llc in
> > the future when the SPIR-V BE lands) for emitting the SPIR-V
> > binary. We load the passes from a path the user provides
> > via --hip-link-pass-path (name pending) or automatically from HIP
> > runtime�s installation location by using the search logic provided by
> > ROCmInstallationDetector.
> >
> > There is interest in upstreaming the HIPCL/-LZ passes from the
> > HIPCL/-LZ repositories in the future for reduced maintenance
> > burden. However, we are not attempting to upstream them initially, as
> > they are not yet completed and are subject to rapid changes. Question
> > is: Where should the passes eventually be put in within the LLVM
> > project tree? Could it be OK to add a new directory under Clang for
> > tool chain passes?
> >
> >
> > Testing
> > =======
> >
> > We will provide llvm-lit tests for our toolchain in the upstream. We
> > also want to add tests to make sure clang who will run the HIPCL/-LZ
> > runtime passes get run at device code link time. For this we need a
> > dummy pass plugin that the clang loads during the test.
> >
> > When the new LLVM SPIR-V BE work lands on LLVM, we will add SPIR-V
> > assembly checks that are relevant for HIPSPV.
> >
> >
> > References
> > ==========
> >
> > [0]: HIP Programming Guide v4.5 — ROCm 4.5.0 documentation
> > [1]: oneAPI Level Zero: 1.4.8 — Level Zero Specification documentation
> > [2]: https://github.com/cpc/hipcl
> > [3]: https://github.com/jz10/anl-gt-gpu
> > [4]: https://github.com/parmance/llvm-project/tree/hip2spirv-v5
> > [5]: https://github.com/KhronosGroup/LLVM-SPIRV-Backend
> > [6]: [cfe-dev] [RFC] Unified offloading option for CUDA/HIP/OpenMP
> > [7]: https://github.com/KhronosGroup/SPIRV-LLVM-Translator
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev@lists.llvm.org
> > llvm-dev Info Page
>
>
>
> --
> BR,
> Henry Linjam�ki
>

BR,
Henry and Pekka

BR,
Henry and Pekka

Hi,

Regarding the timeline for SPIR-V backend upstream, there is a tentative plan to land the base patches until end of this year (2021). Nevertheless, at that moment, SPIR-V backend will be in experimental state. After that, we need to keep on adding missing functionality, fix bugs etc. so that we can pass a Khronos CTS test-suite with using a SPIR-V backend (instead of LLVM-SPIRV translator). At that moment, we might claim it is a production quality and not anymore experimental. That might take next several months.

I'm wondering - with the approach You have proposed - how much code that You plan to contribute as the intermediate step will become obsolete once there is a full switch to SPIR-V backend target?

konrad

Hi Konrad,

OK, thanks for the status update. We discussed this internally, and
our current understanding is that the change from using llvm-spriv for
the SPIR-V emission requires a) to change the target (in triple) from
spir64 to spirv64 b) calling the llc that invokes the LLVM backend
(BE) instead of the llvm-spriv translator tool.

Thus, to make the transition and testing while developing the BE
easier, we could contribute a patch to add a (hidden) command line
option that switches between these modes: 1) The initial default: emit
spir64 LLVM IR and call llvm-spirv. 2) Emit spirv64 LLVM IR instead
and call a user-specified llc binary.

When the LLVM BE lands upstream and is considered at least as
useful/stable as llvm-spriv, we can then switch the default of that
option to (2) instead of (1). This should allow easy testing of the
BE-based toolchain even before it lands to the upstream repo if we
allow redefining the llc binary location to the externally built LLVM
BE. How does that sound?

BR,
Pekka and Henry

Hi Pekka,

Can you also clarify how do you plan to test the SPIR-V generation?

Will you be adding IR only tests or also SPIR-V assembly/binary tests?
If the latter one, does it mean that some LLVM test bots will need an
installation and invocation of the translator?

In general, it feels like at least we will likely be ending up with duplicate
testing up until the transition phase is finalized.

Cheers,
Anastasia

Hi Anastasia,

Indeed, only LLVM IR tests make sense for that time being and after
the BE lands, we can
expand the SPIR-V asm test suite with specific tests.

BR,
Pekka

Sorry for reviving this old topic. I would like to understand what direction this effort took in the past year.

For the context, I’m trying to generate SPIR-V for code using dynamic shared memory. In CUDA/HIP dynamic shared memory is represented as an extern global array of size 0. OpenCL/SPIRV kernel can also represent dynamic shared memory but it represented by a kernel argument which is pointer with local attribute, this pointer needs to be passed to any function that need to access the dynamic shared memory.
Therefore this requires a non trivial transformation to target SPIRV shared memory from CUDA/HIP.

@linjamaki is this something you developed and upstreamed somewhere?

Konrad, Ilia (or anybody familiar with SPIR-V backend), is this a feature that the SPIR-V backend would support in the future? It is something I could help contribute to if this is the right direction for the SPIRV backend.

CHIP-SPV loads a pass plugin [1] into the Clang which is used for lowering HIP code for SPIR-V and it includes a pass for lowering dynamic shared memory [2]. When targeting SPIR-V the HIPSPV toolchain loads a pass plugin, if present, from <HIP_PATH>/lib/libLLVMHipSpvPasses.so or <HIP_PATH>/lib/llvm/libLLVMHipSpvPasses.so where the HIP_PATH is a path given by --hip-path. Alternatively, a pass plugin may be loaded with --hipspv-pass-plugin option.

[1] chip-spv/HipPasses.cpp at main · CHIP-SPV/chip-spv · GitHub
[2] chip-spv/HipDynMem.cpp at main · CHIP-SPV/chip-spv · GitHub

1 Like