We (Intel) would like to add SYCL programming model support to the LLVM/Clang project to facilitate collaboration on C++ single-source heterogeneous programming for accelerators such as GPUs, FPGAs and DSPs from different hardware and software vendors. The SYCL programming model is described in detail in the specification available at the Khronos site: https://www.khronos.org/registry/SYCL/specs/sycl-1.2.1.pdf.
I’m going to start sending patches to the clang project with the basic functionality (including a new command-line option to enable the SYCL programming model), along with RFCs for features that require design review with the clang community (e.g. the interface or protocol between the device compiler and the runtime library).
I’m looking for suggestions on the best way to proceed with this proposal, and I would appreciate any feedback.
Here is a short list of features we would like to contribute first:
· SYCL compiler
  o Device compiler diagnostics (these should overlap almost 100% with the OpenCL C++ compiler diagnostics)
  o Functionality to separate SYCL device code from the single source
  o Address-space handling (including address-space inference/deduction)
  o Functionality to translate SYCL device code (C++) to SPIR-V format
  o Two attributes to mark SYCL kernel functions (which can be invoked from the host) and SYCL device functions (which are available on the device)
  o Functionality for the SYCL device compiler to emit the “integration header”, which carries the device-specific information that the SYCL runtime library uses to launch SYCL kernels on the OpenCL device
  o SYCL compiler driver
    · Implementation of two compilation modes: device-only and two-step compilation
    · Functionality to support device code compilation and linking from multiple translation units
    · Enhancing the driver with the clang-offload-wrapper tool and a corresponding job to support “fat objects” (the device code and the host code bundled together)
    · Adding SYCL toolchain support, including the llvm-spirv and offload-wrapper tools
· Contributing the SYCL runtime library under LLVM projects
  o SYCL C++ Template Library: the template library provides a set of C++ templates and classes which provide the programming model to the user. It enables the creation of runtime classes such as SYCL queues, buffers and images, as well as access to some underlying OpenCL runtime objects, such as contexts, platforms, devices and program objects.
  o SYCL runtime: the SYCL runtime interfaces with the underlying OpenCL implementations. It handles the scheduling of commands in queues and the movement of data between host and devices, and it manages contexts, programs, kernel compilation and memory.
  o The SYCL system assumes the existence of one or more OpenCL implementations available on the host machine. If no OpenCL implementation is available, then the SYCL implementation provides only the SYCL host device to run kernels on.
Almost all modifications are expected to land in the clang project and in the SYCL runtime library (located in the “/sycl” directory). The only change planned in the LLVM project so far is a new environment component in the triple.
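To make the “integration header” item in the list above more concrete, here is a minimal sketch in standard C++ of the kind of information such a header could carry for the runtime. Everything in it is an assumption made for illustration: the names kernel_param_desc, kernel_desc, kernel_table and find_kernel, the parameter kinds, and the sizes and offsets are hypothetical and do not describe the format used by Intel’s implementation. The real interface is one of the things we expect to settle through the RFCs mentioned above.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical sketch of integration-header contents; none of these names
// or values come from the actual implementation.

// How the runtime should treat each kernel argument.
enum class param_kind { accessor, scalar, sampler };

struct kernel_param_desc {
  param_kind kind;
  std::size_t size;   // size of the argument in bytes
  std::size_t offset; // offset of the captured field inside the kernel object
};

struct kernel_desc {
  const char *name;                // name used to look the kernel up in the device image
  const kernel_param_desc *params; // one entry per kernel argument
  std::size_t num_params;
};

// Entries a device compiler might generate for one kernel that captures
// a buffer accessor and an int (sizes and offsets are made up).
static const kernel_param_desc vector_scale_params[] = {
    {param_kind::accessor, 32, 0},
    {param_kind::scalar, sizeof(int), 32},
};

static const kernel_desc kernel_table[] = {
    {"vector_scale", vector_scale_params, 2},
};

// Runtime-side lookup: returns the descriptor for a kernel name,
// or nullptr if the header does not describe such a kernel.
const kernel_desc *find_kernel(const char *name) {
  for (const kernel_desc &k : kernel_table)
    if (std::strcmp(k.name, name) == 0)
      return &k;
  return nullptr;
}
```

With a table like this, a runtime could call find_kernel("vector_scale") and walk the returned params array to set each OpenCL kernel argument without parsing the device code itself.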
What is SYCL
SYCL (pronounced ‘sickle’) is a royalty-free, cross-platform abstraction layer that builds on the underlying concepts, portability and efficiency of OpenCL and enables code for heterogeneous processors to be written in a “single-source” style using completely standard C++. SYCL single-source programming enables the host and kernel code for an application to be contained in the same source file, in a type-safe way and with the simplicity of a cross-platform asynchronous task graph. SYCL includes templates and generic lambda functions to enable higher-level application software to be cleanly coded with optimized acceleration of kernel code across the extensive range of shipping OpenCL implementations.
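As a rough illustration of the “single-source” style, the sketch below keeps the host code and the kernel (a lambda) in one C++ function. It deliberately does not use the SYCL API, so that it builds with any C++11 compiler: toy_queue and vector_add are hypothetical stand-ins that run the kernel on the host, loosely comparable to falling back to a host device. In real SYCL code the kernel would access data through buffers and accessors and would capture by value; the by-reference capture here is a simplification that is only valid because everything runs on the host.

```cpp
#include <cstddef>
#include <vector>

// Toy stand-in for a SYCL-style queue. This is NOT the SYCL API: it simply
// invokes the kernel on the host, once per work-item index.
class toy_queue {
public:
  template <typename Kernel>
  void parallel_for(std::size_t range, Kernel kernel) {
    for (std::size_t i = 0; i < range; ++i)
      kernel(i);
  }
};

// Host code and kernel code live in the same function.
// Assumes a and b have the same length.
std::vector<int> vector_add(const std::vector<int> &a,
                            const std::vector<int> &b) {
  std::vector<int> c(a.size());
  toy_queue q;
  // The lambda body is the part a SYCL device compiler would extract
  // and compile for the accelerator.
  q.parallel_for(c.size(), [&](std::size_t i) { c[i] = a[i] + b[i]; });
  return c;
}
```

Calling vector_add({1, 2, 3}, {10, 20, 30}) adds the two vectors element by element, with one kernel invocation per index.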
High-level overview of Intel’s SYCL implementation
Intel’s SYCL implementation consists of two major components: the SYCL compiler and the runtime library. Although SYCL is designed as an “extension-free” standard C++ API, some “compiler” extensions are needed to enable C++ code execution on accelerators (e.g. special attributes to mark “accelerated” functions).
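The sketch below illustrates how such marking attributes might look from the library side. The macro names TOY_SYCL_KERNEL and TOY_SYCL_DEVICE, the functions square and run_square_kernel, and the shape of kernel_single_task are all hypothetical and chosen for illustration; the actual attribute spellings are not specified here. The macros expand to nothing so that the code builds with any C++11 compiler.

```cpp
// Hypothetical attribute spellings, for illustration only. They expand to
// nothing here; a SYCL device compiler would map them to real attributes.
#define TOY_SYCL_KERNEL
#define TOY_SYCL_DEVICE

// Device function: available on the device and callable from kernels.
TOY_SYCL_DEVICE inline int square(int x) { return x * x; }

// Kernel entry point: invoked from the host, with its body executed on the
// device. KernelName gives the kernel a unique type the runtime can refer to.
template <typename KernelName, typename KernelType>
TOY_SYCL_KERNEL void kernel_single_task(KernelType kernel_body) {
  kernel_body();
}

// Host-side use. With a plain host compiler the "kernel" simply runs inline;
// capturing by reference is a host-only simplification.
int run_square_kernel(int input) {
  int result = 0;
  kernel_single_task<class square_kernel>([&] { result = square(input); });
  return result;
}
```

A device compiler that understands these markers could start from each kernel entry point, collect the device functions reachable from it, and emit only that code for the accelerator.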
The SYCL compiler is responsible for “extracting” the device part of the code and compiling it to SPIR-V format or to a device-native binary. SPIR-V (Standard Portable Intermediate Representation) is a standard form of the code for the OpenCL™ offload API. In addition, the compiler emits auxiliary information, which the SYCL runtime uses to run the device code on the accelerator via the OpenCL™ API.
The SYCL runtime library API is a C++ abstraction layer on top of the OpenCL™ API that enables execution of C++ SYCL code on accelerators such as FPGAs or GPUs.
We are working on making the sources of Intel’s implementation available on GitHub (hopefully next week). Our implementation is not complete, but we would like to start collaborating with the community interested in heterogeneous programming as early as possible, to improve the quality of the implementation through the design and code review process.
Available SYCL resources
https://www.khronos.org/sycl/ - SYCL page at Khronos Group site.
http://sycl.tech/ - SYCL ecosystem site (supported by Codeplay). It lists projects implemented using the SYCL programming model (e.g. the TensorFlow SYCL back-end, machine learning and linear algebra libraries, etc.)
https://github.com/KhronosGroup/SyclParallelSTL - Parallel STL implementation based on SYCL
https://github.com/triSYCL/triSYCL - open source SYCL implementation driven by Xilinx
https://www.codeplay.com/products/computesuite/computecpp - closed source SYCL implementation from Codeplay