Summary
This RFC proposes the addition of SYCLBIN, a new binary format for storing SYCL device code. The format provides a lightweight, extensible wrapper around device modules and their corresponding SYCL-specific metadata, to be produced and consumed by the toolchain and the SYCL runtime.
Purpose of this RFC
This RFC seeks community feedback on the proposed SYCLBIN binary format, including the toolchain integration approach. Community input is particularly valuable regarding potential integration challenges with existing LLVM offloading implementations.
Motivation and Alternatives Considered
Requirements: SYCL-Specific Metadata and Modules Hierarchy
1. The SYCL programming model requires device images to be accompanied by specific metadata necessary for SYCL runtime operation:
   1.1. Device target triple.
   1.2. Compiler and linker options for JIT compilation scenarios.
   1.3. List of entry points exposed by each image.
   1.4. Property sets.
2. When a binary contains multiple images, some may share common metadata. Therefore, we require a hierarchical structure that enables metadata sharing while allowing specification of image-specific metadata.
3. Multiple images can exist for a single device, and images for different devices can expose different entry points.
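The requirements above can be pictured as a two-level data model. The following C++ sketch is illustrative only: the names ImageInfo, ModuleGroup, lookupProperty and findImages are hypothetical, property sets are simplified to string maps, and the rule that image-specific metadata takes precedence over shared metadata is an assumption rather than something the format mandates.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory model; these names are not part of SYCLBIN.
struct ImageInfo {
  std::string TargetTriple;                      // device target triple
  std::string CompileOptions;                    // options for JIT scenarios
  std::vector<std::string> EntryPoints;          // entry points of this image
  std::map<std::string, std::string> Properties; // image-specific metadata
};

struct ModuleGroup {
  std::map<std::string, std::string> SharedProperties; // shared by all images
  std::vector<ImageInfo> Images; // possibly several images per device
};

// Assumed precedence: image-specific metadata first, then shared metadata.
// Returns nullptr when the property is present at neither level.
const std::string *lookupProperty(const ModuleGroup &Group,
                                  const ImageInfo &Image,
                                  const std::string &Name) {
  if (auto It = Image.Properties.find(Name); It != Image.Properties.end())
    return &It->second;
  if (auto It = Group.SharedProperties.find(Name);
      It != Group.SharedProperties.end())
    return &It->second;
  return nullptr;
}

// Collects every image built for Triple that exposes EntryPoint.
std::vector<const ImageInfo *> findImages(const ModuleGroup &Group,
                                          const std::string &Triple,
                                          const std::string &EntryPoint) {
  std::vector<const ImageInfo *> Result;
  for (const ImageInfo &Image : Group.Images) {
    if (Image.TargetTriple != Triple)
      continue;
    if (std::find(Image.EntryPoints.begin(), Image.EntryPoints.end(),
                  EntryPoint) != Image.EntryPoints.end())
      Result.push_back(&Image);
  }
  return Result;
}
```

Because each image carries its own entry point list, two images for different targets can expose different kernels, which a single entries list shared by all images cannot express.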
Existing Formats Limitations
LLVM’s offloading infrastructure supports several binary formats that can be embedded within the Offload Binary format. However, these formats have various limitations that make them unsuitable for SYCL:
- Single-Module Design: Formats like Object, Bitcode, CUBIN, PTX, and SPIRV are designed for single binary or single-module IR representation, lacking hierarchical structuring capabilities for multiple images/modules.
- Missing SYCL Metadata Support: None provide native support for SYCL-specific metadata requirements.
- Vendor Constraints: Fatbinary is proprietary to NVIDIA and therefore incompatible with SYCL’s vendor-neutral approach.
- Limited Container Capabilities: Offload Binary is not designed for multiple device images or hierarchical organization, and its StringData is insufficient for complex metadata structures (like #1.3 and #1.4 above).
The OffloadingDesign document describes a target binary descriptor that stores multiple binary images (one per device type), with all images sharing the same entries list. This structure does not satisfy requirement #3 above, because images for different devices may expose different entry points.
Abstraction: Simplifying Support in Offloading Tools
Another motivation for adding the SYCLBIN format is to encapsulate SYCL-specific logic within SYCL-specific toolchain components (clang-sycl-linker, SYCL runtime) and isolate SYCL implementation details from general offloading tools designed to support multiple programming models.
Current Workflow without SYCLBIN
Without this format, metadata transfer from compiler to runtime requires the following complicated workflow:
- clang-sycl-linker uses Offload Binary’s StringData (with workarounds) to store metadata (#1 above).
- clang-linker-wrapper opens Offload Binary files produced by clang-sycl-linker and generates device image binary descriptors in a format readable by SYCL runtime.
  - Problem: This requires clang-linker-wrapper to maintain SYCL-specific format knowledge, creating unnecessary code duplication.
- SYCL runtime decodes metadata using this intermediate format.
Simplified Workflow with SYCLBIN
The SYCLBIN format enables a cleaner separation of concerns:
- clang-sycl-linker prepares a complete SYCLBIN containing all metadata and multiple images, embedding it as a single image within Offload Binary.
- clang-linker-wrapper generates only host register/unregister calls and a trivial wrapper without needing knowledge of SYCLBIN internals.
- SYCL runtime works directly with SYCLBIN format.
This approach eliminates the need for clang-linker-wrapper to understand SYCL-specific formats, reducing maintenance burden and improving toolchain modularity.
Enable Modular Dynamic Loading of Device Binaries at Runtime
Some applications require dynamic loading of device binaries at runtime to achieve modularity and avoid recompiling the entire application when device code changes. The SYCLBIN format provides a standardized interface between compiler-produced SYCLBIN binaries and runtime handling, enabling efficient dynamic loading scenarios.
SYCLBIN serves as SYCL’s analog to CUDA’s FATBIN format. Just as nvcc provides compiler options to generate “.fatbin” files, a SYCL compiler could offer options to generate “.syclbin” files. Similarly, we intend to add SYCL runtime functions to load and manipulate “.syclbin” files, mirroring CUDA’s runtime functions for “.fatbin” files.
Design
SYCLBIN Binary Format
The SYCLBIN format consists of:
- A file header with magic number and version information.
- Three lists of headers: the abstract module header list, the IR module header list and the native device code image header list, containing information about the abstract modules, IR modules and native device code images respectively.
- Two byte tables containing metadata and binary data.
File Structure
- File header
- Abstract module header 1
- …
- Abstract module header N
- IR module header 1
- …
- IR module header M
- Native device code image header 1
- …
- Native device code image header L
- Metadata byte table
- Binary byte table
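To make the ordering above concrete, the following C++ sketch computes where each section would start if the sections were laid out back to back. It is a sketch under stated assumptions: the names LayoutInput, LayoutOffsets and computeLayout are hypothetical, every header of a given kind is assumed to have a fixed size, and no padding or alignment between sections is modelled. The actual header sizes and any alignment rules are defined by the format specification, not here.

```cpp
#include <cstdint>

// Hypothetical description of one file; all sizes are in bytes.
struct LayoutInput {
  uint64_t FileHeaderSize;
  uint64_t AbstractModuleHeaderSize, AbstractModuleCount; // N headers
  uint64_t IRModuleHeaderSize, IRModuleCount;             // M headers
  uint64_t NativeImageHeaderSize, NativeImageCount;       // L headers
  uint64_t MetadataTableSize, BinaryTableSize;
};

// Start offset of each section, plus the resulting total file size.
struct LayoutOffsets {
  uint64_t AbstractModuleHeaders;
  uint64_t IRModuleHeaders;
  uint64_t NativeImageHeaders;
  uint64_t MetadataTable;
  uint64_t BinaryTable;
  uint64_t FileSize;
};

// Sections follow each other in file order; no padding is assumed.
constexpr LayoutOffsets computeLayout(const LayoutInput &In) {
  LayoutOffsets Out{};
  Out.AbstractModuleHeaders = In.FileHeaderSize;
  Out.IRModuleHeaders = Out.AbstractModuleHeaders +
                        In.AbstractModuleHeaderSize * In.AbstractModuleCount;
  Out.NativeImageHeaders =
      Out.IRModuleHeaders + In.IRModuleHeaderSize * In.IRModuleCount;
  Out.MetadataTable = Out.NativeImageHeaders +
                      In.NativeImageHeaderSize * In.NativeImageCount;
  Out.BinaryTable = Out.MetadataTable + In.MetadataTableSize;
  Out.FileSize = Out.BinaryTable + In.BinaryTableSize;
  return Out;
}
```

Placing all headers before the two byte tables means a reader can inspect every module's description without touching the potentially large binary payloads.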
Key Components
Abstract Modules: a collection of device binaries that share properties, including, but not limited to, exported symbols, aspect requirements, and specialization constants. Each device binary contained in an abstract module must be either an IR module or a native device code image. IR modules contain device binaries in some known intermediate representation, such as SPIR-V, while native device code images can be in an architecture-specific binary format. There is no requirement that all device binaries in an abstract module be usable on the same device or be specific to a single vendor.
IR Modules: metadata and binary data for the corresponding module compiled to a given IR representation.
Native Device Code Images: metadata and binary data for the corresponding module AOT compiled for a specific device.
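One plausible way for a header to describe "metadata and binary data" is to record an offset and a size into each of the two byte tables. The C++ sketch below illustrates that idea with a bounds-checked lookup. The struct ModuleHeaderSketch, its fields and the function slice are hypothetical and shown only as an assumption about how headers might reference the tables; they are not the proposed header layout.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical header shape: each module names one slice of each byte table.
struct ModuleHeaderSketch {
  uint64_t MetadataOffset, MetadataSize; // slice of the metadata byte table
  uint64_t BinaryOffset, BinarySize;     // slice of the binary byte table
};

// Returns the bytes [Offset, Offset + Size) of Table, or std::nullopt when
// the range does not fit. The comparison is written so it cannot overflow.
std::optional<std::string_view> slice(std::string_view Table, uint64_t Offset,
                                      uint64_t Size) {
  if (Offset > Table.size() || Size > Table.size() - Offset)
    return std::nullopt;
  return Table.substr(Offset, Size);
}
```

A reader following this approach would call slice once with the metadata table and once with the binary table for each header, and would reject the file if either lookup fails.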
Toolchain Integration
The SYCLBIN content can either be embedded as an image within the Offload Binary produced by the clang-sycl-linker or be emitted directly as a standalone SYCLBIN file.
This integration approach allows SYCLBIN to leverage the existing Offload Binary infrastructure while maintaining its specialized format for SYCL-specific requirements.
clang-sycl-linker Changes
The clang-sycl-linker is responsible for module-splitting, metadata extraction (symbol tables, property sets, etc.) and linking of device binaries. To support SYCLBIN, it must be able to:
- Pack device binaries and extracted metadata into the SYCLBIN format.
- Embed the resulting SYCLBIN into an Offload Binary container or output standalone SYCLBIN files.
- Support linking multiple SYCLBIN binaries together.
clang-linker-wrapper Changes
The clang-linker-wrapper shall support two operational modes:
- Standalone SYCLBIN Output: Output SYCLBIN binaries directly, skipping device code wrapping and host code linking stages.
- Host Linking: Generate host register/unregister calls and a trivial wrapper for SYCL runtime access to SYCLBIN binaries and link with host code.
SYCL Runtime Library Changes
The runtime must be able to parse the SYCLBIN format, using the SYCLBIN reading and writing implementation provided alongside the format.
Versioning and Extensibility
The SYCLBIN format is subject to change, but any such change must come with an increment to the version number in the header.
Additionally, any change to the property set structure that affects the way the runtime has to parse the contained property sets requires an increase in the SYCLBIN version. Adding new property set names or new predefined properties requires a SYCLBIN version change only if the SYCLBIN consumer cannot safely ignore the property.
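The versioning rules above suggest how a consumer might behave. The following C++ sketch is one possible reading, not the runtime's actual logic: the constant MaxSupportedVersion, the functions canRead and filterKnownProperties, and the policy of rejecting any file newer than the reader are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical: the newest format version this reader understands.
constexpr uint32_t MaxSupportedVersion = 1;

// Assumed policy: a file with a newer version may contain structures the
// reader cannot parse, so it is rejected rather than partially read.
constexpr bool canRead(uint32_t FileVersion) {
  return FileVersion <= MaxSupportedVersion;
}

// Keeps the property names the reader knows, in their original order, and
// skips the rest. Skipping is only sound for properties that can be safely
// ignored; anything else is expected to come with a version increase.
std::vector<std::string>
filterKnownProperties(const std::vector<std::string> &Names,
                      const std::set<std::string> &Known) {
  std::vector<std::string> Result;
  for (const std::string &Name : Names)
    if (Known.count(Name) != 0)
      Result.push_back(Name);
  return Result;
}
```

Under this reading, new ignorable properties can be added without breaking older consumers, while anything that changes parsing is caught by the version check before any property is inspected.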
Upstreaming Plan
- Phase 1: Upstream SYCLBIN format specification, including parsing/writing.
- Phase 2: Add clang driver, clang-sycl-linker and clang-linker-wrapper support.
- Phase 3: Integrate SYCLBIN support into SYCL runtime.