Hi folks,
The next LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella, is out: Issue #41 | LLVM GPU News.
This time I also pasted the content below, at a community request (thanks @Artem-B!).
This issue covers the period from August 20 to September 9, 2022.
LLVM GPU News: Issue #41
Authors: Jakub Kuderski, Lei Zhang, Joseph Huber
Welcome to LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella.
We welcome your feedback and suggestions. Let us know if we missed anything interesting, or want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.
Industry News and Community Events
- Lei Zhang posted a blog post about the MLIR Vector Dialect and Patterns.
- The Khronos Group released the Mesh Shading extension for Vulkan (VK_EXT_mesh_shader).
LLVM and Clang
Discussions
Commits
- HLSL/DirectX-related changes:
  - Global constructors will be emitted inside the generated entry function. D123977, D132672
  - Allow LLVM optimization passes to optimize resource accesses. D131268
  - Added initial codegen for SV_GroupIndex. D131203
  - Restricted HLSL to currently supported targets only (dxil-*-shadermodel*). D132056
- Added support for SPIR-V builtin functions, types, and ExtInst selection. D123024, D132648
- Large number of changes improving AMDGPU GFX11 support.
MLIR
Discussions
- Diego Caballero posted an RFC on ‘Vector Masking Representation in MLIR’. The proposal includes a new op, vector.mask, and two new interfaces: MaskableOp and MaskingOp.
- gxiaotian asked about a gpu.all_reduce operation among threads in a subgroup, as opposed to a workgroup. There are no replies at the time of writing.
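To make the subgroup question above concrete, here is a minimal sketch of a butterfly (XOR-shuffle) all-reduce simulated on the host. It is only an illustration of the general technique, not how gpu.all_reduce is defined or lowered in MLIR. The function name `subgroup_all_reduce` is hypothetical, and the sketch assumes a power-of-two subgroup size and a commutative, associative operator.

```python
def subgroup_all_reduce(values, op=lambda a, b: a + b):
    """Simulate an all-reduce across the lanes of one subgroup.

    Each list element stands for the value held by one lane. In every
    round, lane i combines its value with the value of lane (i XOR offset),
    which is what a butterfly shuffle exchange would provide on a GPU.
    """
    n = len(values)
    # Assumption: the subgroup size is a power of two.
    assert n > 0 and n & (n - 1) == 0, "subgroup size must be a power of two"

    lanes = list(values)
    offset = 1
    while offset < n:
        # All lanes exchange and combine "simultaneously" in each round.
        lanes = [op(lanes[i], lanes[i ^ offset]) for i in range(n)]
        offset *= 2
    return lanes


if __name__ == "__main__":
    print(subgroup_all_reduce([1, 2, 3, 4]))
```

After log2(n) rounds every lane holds the full result, so no separate broadcast step is needed.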
Commits
- For AMDGPU, defined the mfma operation. D132956
- For AMDGPU, fixed a signed/unsigned comparison for abid/cbsz. D133061
- For NVGPU, added support for cp_async_zfill via inline assembly. D132269
- For SPIR-V, added definitions for non-uniform group ops and supported lowering gpu.shuffle to them. D133041, D133054
- For SPIR-V, added patterns and utility functions to help lower ops and map memory space to OpenCL. D132424, D132428
- Introduced more folders for SPIR-V ops and handled more corner cases for lowering vector ops to SPIR-V ops. D133167, D133168, D133183
- For arith, added initial patterns to emulate wide integer operations with narrower integers supported by the target. D133135, D133136, D133137
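The idea behind wide integer emulation in the last item can be sketched in a few lines: a 64-bit addition is split into two 32-bit additions plus a carry. This is only an illustration of the general technique under the assumption of unsigned wraparound arithmetic, not the actual MLIR patterns. The helper names `split_i64` and `add_i64_via_i32` are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def split_i64(x):
    """Split an unsigned 64-bit value into (low, high) 32-bit halves."""
    return x & MASK32, (x >> 32) & MASK32


def add_i64_via_i32(a, b):
    """Add two unsigned 64-bit values using only 32-bit-wide operations.

    The result wraps around modulo 2**64, like a fixed-width integer add.
    """
    a_lo, a_hi = split_i64(a)
    b_lo, b_hi = split_i64(b)

    # Low half: a 32-bit add that may wrap around.
    lo = (a_lo + b_lo) & MASK32
    # The low add overflowed exactly when the wrapped result is smaller
    # than one of its operands.
    carry = 1 if lo < a_lo else 0
    # High half: a 32-bit add of the high parts plus the carry.
    hi = (a_hi + b_hi + carry) & MASK32

    return (hi << 32) | lo


if __name__ == "__main__":
    print(hex(add_i64_via_i32(0xFFFFFFFF, 1)))
```

Other operations, such as multiplication or shifts, need more involved decompositions than the single carry shown here.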
OpenMP (Target Offloading)
Discussions
- Discussed whether libomptarget should guarantee backwards compatibility. D133277
- Discussed improving OpenMP device reductions, which should yield a 2x-10x performance improvement once complete.
Commits
- Implemented the OpenMP 5.2 semantics for absent mapping items, so unmapped pointers now behave like device pointers. D133447
- Fixed a bug causing -fsyntax-only to crash with the new driver. D133161
- Fixed a bug causing the omp_get_wtime function to be optimized out. D133360
- Fixed a bug preventing users from compiling with assert on the device. D133594
- Added the ability to extract offloading images from other file types. D132607
External Compilers
LLPC
- Made multiple changes advancing support for mesh shaders.