LLVM GPU News #41, September 9, 2022

Hi folks,

The next LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella, is out: Issue #41 | LLVM GPU News.
This time I also pasted the content below, at a community request (thanks @Artem-B!).

This issue covers the period from August 20 to September 9, 2022.

LLVM GPU News: Issue #41

Authors: Jakub Kuderski, Lei Zhang, Joseph Huber

Welcome to LLVM GPU News, a bi-weekly newsletter on all the GPU things under the LLVM umbrella.

We welcome your feedback and suggestions. Let us know if we missed anything interesting, or if you want us to bring attention to your (sub)project, revisions under review, or proposals. Please see the bottom of the page for details on how to submit suggestions and contribute.

Industry News and Community Events

LLVM and Clang



  • HLSL/DirectX-related changes:

      • Global constructors will be emitted inside the generated entry function. D123977, D132672

      • Allowed LLVM optimization passes to optimize resource accesses. D131268

      • Added initial codegen for SV_GroupIndex. D131203

      • Restricted HLSL to currently supported targets only (dxil-*-shadermodel*). D132056

  • Added support for SPIR-V builtin functions, types, and ExtInst selection. D123024, D132648

  • Large number of changes improving AMDGPU GFX11 support.

MLIR

  • For AMDGPU, defined the mfma operation. D132956

  • For AMDGPU, fixed the signed/unsigned comparison for abid/cbsz. D133061

  • For NVGPU, added support for cp_async_zfill via inline assembly. D132269

  • For SPIR-V, added definitions for non-uniform group ops and supported lowering gpu.shuffle to them. D133041, D133054

  • For SPIR-V, added patterns and utility functions to help lower ops and map memory space to OpenCL. D132424, D132428

  • Introduced more folders for SPIR-V ops and handled more corner cases for lowering vector ops to SPIR-V ops. D133167, D133168, D133183

  • For arith, added initial patterns to emulate wide integer operations with narrower integers supported by the target. D133135, D133136, D133137
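
To illustrate the idea behind wide integer emulation, here is a minimal sketch in plain C++ of a 64-bit addition built only from 32-bit operations. It is not the MLIR patterns from the revisions above; the names Wide64, split, join, addWide, and checkAddWide are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: a 64-bit integer stored as two 32-bit halves.
struct Wide64 {
  uint32_t lo; // low 32 bits
  uint32_t hi; // high 32 bits
};

// Splits a native 64-bit value into its two 32-bit halves.
Wide64 split(uint64_t v) {
  return {static_cast<uint32_t>(v), static_cast<uint32_t>(v >> 32)};
}

// Reassembles the two halves into a native 64-bit value.
uint64_t join(Wide64 w) {
  return (static_cast<uint64_t>(w.hi) << 32) | w.lo;
}

// 64-bit addition that uses only 32-bit adds and one 32-bit compare.
Wide64 addWide(Wide64 a, Wide64 b) {
  uint32_t lo = a.lo + b.lo;            // wraps modulo 2^32
  uint32_t carry = lo < a.lo ? 1u : 0u; // a wrapped sum means a carry out
  uint32_t hi = a.hi + b.hi + carry;    // high half absorbs the carry
  return {lo, hi};
}

// Checks the emulated addition against the native 64-bit addition.
void checkAddWide(uint64_t x, uint64_t y) {
  assert(join(addWide(split(x), split(y))) == x + y);
}
```

The carry is recovered from the wrapped low sum with a single unsigned comparison, so no operation wider than 32 bits is needed inside addWide. The 64-bit types appear only in split, join, and checkAddWide, which exist to compare against the native result.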

OpenMP (Target Offloading)


  • Discussed whether libomptarget should guarantee backwards compatibility. D133277

  • Discussed improving OpenMP device reductions, which should yield a 2x-10x performance improvement once complete.


  • Implemented OpenMP 5.2 semantics for absent mapping items, causing unmapped pointers to behave like device pointers. D133447

  • Fixed a bug causing -fsyntax-only to crash with the new driver. D133161

  • Fixed a bug causing the omp_get_wtime function to be optimized out. D133360

  • Fixed a bug preventing users from compiling with assert on the device. D133594

  • Added the ability to extract offloading images from other file types. D132607

External Compilers


  • Made multiple changes advancing support for mesh shaders.