[RFC] Adding HLSL and DirectX support to Clang & LLVM

Hello all,

The HLSL compiler community is interested in contributing HLSL, DirectX and Vulkan graphics support to Clang and LLVM.

Why do we want to do this?

The existing HLSL compiler, the DirectX Shader Compiler (DXC), is a fork of LLVM/Clang 3.7 developed in the open on GitHub (microsoft/DirectXShaderCompiler) by Microsoft and a diverse community of open source partners. We plan to update our compiler to the latest LLVM. This plan is motivated by our efforts to bring new C++ language features and tooling improvements to HLSL.

While we could do this in our own fork, we believe that integrating our compiler and community with the LLVM community will allow us to expand both communities, and to deliver a better compiler for our users.

What is HLSL?

High Level Shading Language (HLSL) was introduced as part of DirectX 9 for programming the (then new) programmable parts of the GPU rendering pipeline. The language began as a C-like language and has developed over time to include more and more C++ features. The latest version, HLSL 2021, has added C++-style templates, limited operator overloading, and bitfields (Announcing HLSL 2021 - DirectX Developer Blog).

How is HLSL used?

HLSL is widely used today for graphics and general purpose GPU programming. The DirectX runtimes on Windows support HLSL either as source or pre-compiled to an intermediate representation. Additionally, DXC can be used to generate SPIR-V, which can then be used either with the Vulkan runtime or, through SPIRV-Cross, converted to GLSL or Metal Shading Language for use with their respective APIs. HLSL is also largely source compatible with several other shading languages, and with carefully written C/C++, enabling complex shaders to be constructed from code written in a variety of languages.

What CodeGen targets do we support?

There are three primary code generation targets that we are interested in supporting. Our first priority is supporting the code generation targets that DXC supports today: DirectX Intermediate Language (DXIL) and SPIR-V. In the future we may additionally support DirectX Bytecode (DXBC), which is the virtual ISA supported by DirectX versions 9-11.

DXIL, the intermediate representation used for DirectX 12, is based on LLVM-3.7 IR encoded as bitcode. For many reasons that have been discussed at length within the LLVM community in the past, this is not great, but it has already shipped and is the driver interface for all DirectX 12 GPU drivers.

There have been many discussions of SPIR-V in the LLVM community. SPIR-V is a virtual ISA used for GPU programming. Since SPIR-V code generation is widely used by developers writing HLSL, this is a critically important feature for us too. There are several possible ways we could support SPIR-V, and we are looking forward to actively engaging with the community to solve this problem.

How are we proposing to do this?

For several reasons, primarily the age of our current LLVM fork, we are not proposing merging our existing compiler into modern LLVM, but rather re-implementing our compiler’s functionality in LLVM/main piece by piece.

Broadly speaking this would mean adding HLSL-specific language options to the Clang frontend as well as DirectX target support both to Clang and LLVM. The Clang DirectX target implementation would resemble the CUDA or OpenCL targets, and in LLVM we would add a DirectX target to contain our codegen passes and emit DXIL. By isolating as much of the DXIL-specific code as possible into a target we hope to minimize the cost on the community to maintain our legacy bitcode writing support.

We intend to take a different approach to implementing HLSL support in Clang from what is present in our current compiler, so while we can use the current implementation as a reference for language features and a source for test files, we will not use it as a model for implementation details.

Assuming that this proposal is acceptable, the first few patches are ready to start being posted for review immediately. As we move forward, we’d like to start an HLSL working group which will have regular meetings to discuss and track progress and coordinate efforts across contributors. Microsoft is making a commitment to bring Clang up to feature parity with DXC, and the broader HLSL community is supporting this effort; we expect to fully shift development over to LLVM/main after HLSL support becomes feature complete.


Would these be supported via an LLVM backend or would they be generated directly by clang instead of using LLVM IR ?

For DXIL, my thought was to add a DirectX target in LLVM with a partial backend. Since DXIL is LLVM IR, we won’t go down through the MC layer. Instead we’ll have some lowering passes to convert the IR to be LLVM-3.7-like and a modified bitcode writer to emit the DXIL. By containing this in a target we can limit the impact to the wider community for maintenance.

For SPIR-V… I’m very much interested in a wider discussion. Our current compiler emits SPIR-V from the AST, but there are some big downsides there. One of the biggest ones is limited code sharing between the SPIR-V and IR generation code paths.

One option I’ve considered for SPIR-V support is to use the Khronos SPIR-V backend, similar to the SYCL flow, but the SPIR-V backend would also need some work as it doesn’t currently support SPIR-V for graphics. It would also certainly be better if that tooling were in-tree.

Thanks for the RFC! It’s great to see the effort of modernizing the DXC codebase and merging upstream! I’m not a Clang expert, so I will let others comment on that; just chiming in on the SPIR-V side. 🙂

I’d be very interested to explore a path to CodeGen towards the SPIR-V dialect in MLIR. Given that I think the post is more for seeking high-level feedback before fleshing out concrete action plans, I won’t go into all the details.

In general, SPIR-V, as a standard, has two quite different “profiles” to accommodate the needs of graphics (like Vulkan) and compute (like OpenCL). These two “profiles” correspond to the capabilities in the Shader and Kernel trees respectively. They have quite different approaches towards memory models, runtime shader/kernel ABI, etc. The Shader side uses the logical addressing mode, where we have abstract pointers (no physical size, cannot be stored into memory by default, etc.) and generally rely on indices to access nested structures. There are quite a lot of requirements on how resources are represented (e.g., using structs with special decorations) and how they match across shader stages (vertex/geometry/fragment/etc.). The Kernel side uses the physical addressing mode, which is more akin to LLVM’s approach towards pointers, and you can do all sorts of casting. Resources and interfaces are also simpler, without the need to match multiple shader stages and interact with fixed-function stages.

So to me, of the two, the Kernel side is the better fit for going through the LLVM flow. If you let the Shader side go through LLVM, you might need to fight hard with LLVM transformations that assume more of the Kernel mentality, e.g., recovering structured control flow, handling convergence, threading needed decorations through various layers, etc. Some of these issues have been discussed for a long time in the community and I’m not sure we have a good answer to them even today.

Then speaking of the current status: the SPIR-V target in LLVM only handles the Kernel side for OpenCL, and to my knowledge the upstream development only got started a few months ago. So I’d assume quite some effort needs to be put into it to make the Kernel side work and stabilize.

For the SPIR-V dialect in MLIR, the necessary native support for the major SPIR-V mechanisms is basically built out, including different execution models/modes, full versioning/extension/capability support, various SPIR-V op definitions and verification, binary serialization and deserialization, etc. IREE uses it to run ML models via Vulkan compute shaders, and various models run without problems on various GPU vendor architectures. So for Vulkan compute shaders it’s largely feature complete (barring new extensions popping up regularly). Compute shaders working means the various pieces for targeting Vulkan, like the ABI, are in good shape; supporting other shader stages is just a natural further development. The SPIR-V dialect is also used by Intel folks (@Hardcode84) to compile ML models into Kernel-flavored SPIR-V and run them on the Intel L0 API, so it’s not limited to Shader; the setup is meant to accommodate different needs, like SPIR-V itself. CodeGen’ing towards the SPIR-V dialect would mean we can directly pick up all the efforts there.

Hope this helps. Happy to chat more about details. FWIW, I wrote a large portion of the SPIR-V backend in the existing DXC repo. 😉


Hi @antiagainst! Thank you for commenting. I think you bring up a lot of really great points.

There are definitely challenges with the SPIR-V backend, and I do think it would be a lot of work, but it would also potentially allow sharing code between the DXIL and SPIR-V code generators more so than our current architecture allows.

I think MLIR is really compelling too, and using MLIR for SPIR-V generation is also an option. That would require adding MLIR code generation support to Clang, and graphics support to the dialect. I know there has been some experimentation in this area, but there is nothing upstream yet. This would also be a lot of work.

I don’t really think there’s an “easy” answer here.

This is partially a question for Clang maintainers: should Clang grow an MLIR code generator, or should it remain only LLVM IR based?

Some of the concerns you mention about flowing through LLVM IR are problems we (and other GPU compilers) already have to deal with, like recovering structured control flow, and resource annotations. We need that for DXIL so we will be bringing control flow structuring passes with us already.

I’m not afraid of doing a lot of work (otherwise I wouldn’t be undertaking this project), but I do want to make sure that the work we do is in line with the community direction and the best technical solution the many minds here can imagine.


There was some informal discussion in the past of making clang do code generation through MLIR. clang’s “codegen” is really doing too many things at once, so adding an extra stage between the AST and LLVM IR would have some benefits. Among other things, it would let us share code between clang’s “CodeGen”, and the static analysis “CFG”. MLIR would probably be an effective framework for implementing such a “clang IR” layer between the clang AST and LLVM IR; it’s easily extensible, and it already has an LLVM IR dialect. But nobody has really looked at it seriously, as far as I know; it’s a really big project, and the benefits are sort of speculative.

That said, even if we had that, it’s not clear to me that you’d want to use it directly for SPIR-V or DXIL code generation. We probably wouldn’t run many optimizations on the “clang IR” dialect, so you’d end up with almost completely unoptimized code, I think.


This is an exciting project! Without weighing in on the technical pro/cons of implementing at different layers since I’m unqualified to do so, I’ll note that I’m aware of efforts to write shaders in Rust (via rust-gpu), and I’d personally be quite interested in a similar feature for Swift. It would certainly be positive if efforts here helped enable other LLVM-targeting languages to more easily emit DXIL or SPIR-V, even if it’s (understandably) not a primary goal.


We started rust-gpu out of a desire to modernize shading languages, at least within the Rust ecosystem, and would’ve loved to use an available LLVM backend for it. Back in the early conception phase of the project, we even evaluated targeting dxil instead (at the time, I did some experiments to see if we could target the outdated LLVM version used by DXC as a Rust backend).

At the time I had also evaluated MLIR, which wasn’t mature enough for our use, and seemed to mostly be focussed on ML workloads instead of our graphics workloads. We looked at the Khronos provided LLVM SPIR-V backend, but that only targeted Kernel mode (as @antiagainst explained), which is not supported for the graphics shaders we wanted to write.

Over the last few years, rust-gpu has grown to include its own structurizer and its own workarounds for some of the SPIR-V quirks.

I’ve been wanting, and looking for, something in the LLVM ecosystem that effectively targets shader development (not CUDA/OpenCL-style workloads), and was quite hopeful when DXC came out with its switch to LLVM. Unfortunately it ended up stuck on 3.7, so I very much welcome additions in this space. Over time, and because it was so easy, my team has made smaller and larger contributions through GitHub PRs; for example, implementing a large part of the Linux support required to build DXC as a .so.

I think this change would be extremely welcome to the community, and extremely useful in a broader sense. However, if this is being done I do feel I should point out a few things that I think would make this a success (feel free to disagree).

  • “Regular” graphics oriented shaders should be a primary focus
  • I feel like SPIR-V itself, and potentially DXIL as well, should evolve with LLVM to make this a success, potentially requiring efforts within Khronos to drop some of the design quirks of SPIR-V in favor of something more suitable to LLVM
  • I would love this to be in LLVM mainline proper

In the past, for similar proposals, I’ve seen some arguments that “LLVM doesn’t target ILs”. However, I think there is massive community value here, and over time it’s been proven that the business value and use cases are there, as well as large corporations willing to do the legwork to do this The Right Way. So instead of discussing, like we have in the past, why we shouldn’t do this, I think it might be more useful to discuss what would effectively need to get done to support this properly within LLVM.


I’m really excited to see this proposal, Chris (@beanz). I agree it will help pull the two communities together and make both stronger.

On the “Clang generating MLIR” comments above, one additional benefit of generating MLIR instead of LLVM IR for graphics applications in particular is that you’d presumably maintain structured control flow through the entire compilation flow - you don’t want to lower to a CFG and deal with the various problems that come with that.

That said, I agree that it would be a lot of work. It seems fairly orthogonal to the clang frontend improvements and other work entailed by this proposal.



This might be a good time to have a larger discussion about how LLVM can better support GPU use cases.

As people have pointed out, MLIR has a lot of expressive power that could, in theory, be used for GPU-specific constructs like resource annotations. At the same time, GPU middle end / backend compilers (as opposed to ML-focused frontend compilers) generally want to run a typical LLVM IR pass pipeline while preserving some of those constructs for at least parts of the pipeline.

It is a challenge for our shader compiler (LLPC), presumably for others that are LLVM-based but not open source, and likely for this HLSL effort, that MLIR is a different “substrate” (it uses different C++ classes for representing values, etc.) than LLVM IR, and so we have to choose between the richness of existing optimizations and the representational richness of MLIR. As @efriedma-quic hinted at, using the richness of existing optimizations tends to win this pragmatic tradeoff.

A while ago, I started to explore writing a library that gives us at least some of the benefits of dialects (in terms of programmer productivity) on top of the LLVM IR substrate. This works quite well even as an external library, though it could work even better if we integrated it with core LLVM, so that e.g. custom operations like DXIL’s @dx.op.* can be implemented more efficiently.

Traditional LLVM patterns like intrinsics would also benefit by getting auto-generated convenience access classes (use methods with descriptive names instead of intrinsic->getArgOperand(magic_number)!).


I am really excited about this proposal. I would love to see a separate discussion about clang generating MLIR.

I think this is already starting. For years now, use of LLVM for GPU applications has been growing. @jdoerfert’s recent GPU working group is a great example of this gaining traction. There’s still a lot we can do to improve things, but progress is being made.

This is very interesting to me. One of the things on my todo list is to move DXIL specifications into TableGen (currently they are driven by python scripts). Extending TableGen generation to improve the usability of intrinsics is definitely something to look into in the process.

Thank you everyone for the feedback!

I think we should continue having conversations about how to best support SPIR-V code generation and the future of MLIR in Clang. To move this proposal in a direction of concrete actions I’ve pushed a branch where I’ve been experimenting with some of the implementation details for how we’d like to move forward. I rebased the branch yesterday on main@1b3fd28c6ecc.

The branch contains two commits which I’ll break up further before posting for actual review.

The first commit is the LLVM changes and includes:

  • An LLVM triple architecture for dxil
  • An LLVM triple “operating system” for shadermodel (shader models are versioned ABI interfaces for shader programs)
  • An LLVM triple “environment” for shader stages (pixel, vertex, etc)
  • A modified bitcode writer to emit 3.7-like IR
  • An experimental DirectX target which wraps the DXIL passes and DXIL emission
  • Some crazy CMake to drive optional bitcode-compatibility testing (still can’t escape CMake…)

Before posting any of this code for review, I plan to refactor the BitWriter library so that alternate IR serializations can be supported in other libraries. This will allow our modified bitcode writer to live inside the DirectX target directory and not pollute the rest of LLVM.

The second commit is the first set of clang changes and includes:

  • Added a language mode for HLSL
  • Initial driver support for HLSL and some HLSL options
  • Expanded support for parsing Microsoft attribute syntax (used in HLSL)
  • Support for parsing HLSL Semantic attribute syntax

Assuming there are no objections to this, I’ll start posting patches in the next few days.


See the patch series starting with ⚙ D115009 [SPIRV 1/6] Add stub for SPIRV backend for an example of how to post a patch series for a new backend.

A backend that doesn’t use SelectionDAG/GlobalISel is likely going to be rejected, similar to what happened for SPIR-V. See [llvm-dev] [RFC] Upstreaming a proper SPIR-V backend and the threads it refers to.

The frontend changes look uncontroversial.

I very much remember many iterations of that conversation. Using an instruction selector to select LLVM IR instructions seems… odd.

DXIL is just LLVM IR using an old bitcode writer. Wrapping this in a “backend” is really just a way to minimize the burden to the wider community.

Worth noting: if we support DXBC in the future, I fully expect to use GlobalISel for that.

First, I’m excited to see this happen. It’s great for the graphics ecosystem.

If anybody still needs convincing, I want to confirm:

  • We (Google) contributed and maintain the SPIR-V paths in DXC, going back 5 years. It’s been a great and productive open source collaboration.
  • DXC is the production HLSL compiler for Stadia. So, we care about the long term health of the HLSL language and its compilers.

I second everything @antiagainst said. Let me take a step back to talk about the overall architecture of the SPIR-V path.

The SPIR-V path in DXC does not use LLVM IR at all. Instead the flow is:

  • Clang AST →
  • Custom SPIR-V-focused representation →
  • “Shader” dialect SPIR-V for Vulkan, but allowing some illegalities →
  • A bunch of SPIR-V-to-SPIR-V transforms to perform “legalization” →
  • Valid Vulkan-flavoured SPIR-V.

The details of the “legalization” is off topic here, but deals with aspects of de facto HLSL shaders that massively break Vulkan conventions. For more, see “Vulkan HLSL There and Back Again” from GDC 2018. Slides and video at 2018 GDC - The Khronos Group Inc
See also DirectXShaderCompiler/SPIRV-Cookbook.rst at master · microsoft/DirectXShaderCompiler · GitHub for examples of what kinds of constructs are handled.

We built the SPIR-V path this way because:

  • It predated MLIR
  • We knew very well the struggles of handling GPU code through LLVM transforms. It requires great care and constant vigilance as LLVM’s transforms evolve over time.
  • We had in-house expertise and a good start on the “spirv-opt” stack in SPIRV-Tools (in collaboration especially with LunarG). This was critical for building out the legalization heuristics as required for handling our production workloads.

In retrospect

  • avoiding LLVM IR worked out well.
  • the SPIR-V-focused intermediate is in the same spirit as other language-focused intermediates such as Swift’s SIL.

It would be quite pragmatic for the SPIR-V path to repeat the pattern again in this new initiative.
I don’t know how that would sit with Clang and LLVM maintainers though.

Things to think about:

  • Does this entail taking on SPIRV-Tools as a dependency? That could be unattractive.
  • Cut the compiler path before the SPIRV-Tools dependency? Then you get “illegal-for-Vulkan” SPIR-V, which would need post-processing. That’s kind of unpalatable.

I could see leveraging the SPIR-V dialect in MLIR, per @antiagainst’s suggestion. I’m too far away from the details to be a good judge of the tradeoffs. I completely defer to him on it.

Again, overall this is a great step. We look forward to seeing how we can help, both on design and implementation.



I agree with this. Just because we have this round hole of SelectionDAG and GlobalISel doesn’t mean we should hammer a square peg into it.

There are some mismatches that the backend will have to work out. One of them that comes to mind is that DXIL has typed pointers and LLVM IR doesn’t anymore. However, I don’t see how going to MachineIR helps with that. Its type system is even further removed from DXIL.

We should also consider the broader ecosystem implications. Consumers of DXIL (e.g., our closed source compiler for DirectX, and I imagine this applies to others as well) work with DXIL directly on the LLVM IR “substrate”, even when they’re based on more recent versions of LLVM. Having a backend on MachineIR as part of the compilation pipeline makes it harder for people to move across the stack. Using SelectionDAG as well would make this even worse by adding yet another IR.

I could not have said this better myself.

For people who aren’t familiar with DXIL or its uses, it might be worth elaborating a little here.

DXIL is LLVM 3.7 IR encoded as bitcode, with a wide set of constraints on how the IR is structured. One of the key features of DXIL is that it is readable both by backends built on LLVM and by backends that are not.

Everything in DXIL can be represented in LLVM IR, but some things in DXIL are represented in unusual ways to make DXIL easier to parse for non-LLVM compilers. For example, DXIL operations, which behave a lot like intrinsics, are actually IR functions, and each operation function takes a unique constant integer as its first parameter which serves as an identifying opcode. This allows a DXIL reader to bypass function-name matching and avoid full support for bitcode abbreviations.

Because of the complexity of the constraints on DXIL, one of the tools included in our toolchain is a DXIL validator. We intend to use the old bitcode reader and validator in our testing to verify the new bitcode path, but we also intend to write a new DXIL validator in LLVM with this effort.

I proposed adding this as a backend in an effort to isolate the code required for generating DXIL from the rest of LLVM. That code will include a slew of IR passes to transform LLVM IR that comes out of Clang CodeGen and through the normal IR optimization passes into DXIL as well as the modified bitcode writer needed to emit DXIL. Additionally we will utilize many of the features of the target IR layer like target intrinsics, data layout, etc.


Re-materializing LLVM IR bitcode from MachineIR would be a lot of work for no technical benefit. In fact, it would likely make maintaining DXIL significantly more difficult.

Agreed that we should weigh the different trade-offs and see how to best approach SPIR-V support so as to be aligned with long-term overall community directions. This is certainly no small effort; for it to unfold and eventually fully land will, I’d guess, take quite a few years.

What is nice about MLIR is that it gives us a modular approach. Fundamentally, SPIR-V is meant to be at a level similar to (if not higher than) LLVM. It has its own roadmap and design choices and will remain so. Trying to get all these different design perspectives (and graphics contributes and needs lots of them) faithfully represented in LLVM, and making sure existing transformations respect them on an ongoing basis, is, I feel, a huge effort. Finding technical solutions additionally needs to balance other LLVM use cases, and therefore faces more constraining factors, because things are all bundled together this way.

OTOH, MLIR provides nice IR infrastructure that allows us to choose the most natural way to represent and transform things for the specific domain. Because it’s unbundled and only needs to consider domain needs, it’s also simpler and easier to maintain and evolve in the long run. @dneto raises great questions about legalization. The major work there is tracing resource usages back to a single definition, which needs transformations like inlining, SROA, DCE, canonicalization, etc. These passes, where they don’t already exist, won’t be too hard to write with MLIR nowadays.

Though yes, this inevitably touches the bigger question of how the Clang community thinks about having MLIR emitters. I’m interested to hear more about what the community thinks. IMHO, getting started with SPIR-V is actually a unique opportunity, given the detached nature of SPIR-V from LLVM. Migrating Clang’s LLVM path to MLIR is a much heavier lift; SPIR-V can be a way to enable the integration first, and then we can push towards that gradually too.

Thanks for the RFC again. Either way we go, this is quite exciting!

I wonder if you have any plans on how DXIL can be read with opaque pointers? (there was a bit of discussion in this post)
I think, e.g. the type of a raytracing payload struct would be unknown when using opaque pointers (if it is unused and only appears as a pointer argument).