SPIR-V module combiner


I am a 2nd year student and I want to contribute to the MLIR project, specifically the SPIR-V module combiner. I am wondering how I can find out more about the aims of the project and where I can get started.



Hey George, it’s super great to know that you are interested in the SPIR-V module combiner! I added that so I’m very happy to act as the mentor. Some quick questions first: are you interested only in this project, or are you okay with other SPIR-V related projects too? How familiar are you with SPIR-V at the moment?

Hi, I am okay with other SPIR-V related projects (what kind of projects are these, by the way?). At the moment I have only had a quick look at the SPIR-V documentation; however, I am familiar with compilers, having done a coursework at my university: creating a compiler in Python from the WACC language to ARM Assembly.

Awesome! Familiarity with compilers and browsing through the SPIR-V doc is enough to get started. I can explain SPIR-V concepts and details as we go.

For the SPIR-V module combiner, the purpose is to leverage SPIR-V’s features (like specialization and multiple entry points) to alleviate the “shader permutation” problem. (Shader is the term used in computer graphics for a program running on the GPU. In the ML/compute world, we typically call it a kernel.) The problem is quite common in games, where one may want to render a scene/object with different parameters and so ends up creating many shaders that are quite similar. This is not limited to graphics and games. When compiling an ML model, certain high-level ops can be CodeGen’ed into similar SPIR-V logic. For example, reduction kernels performing addition and multiplication should be CodeGen’ed to the same SPIR-V module except for the underlying reduction operation. So in general, a module combiner that inspects the input SPIR-V modules’ logic and promotes it to use suitable SPIR-V features is quite useful for reducing the number of SPIR-V modules an application (game or ML/compute) needs to ship.
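To make the reduction example a bit more concrete, here is a rough analogy in plain Python rather than SPIR-V. It only illustrates the idea of shipping one shared kernel and choosing the differing operation through a constant fixed before the kernel runs. The names `REDUCTION_OPS`, `IDENTITY`, and `make_reduction_kernel` are made up for this sketch, and a real specialization constant is set at pipeline creation time, not through a Python closure.

```python
import operator
from functools import reduce

# Hypothetical table: value of the "specialization constant" -> reduction op.
REDUCTION_OPS = {0: operator.add, 1: operator.mul}
# Identity element for each reduction, used as the initial accumulator.
IDENTITY = {0: 0, 1: 1}


def make_reduction_kernel(spec_constant):
    """Specialize the single shared kernel body with one constant.

    This mimics (loosely) fixing a specialization constant before the
    kernel is run: the loop structure is shared, only the op differs.
    """
    op = REDUCTION_OPS[spec_constant]
    init = IDENTITY[spec_constant]

    def kernel(values):
        return reduce(op, values, init)

    return kernel


sum_kernel = make_reduction_kernel(0)
product_kernel = make_reduction_kernel(1)

print(sum_kernel([1, 2, 3, 4]))      # 10
print(product_kernel([1, 2, 3, 4]))  # 24
```

Instead of two nearly identical kernels, the application carries one kernel body plus a small constant per use.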

There are many things we can do to improve the combination, and some require research into heuristics and trade-offs. But an immediate step to get started is to introduce a function that takes a list of spv.module ops, collects all contents of those modules into one spv.module op, and returns it. The next step would be to compare the logic of spv.funcs and deduplicate them if they are identical. Following that, we may want to explore comparing similarity and promoting to specialization constants where suitable. One can also go further and explore function outlining and other mechanisms to reduce SPIR-V module logic duplication.
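The first two steps can be sketched in plain Python to show the shape of the algorithm. This is not MLIR code: `Func` and `Module` are hypothetical stand-ins for spv.func and spv.module, and the op strings are invented placeholders, not real SPIR-V dialect syntax.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Func:
    """Hypothetical stand-in for a spv.func: a name plus a tuple of op strings."""
    name: str
    body: tuple


@dataclass
class Module:
    """Hypothetical stand-in for a spv.module: an ordered list of functions."""
    funcs: list = field(default_factory=list)


def combine(modules):
    """Step 1: collect the contents of every input module into one module."""
    combined = Module()
    for module in modules:
        combined.funcs.extend(module.funcs)
    return combined


def deduplicate(module):
    """Step 2: keep one function per distinct body.

    Returns the deduplicated module and a map from each dropped function's
    name to the name of the function that replaces it.
    """
    kept = {}      # body -> first function seen with that body
    replaced = {}  # dropped name -> surviving name
    for func in module.funcs:
        survivor = kept.setdefault(func.body, func)
        if survivor is not func:
            replaced[func.name] = survivor.name
    return Module(list(kept.values())), replaced


a = Module([Func("add_a", ("fadd %0, %1", "return"))])
b = Module([Func("add_b", ("fadd %0, %1", "return")),
            Func("mul_b", ("fmul %0, %1", "return"))])

merged, renames = deduplicate(combine([a, b]))
print([f.name for f in merged.funcs])  # ['add_a', 'mul_b']
print(renames)                         # {'add_b': 'add_a'}
```

A real implementation would also have to deal with things this sketch ignores, such as name collisions between modules and updating call sites and entry points to refer to the surviving function.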

For the immediate step, I’d recommend taking a look at the SPIR-V dialect’s documentation and poking around the SPIR-V code (the doc explains the code structure and provides links) to get familiar with how SPIR-V is represented in the MLIR world. Once you have enough familiarity, let me know and I can provide more detailed guidance on how to work with the codebase to add the new features.

Yes, I have a few other interesting SPIR-V related projects. One would be a SPIR-V to LLVM conversion. Having this would allow us to leverage LLVM to compile SPIR-V into CPU code or to JIT SPIR-V; both would be very cool. Another one would be converting SPIR-V to WGSL, WebGPU’s new shading language. WGSL is designed to be trivially translatable to SPIR-V, so it can be seen as a “textual format” for SPIR-V in a sense. But this one is much more work given that we’d like an AST dialect for WGSL.

Let me know what you think. I’m happy to explain more if anything is unclear.

BTW, do you want to target this year’s Google Summer of Code?

I will then continue looking at the SPIR-V dialect’s documentation and will also have a look at the code to get familiar with it. I will let you know shortly how it goes!

Yes, I was thinking of targeting this year’s Google Summer of Code as well!


I looked at the dialect’s documentation and am currently looking at the codebase to get familiar with it. Also, I am wondering if you know where I can find some further reading about the “shader permutation” problem?

Thank you for the clear explanation of the first steps. I really like the idea of going further to identify similarities in functions and removing the duplication.

Regarding the GSoC, I am wondering if you have any advice about the proposal I have to make?

In the meantime, I will continue exploring SPIR-V’s spec and the codebase, and hopefully will be able to work with it soon!

If you are targeting GSoC, have you thought about the “SPIR-V to LLVM conversion” project? The aim is to have a conversion path from the SPIR-V dialect to the LLVM dialect in MLIR. We already have exporters to go from the LLVM dialect to LLVM proper, so this connects SPIR-V with the great LLVM ecosystem. Having such a path will yield great benefits. For example, we will be able to compile SPIR-V modules down to CPU machine code or JIT them. With it I can compile a Vulkan application into CPU machine instructions entirely. In reality, this can also be interesting to software-rendered Vulkan implementations like SwiftShader or to LLVM-based Vulkan driver compilers, if the solution becomes complete and mature enough.

I asked because I feel “SPIR-V to LLVM conversion” is more aligned with “Summer of Code” because it’s more of a coding project. It also has a relatively clearer solution, implementation steps, and deliverables. Examples are abundant: there are many conversions in MLIR core at the moment. For the semantics of LLVM/SPIR-V and the differences, we are quite happy to help. The SPIR-V module combiner’s solution, by contrast, is unknown (except for the first few steps) and may involve some research. But ultimately it’s your call, depending on your interest and preference. Both are a great value add and I’m happy to help either way. :slight_smile:

Regarding GSoC proposals, I’d suggest searching the LLVM mailing list archive for examples and reading a few to get some idea. We can iterate later.

BTW, could you give a short introduction about yourself or some links to your personal website or similar (if you have one) so I can get to know you better? :slight_smile:

SPIR-V to LLVM conversion sounds interesting and I think it’s a great opportunity to contribute to many other applications (e.g. Vulkan application as you mentioned). Also, it is a good chance to learn more about LLVM as it is so widely used.

Since this project is more aligned with the Summer of Code, I am happy to try it, especially taking into account its great use if it’s done successfully. Regarding the SPIR-V module combiner, I am also happy to help and do some research (maybe later?).

Thank you for your advice, I will then have a look at the old proposals and will let you know what I think.

Speaking of myself, I am George Mitenkov, in my 2nd year studying Maths and Computer Science at Imperial College London. This year I had a very interesting Compiler course at university, and as a result I decided for myself that I want to dive deeper in this topic. I think that open-source is a great place to do that as I can contribute to the shipping of new code/software, get lots of useful experience and knowledge, and also meet many people within the open-source community. I got particularly interested in MLIR since this is a relatively new area with many opportunities and benefits for ML frameworks, high-performance computing, etc.

I am only starting to contribute to open source; however, I have experience that I believe is useful and will help me a lot with this. I have studied Java, Python, Haskell and C and have successfully completed several big courseworks where I had to work with a huge codebase, test, and sometimes develop software from scratch, in a group or individually: writing an OS in C (scheduling, synchronisation, syscalls, etc.), a WACC language to ARM compiler in Python, and an ARM Assembler in C.

I don’t have a website (yet), but here is a link to my Linkedin if this helps.

Great! Thanks for the info! I’d suggest putting up a draft proposal a bit early so we can iterate on it collaboratively. Please don’t hesitate to ask if you have any questions!

Just realized I haven’t answered your question regarding shader permutation. It’s a combined result of games’ complexity, game engine resource and effect management, sometimes suboptimal shader toolchain support, and historical reasons. To properly understand it, one needs to have some exposure to gamedev. But if you still want to get some understanding of it, here is a post; it might be a bit old but it talks about the problem. This one might also be worth a look. I don’t know of other good resources off the top of my head. Hope this helps.

Thanks! I have started working on it.

These links seem to be helpful, thank you!


I have a general question about the SPIR-V to LLVM conversion. Won’t it be a downside to lower SPIR-V directly to LLVM (rather than have some IR in between), thereby losing possible optimisations that could have been performed at the intermediate stage (via Linalg, for example)?

Also, I have been looking at the operations defined in SPIR-V and their semantic analogues in LLVM, trying to have a one-to-one mapping from one to the other where possible. I have a question related to SPIR-V’s OpUndef, since I didn’t find a corresponding instruction in LLVM and am not sure how to convert it while preserving semantics. I found out that LLVM has an ‘undef’ value, but this is not an instruction and I am not sure what has to be done in this case.

Good questions!

Actually both SPIR-V and LLVM are low-level dialects if we think about the whole CodeGen picture, especially for ML’s use case. At the highest level, we have dialects for modelling TensorFlow ops. That goes to middle-level dialects like HLO, then to Linalg, and then to loops. The last step is generating SPIR-V or LLVM. This is what the pipeline looks like for Vulkan/SPIR-V CodeGen in IREE, for example.

Each level of abstraction has its purpose and suitable transformations to perform. Performing transformations at a high level can indeed be simpler for typical cases. But when what we are given is already SPIR-V, promoting back to middle-level abstractions like Linalg can be hard. Note that going from some ML frontend is just one possible way to generate SPIR-V (and we do like to apply optimizations at higher-level abstractions as much as possible); there are other SPIR-V generators, like going from HLSL via DXC or from GLSL via Glslang. The SPIR-V dialect in MLIR aims to serve the various use cases SPIR-V can serve (although we are only focusing on Vulkan compute at the moment :wink:). It has the ability to directly import a SPIR-V blob and deserialize it into a spv.module. So there are cases where one starts directly with SPIR-V.

The main purpose of converting SPIR-V to LLVM is not to optimize SPIR-V by leveraging LLVM optimizations. MLIR makes it possible to properly model SPIR-V concepts, so optimizations for SPIR-V should happen in the SPIR-V dialect. The main purpose for now is to leverage LLVM’s ability to CodeGen SPIR-V to CPU machine code or to JIT it. :slight_smile:

OpUndef is indeed a bit tricky; we don’t need to handle this op at the very beginning. :slight_smile: (But it’s very nice that you noticed the difference and started to think about it!) I suspect the semantics could evolve on both the LLVM and SPIR-V sides; but for the moment, one possible way is to replace every use of an OpUndef with an LLVM undef value, given that OpUndef's semantics are “Each consumption of Result <id> yields an arbitrary, possibly different bit pattern or abstract value resulting in possibly different concrete, abstract, or opaque values.” IIUC, the llvm.mlir.undef op in the LLVM dialect actually materializes a different undef value at every use when converting to LLVM IR proper. So it means we can convert one spv.Undef op into one llvm.mlir.undef op. We had some (lengthy) prior discussions regarding OpUndef here if you’d like to read the details.
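As a toy illustration of that one-to-one mapping, here is a plain Python sketch, not real MLIR conversion code. `Op` and `convert_undef` are hypothetical names, the ops are modelled as simple records, and the type string is passed through unchanged, whereas a real conversion would also need to convert the SPIR-V type to an LLVM dialect type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical, simplified record of an op: name, result, type, operands."""
    name: str
    result: str
    result_type: str
    operands: tuple = ()


def convert_undef(ops):
    """Rewrite each spv.Undef into exactly one llvm.mlir.undef.

    The result name is kept, so existing uses of the value stay valid.
    All other ops are passed through untouched in this sketch.
    """
    converted = []
    for op in ops:
        if op.name == "spv.Undef":
            # Assumption: the type is copied as-is; a real conversion
            # would translate it to the matching LLVM dialect type.
            converted.append(Op("llvm.mlir.undef", op.result, op.result_type))
        else:
            converted.append(op)
    return converted


before = [
    Op("spv.Undef", "%0", "i32"),
    Op("spv.IAdd", "%1", "i32", ("%0", "%0")),
]
after = convert_undef(before)
print([op.name for op in after])  # ['llvm.mlir.undef', 'spv.IAdd']
```

The op count does not change: every use of `%0` still refers to a single op, and any "different value at every use" behaviour is left to the later translation to LLVM IR proper.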

You can also lower OpUndef as freeze(poison), which is an instruction.

Okay, I see! Thank you for the clarification.

Regarding OpUndef:
In the documentation it was mentioned that the llvm.mlir.undef op goes with the specified LLVM IR dialect type that wraps an LLVM IR structure type. But I see now that it can actually be considered a proper conversion.

Thanks! :slightly_smiling_face: