[RFC] Better support for typed pointers in an opaque pointer world

Motivation

There are a few tools in the LLVM world that need to convert LLVM IR (which is now moving to opaque pointers) to an IR whose pointer types are still typed. The main ones I’m aware of are the DXIL backend and the SPIR-V backend. In an opaque pointer world, it is necessary to recover the types of pointers to be able to properly emit these IRs, and getting these types wrong can result in code that just plain doesn’t work.

To be clear, this doesn’t seek to roll back opaque pointers, or force everybody else in LLVM to take on the burden of typed pointers for a few small backends. As it turns out, in my work on converting the LLVM-to-SPIR-V translator to support opaque pointer IR, I have found that I can get most of the way there with the existing features that LLVM has. The following code is a sketch of the translation process that highlights the basic philosophy:

PointerUnion<Type *, DeferredType *> getPointerElementType(Value *V) {
  if (auto *GV = dyn_cast<GlobalVariable>(V)) {
    // Some values have obvious types…
    return GV->getValueType();
  } else if (auto *Select = dyn_cast<SelectInst>(V)) {
    // … others propagate types…
    return getPointerElementType(Select->getFalseOperand());
  } else if (isa<IntToPtrInst>(V)) {
    // … and some have no intrinsic type, and must be inferred from use.
    return new DeferredType(V);
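  }
  // Other kinds of values are elided from this sketch.
  llvm_unreachable("value kind not handled in this sketch");
}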

// NB: Preparatory pass needed to convert all constant expressions into instructions
void typeCheckInstruction(Instruction *I) {
  SmallVector<std::pair<Use *, Type *>, 4> Uses;
  // For each operand, collect the correct type at each use site.
  collectUseTypes(I, Uses);
  for (auto &Pair : Uses) {
    if (getPointerElementType(*Pair.first) != Pair.second) {
      // Create a bitcast ptr %arg to ptr so we can give it a different type.
      CastInst::CreatePointerCast(*Pair.first, (*Pair.first)->getType());
    }
  }
}
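
The same philosophy can be exercised without any LLVM dependency. The following is a minimal, self-contained sketch under stated assumptions: `ToyValue`, `Kind`, `ElementType`, `inferElementType`, and `needsCast` are hypothetical names invented for illustration and are not part of LLVM or the translator. It only models the three cases from the sketch above (an obvious type, a propagated type, and a deferred type).

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for IR values; these are NOT the real llvm::Value classes.
enum class Kind { Global, Select, IntToPtr };

struct ToyValue {
  Kind kind;
  std::string valueType;        // Only meaningful for Kind::Global.
  const ToyValue *falseOperand; // Only meaningful for Kind::Select.
};

// Result of looking at a definition: either a known element type, or
// "deferred", meaning the type has to be inferred from the uses.
struct ElementType {
  bool known;
  std::string name;
};

ElementType inferElementType(const ToyValue &v) {
  switch (v.kind) {
  case Kind::Global:
    // Some values have obvious types...
    return ElementType{true, v.valueType};
  case Kind::Select:
    // ... others propagate types from an operand...
    assert(v.falseOperand && "select needs an operand to propagate from");
    return inferElementType(*v.falseOperand);
  case Kind::IntToPtr:
    // ... and some have no intrinsic type at all.
    break;
  }
  return ElementType{false, ""};
}

// A use site needs a no-op cast only when the definition has a known
// element type that disagrees with the type expected at the use.
// Deferred values simply adopt the type of the use.
bool needsCast(const ToyValue &v, const std::string &typeAtUse) {
  ElementType def = inferElementType(v);
  return def.known && def.name != typeAtUse;
}
```

In this toy model, a global of value type `i32` used where `i8` is expected reports that a cast is needed, while an `inttoptr` result never does, because its type is taken from the use.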

If the goal of opaque pointers is to make LLVM passes generally not have to worry about spurious bitcasts obscuring pointer sources, then the primary task of recovering typed pointers is to reinsert those bitcasts where they would have been. For the most part, putting those bitcasts in different places than where they used to be is not an issue. However, there are a few cases where this is unsafe.

The first major class of issues is function parameters. Especially when you make a function call that crosses a module boundary—whether it’s the case of the kernel whose caller will be the driver code, or the case of a function declaration whose definition will be in another library—it creates issues if the function types are not the same. The linking procedures for these libraries can’t link a function foo if one module says it takes an i32* parameter and another says it takes an i8* parameter. These languages are designed without mandatory support for function pointers, and handling mismatched arguments by bitcasting from one function type to another is not possible. It is thus essential for frontends to indicate the correct function types so that the backends can translate them correctly.

The second major class of issues is that OpenCL has several opaque types (such as image, event, sampler types) that need to be precisely preserved through to the end of the translation process. In the SPIR-V backend, it is simply not possible to express a bitcast to or from these types. These types are presently represented in the SPIR-V target as pointers-to-opaque-structs with a particular name, and in a typed pointer world, LLVM generally tries hard enough to preserve these types that it doesn’t cause issues. But when opaque pointers are enabled, optimization passes are less eager to preserve this information: I’ve already observed one optimization pass shrugging its shoulders and replacing the pointer type with an i64, leaving me with IR that is simply not possible to codegen.

Because of these issues, I would like to propose two modifications to LLVM IR to make it possible to generate SPIR-V, DXIL, and similar typed-pointer IRs in an opaque pointer world. In addition, I propose two further changes to LLVM that would help the translation process but are not themselves necessary.

Better type information at function boundaries

The first change I want to propose is to allow the elementtype attribute to be present on non-intrinsic functions, while also expanding its scope to be a return attribute. This would allow front-ends to indicate what the correct pointer element type ought to be for the function parameter or return value (note that support for return values is as important as parameters here). I’m proposing to reuse the elementtype attribute here on the basis that a) it imparts no special semantics other than “this is what the pointer element type would be were the pointer typed” and b) it requires minimal modifications to LLVM to add this support (a few changes to the verifier, and a tweak to Attributes.td).

Alternatives

As mentioned previously, something along the lines of this change is going to be necessary. There is a draft patch to convey this information via metadata (D127579, “[clang][WIP] add option to keep types of ptr args for non-kernel functions in metadata”), but, as discussed in the most recent GPU working group meeting, that approach has flaws. It relies on emitting types in the front-end representation, which is somewhat inconvenient to reparse (and more difficult for frontends that do not start with OpenCL code). It also relies on conveying information via metadata, which is more liable to being dropped by an optimization pass. In any case, maintaining information essential to correctness in the form of metadata constitutes an abuse of the purpose of metadata, as dropping metadata isn’t supposed to cause the program to break.

Another alternative that was brought up in the patch is conveying type information via mangled names. While it is possible to recover the type information via a mangled name, this requires building an Itanium name demangler into the backend infrastructure to recover relevant information, and handling the maximum possible insanity of name manglings. While LLVM does have an internal demangler that can be leveraged, going from “here’s a mangled string” to “this is the type of every parameter” is still challenging with that API. The LLVM-SPIRV translator contains a use of that API to recover a small subset of important types for various optimization passes; it would take a fair bit more work to extend that to generally recover all the types that would be needed to properly generate function types.

Opaque types in LLVM

Another change I would like to propose is to add a new system of opaque types to LLVM. Effectively, we would add a new kind of opaque type, e.g., opaque("opencl.sampler_t"), that would replace what is currently handled in SPIR-V as pointers-to-opaque-structs. These types would be usable as first-class types, and could exist as SSA values, function parameters or return types, and members of struct types or array types. It would be possible to pass these around as values through phi nodes, or allocate them via alloca and load/store them as appropriate. It would not be possible in general to bitcast to or from these types.

Ideally, it would be nice to specify these types as having an unknown size, much like scalable vector types, but if that is not workable, it is probably sufficient to have their type definition ascribe a nominal size to them (say, the size of a pointer).

It was pointed out in the most recent GPU working group meeting that it would also be nice to have the names be parameterizable: the SPIR-V Image type contains 8 different parameters (sample type, dimension, depth, arrayed, multisampled, sampled, format, and access kind), but it isn’t that big of a deal to encode these parameters into a single string (which is already what happens today for these types).

I’m not entirely certain what level of constant support is needed for these types. A zeroing initializer like ConstantPointerNull or ConstantAggregateZero is needed (for at least some of these types). Similarly, having undef and poison values for these types is plausible and may indeed be necessary for our memory semantics anyways. There are, I believe, a few cases where more complex constant parameters for these types may need to be constructed, but it’s possible that this can be worked around: I have yet to do any prototyping on this aspect of the proposal.

Semantically, these types and values of these types would generally be unintrospectable by a target-independent pass. Targets could ascribe particular meanings to particular opaque types, and therefore optimize them as they see fit.

Since I haven’t yet attempted to prototype this, there may be roadblocks and other issues that I haven’t considered. I am aware that this change is likely to be contentious, so I wanted to gauge the willingness of the community to potentially move in this direction before taking the effort necessary to start prototyping. I would imagine, however, that various kinds of opaque types might be useful for other projects. Indeed, the existing x86_amx and x86_mmx types might be replaced with an x86-specific opaque type were this part of the proposal accepted.

Alternatives

For alternatives, there is first the existing approach of pointers-to-opaque-structs with specific names. Given the general semantics of LLVM, this approach was always somewhat problematic. But with opaque pointers, pointer manipulation becomes almost completely agnostic of pointer types and the ability to preserve opaque struct types as pointer element types itself vanishes; it would not surprise me if someone were to propose removing opaque struct types altogether once typed pointers are removed. I do not believe this alternative is even remotely feasible.

The other major alternative that has been discussed is representing these opaque types as pointers in different address spaces. In a simple experiment I conducted (after discovering that existing passes caused one of my test cases to generate an unacceptable ptrtoint instruction), I found that using a separate address space was insufficient to keep the type maintained as a pointer and not an integer, although I did not do a follow-up test to see whether marking the address space as a non-integral pointer type would change that. Making these address spaces non-integral is a possibility, but given that the parameterization of some of these types means we would require a few hundred thousand address spaces at least, we would at least need changes to the data layout string representation to compress the specification to something more manageable.

A typing pass infrastructure

This is in the “nice to have” category. Given that there are multiple backends that are going to want this typing functionality, it would be nice if there were a common infrastructure that anyone who wanted types could use. While the implementation of such a pass can be rather straightforward, there is a nontrivial amount of work needed to identify that, for example, the ptr-valued extractvalue instruction needs to have the same type as the second parameter of the cmpxchg instruction whose result the extractvalue is using (I daresay that many readers would not have thought this a case that needed to be handled).

Presently, there are two independent implementations of this that I am aware of. The first is the analysis that @beanz has created for the DXIL backend. The second is the typing pass I have created for the LLVM-SPIRV translator. The latter is a more complete effort, since I could leverage a fuller existing test suite to uncover more corner cases in the IR effort.

Typed pointer types

In my current implementation, I’ve represented pointer types largely by means of an llvm::Type* representing the pointer element type (together with the original llvm::Type*, which contains the address space information). This approach does not scale to multiple levels of indirection (but needing to support multiple levels is unnecessary). More importantly, it does not scale well to where types are embedded in other types, especially function types (although struct types may also be a point of concern).

Properly representing a type system with pointer types effectively requires duplicating a decent fraction of the LLVM type system. Having a special pointer type allows existing infrastructure to represent the correct types of pointers in nested type specifications, such as in function types, vectors of pointers, or struct types. A side benefit is that it makes it easier to indicate when a use of a pointer type has a different pointer element type from its definition, as you could use bitcasts with different types (while it is possible to generate no-op bitcast ptr to ptr instructions, it is not possible to do so for constant expressions). The DXIL backend for LLVM has opted to create its own extension to llvm::Type for this purpose in D122268 (“Add PointerType analysis for DirectX backend”), and it would be nice to generalize this approach for other targets.

My expectation is that such a new typed pointer type would not be generally legal to use in LLVM: the verifier would consider any IR that used such a type to be invalid. Instead, this type would be used in limited form in conjunction with the also-proposed pointer typing pass to represent the output.

This proposal also fits in the “nice to have” category; it’s not necessary for such a type to exist to do the typing. However, without such a type, some amount of contortion becomes necessary to avoid having to deal with the extra necessary infrastructure. For example, the implementation pass I wrote uses a PointerUnion<Type *, Value *> to be able to express pointer-to-pointer references.
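
To make the nesting concern concrete, here is a minimal sketch of a side-table representation in which a pointer type carries its element type and address space, and can therefore nest to any depth. This is an assumption-laden illustration only: `TypedType`, `TypeRef`, `scalar`, `pointerTo`, and `toString` are hypothetical names, and this is neither the DXIL backend's extension to llvm::Type nor the representation used in the translator.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct TypedType;
typedef std::shared_ptr<const TypedType> TypeRef;

// Hypothetical type node kept outside of the IR proper.
struct TypedType {
  bool isPointer;
  std::string scalarName; // Used when !isPointer, e.g. "i8".
  TypeRef pointee;        // Used when isPointer.
  unsigned addrSpace;     // Used when isPointer.
};

TypeRef scalar(const std::string &name) {
  return std::make_shared<TypedType>(TypedType{false, name, TypeRef(), 0});
}

TypeRef pointerTo(TypeRef pointee, unsigned addrSpace = 0) {
  assert(pointee && "a typed pointer needs an element type");
  return std::make_shared<TypedType>(TypedType{true, "", pointee, addrSpace});
}

// Prints the type using the pre-opaque-pointer IR spelling.
std::string toString(const TypeRef &t) {
  if (!t->isPointer)
    return t->scalarName;
  std::string s = toString(t->pointee);
  if (t->addrSpace != 0)
    s += " addrspace(" + std::to_string(t->addrSpace) + ")";
  return s + "*";
}
```

With this shape, `pointerTo(pointerTo(scalar("i8")))` describes `i8**`, which a single element-type annotation on a `ptr` cannot express on its own.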

@jcranmer, thank you for putting this up!

One thing that I’ve said a lot lately is that I’m very concerned that DXIL and SPIR-V are duplicating independent paths to handle cases that could be shared. Pointer type analysis is a big area where SPIR-V and DXIL need to do more or less the same things.

Being able to extend elementtype attributes to apply to function parameters and return types generally would radically simplify a number of cases for HLSL, and seems like a pretty small targeted change. It would also be much more manageable than squirreling away metadata entries.

I also really like the idea of more general opaque types. HLSL also has a number of special-purpose types that can’t be cast around but would be extremely useful to represent as opaque.

In writing the pointer type analysis for the DirectX backend I pretty quickly identified the need to represent typed pointers in the LLVM type system (not attached to any IR constructs). It radically simplifies the problem for multiple levels of indirection and function types, both of which are important for DXIL. Having a TypedPointerType in the IR library and a shared typing pass infrastructure would be a big win.

Overall, a big +1 from me.

I don’t see how this helps, if you need to reconstruct the full type (which it sounds like you do). E.g. if you previously had an argument of type i8**, that would now be simply ptr. Your proposal would then result in it becoming ptr elementtype(ptr). But that still isn’t good enough.

It also relies on conveying information via metadata, which is more liable to being dropped by an optimization pass.

There is precedent for requiring that metadata not be dropped from global values, so that doesn’t seem terribly worrying.

This was my first thought as well. It seems like this would only actually work in conjunction with allowing “Typed pointer types” inside elementtype, at which point those would have to be part of ordinary IR, which we probably wouldn’t want.

FWIW, people using elementtype for mass annotation was a primary concern when this attribute was originally added (see e.g. D105407, “[LangRef] Add elementtype attribute”), and the restriction to intrinsics was added specifically to prevent this.

I’ve asked this on D127579 (“[clang][WIP] add option to keep types of ptr args for non-kernel functions in metadata”) before, but didn’t get a clear answer: Ignoring the special opaque types, is it possible to always use i8* arguments in function signatures, and only introduce bitcasts of those inside the function (even if it is known that an argument will only be used as i32* for example)? Or is there a requirement that LLVM-generated SPIR-V gets linked with other SPIR-V that does not use i8* everywhere?

But if that doesn’t work, then the metadata or name mangling approaches sound like the only way to reliably transfer all the necessary information here. I think an obscuring factor here is that DXIL and SPIR-V just happen to have type systems that are very close to LLVM’s pre-opaque-pointer type system (because they are LLVM-based designs), so it’s very convenient to translate from historical LLVM IR types to DXIL/SPIR-V types, and harder to do so from frontend types. However, the LLVM type system can diverge further in the future – for example, what if we were to drop struct types at some point (not going to go into details here, but this is more viable than it sounds)? Or maybe we want to drop function types and replace them with some non-type-based ABI info (that also integrates ABI-affecting attributes, rather than just types)?

So with elementtype annotations, we already have a problem with representing pointer-to-pointer types (or pointer-to-aggregate-containing-pointer types), but going forward we might also have a problem representing pointer-to-struct types or pointer-to-function types, etc. This is why I suspect that a metadata/mangling encoding of frontend types is likely the correct approach. (Or possibly, the frontend directly encoding the right DXIL/SPIR-V types to use, rather than that being derived later – but ultimately something that starts with the frontend types, not with reconstructed LLVM IR types.)

Regarding opaque types, I’m not sure I understand what is being proposed here. It would help to have some IR examples of what is being done currently, and what you propose to do instead. From your description, it sounds like representing these as named struct types passed by value (rather than by pointer) would work (in conjunction with some custom ABI for passing them), though probably I misunderstood the requirements here.

Edit: If you’re seeing problematic ptrtoint instructions being introduced, I’d encourage you to figure out what the root cause for these is, so we can possibly fix it. We really don’t want ptrtoint to be introduced as part of transforms for other reasons as well (provenance-related issues), so if we can avoid that, it would be great. There are some known places where this can happen (e.g. memcpy converted to load+store of integer), but it would be interesting to know which one is responsible here and what the relation to opaque pointers is (because I don’t think they should have any direct impact on this).

Sharing the infrastructure for determining pointer types between DXIL and SPIR-V sounds like a good idea.

Thank you so much for sharing this RFC. I think the problem you’re addressing is sufficiently close to a challenge we have in supporting WebAssembly GC types that it would be worth combining forces. I’d spotted previously that there was some related work in DXIL and SPIR-V, but until this RFC it wasn’t clear quite how great the overlap is. I’ll say a little about our requirements below and give a few comments on some of the suggestions in this thread, but it would be great to meet with you all to discuss further - either at the next GPU working group meeting (mid July?) or at an earlier special-purpose meeting.

Future WebAssembly specifications introduce a rich set of “GC” types, e.g. structs and arrays with a memory representation that is opaque to LLVM (typically these would represent values stored on a GC managed heap, e.g. within a host Javascript environment). There are all sorts of semantic restrictions on these values stemming from the fact they can’t be load/stored to standard linear memory, though I think that doesn’t impact this discussion too much. Just like you do, we have a requirement that these precise types are maintained all through from the frontend, through LLVM IR and in our case through instruction selection and the MC layer so that globals, locals, function signatures etc are all emitted with the correct type. In our case, operations on values of these types will occur exclusively through intrinsics.

I’ve been exploring options in a very similar direction to introducing a new “opaque type” construct. Here are some notes on my thinking and experimentation so far:

  • Similar to @nikic’s concerns, I’m not sure that looking to encode the Wasm type system in LLVM IR is future proof or desirable. That’s not a very strongly held view though, so I’m open to revisiting. My version of opaque("opencl.sampler_t") looks more like refty typeid(1234) where there is an assumption that there is module-level metadata containing a target-specific representation of these opaque types that can be used at code emission (and would also need to be used if merging LLVM modules).
    • The only reliance on metadata would be for this type table at the module level. There’s precedent for module-level metadata not being dropped, so this seems much more reasonable than e.g. attaching metadata to function arguments and hoping for the best.
    • In our case you could almost get away with just always passing an i32 with the typeid to all the intrinsics that operate on these types. But that would leave a gap in the handling of types of globals and function arguments / return types.
  • Just for purposes of prototyping something more rapidly, I’ve started out with an address space hack. Assume all AS above 255 are non-integral and use pointers to those address spaces to represent values with a certain typeid. There are some fiddly issues around accessing these type IDs in the backend due to some assumptions in the lowering infrastructure in general and the Wasm backend specifically (e.g. that you can always freely represent a Wasm type as a MachineValueType, and sadly there’s no real scope for having a MVT with a typeid attribute). So my focus has been there rather than thinking more broadly about IR-level representation.
    • If this reservation of ASes could be made target-specific rather than my hack, it might even be an alternative worth considering vs introducing a new type.
  • In our case, having something like refty typeid(1234) / opaque("opencl.sampler_t") would mean there’s no need for changes to elementtype, as we’d pass that new type by value and so it would never be obscured by opaque pointers.
  • Given our heavy use of intrinsics, the fact that these aren’t parameterisable by type is a limitation. We’d need the frontend to introduce casts so e.g. the return value of @llvm.wasm.struct.get is converted to the specific reftype that field is known to hold. If by-value structs were used, this would presumably need an alloca+store+load, or changing bitcast to allow it to be used on aggregate types, or introducing a new instruction. Plus of course we’d need intrinsics that accept any struct type.

I’ve only given a narrow view of Wasm GC / reftypes here, so hopefully I’ve managed to summarise relevant parts rather than introducing confusion with incomplete explanations. Do speak up if that’s not the case!

+1 for trying to tackle this. A couple of thoughts about opaque types:

We definitely need poison and should allow undef for consistency, but remember that overall, we ought to move away from undef.

Having zeroinitializer or some sort of defaultinitializer would be convenient, but I don’t think it’s strictly required. Anything we really need should be doable with intrinsics that materialize constants. (This doesn’t allow global variable initializers, but we can probably live without?)

It’s close, I think, but named struct types are still limited because they don’t support type parameters.

For background, a fairly extreme example is the SPIR-V image type. What would be nice to have is something like:

%imgf2D = type "image" (type float, # sampled type
                        i32 2,      # dimensionality
                        i32 42)     # image format (really an enum of stuff like rgba8)
                                    # real SPIR-V has even more type parameters here
%imgi3D = type "image" (type i32,   # sampled type
                        i32 3,      # dimensionality
                        i32 9)      # image format (really an enum of stuff like rgba8)

@desc.set = global { %imgf2D, %imgi3D }

...

# used in code like (forgive my not knowing what the SPIR-V backend actually does):
%desc.ptr = getelementptr { %imgf2D, %imgi3D }, ptr @desc.set, i32 0, i32 0
%desc = load %imgf2D, ptr %desc.ptr
%texel = call <4 x float> @spirv.image.load.<mangle>(%imgf2D %desc, <2 x i32> %coords)

Thank you all for the replies so far. Since there are some questions about how SPIR-V is represented in LLVM today, I’ll start with that. A fuller description of these details can be found in SPIRVRepresentationInLLVM.rst in the KhronosGroup/SPIRV-LLVM-Translator repository on GitHub.

The following is an example of the existing (typed pointer) LLVM representation for SPIR-V (taken from the existing llvm-spirv test suite):

%spirv.Image._float_1_1_0_0_0_0_0 = type opaque; read_only image2d_depth_ro_t
%spirv.Sampler              = type opaque ; sampler_t
%spirv.SampledImage._float_1_1_0_0_0_0_0 = type opaque

define spir_func void @test_sampler(%spirv.Image._float_1_1_0_0_0_0_0 addrspace(1)* %srcimg.coerce, %spirv.Sampler addrspace(1)* %s.coerce) {
  %1 = tail call spir_func %spirv.SampledImage._float_1_1_0_0_0_0_0 addrspace(1)* @_Z20__spirv_SampledImagePU3AS1K34__spirv_Image__float_1_1_0_0_0_0_0PU3AS1K15__spirv_Sampler(%spirv.Image._float_1_1_0_0_0_0_0 addrspace(1)* %srcimg.coerce, %spirv.Sampler addrspace(1)* %s.coerce) #1
  %2 = tail call spir_func <4 x float> @_Z38__spirv_ImageSampleExplicitLod_Rfloat4PU3AS120__spirv_SampledImageDv4_iif(%spirv.SampledImage._float_1_1_0_0_0_0_0 addrspace(1)* %1, <4 x i32> zeroinitializer, i32 2, float 1.000000e+00) #1
  ret void
}

declare spir_func %spirv.SampledImage._float_1_1_0_0_0_0_0 addrspace(1)* @_Z20__spirv_SampledImagePU3AS1K34__spirv_Image__float_1_1_0_0_0_0_0PU3AS1K15__spirv_Sampler(%spirv.Image._float_1_1_0_0_0_0_0 addrspace(1)*, %spirv.Sampler addrspace(1)*)

declare spir_func <4 x float> @_Z38__spirv_ImageSampleExplicitLod_Rfloat4PU3AS120__spirv_SampledImageDv4_iif(%spirv.SampledImage._float_1_1_0_0_0_0_0 addrspace(1)*, <4 x i32>, i32, float)

The SPIR-V output that would correspond to this LLVM IR looks roughly as follows [1]:

         %38 = OpTypeImage %float 2D 1 0 0 0 Unknown ReadOnly ; Declares a SPIR-V image type
         %39 = OpTypeSampler ; Declares a SPIR-V sampler type
         %40 = OpTypeFunction %void %38 %39 ; Declares a SPIR-V function type void (*)(Image, Sampler)
         %45 = OpTypeSampledImage %38 ; Declares a SPIR-V sampled image type, whose underlying image type is %38
     %v4uint = OpTypeVector %uint 4 ; Declares a SPIR-V <4 x i32> type
         %49 = OpConstantNull %v4uint ; Declares <4 x i32> zeroinitializer
%test_sampler = OpFunction %void None %40
%srcimg_coerce = OpFunctionParameter %38
   %s_coerce = OpFunctionParameter %39  ; These three lines declare the function void @test_sampler(Image, Sampler)
         %46 = OpSampledImage %45 %srcimg_coerce %s_coerce
         %51 = OpImageSampleExplicitLod %v4float %46 %49 Lod %float_1 ; These two lines are SPIR-V instructions
               OpReturn
               OpFunctionEnd

Since I expect few people here can read SPIR-V, I’ve annotated some of the lines to explain what they’re doing. There are a few things to note here.

As @nhaehnle pointed out, image types have a lot of parameters to them. These parameters today use suffixes on a special struct name in LLVM to encode these parameters. This actually isn’t the only representation of these types, however: there’s a different set of LLVM struct names that correspond to OpenCL types, another potential set of LLVM struct names for SYCL type names, and the Itanium name mangling representation for each of these cases is usually different again from LLVM struct names. There exist passes in the llvm-spirv translator tool that will rewrite type names to the SPIR-V representation to handle this diversity of names.
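
As an illustration of the suffix encoding, the struct name `spirv.Image._float_1_1_0_0_0_0_0` from the example above can be split back into its eight parameters. The following is a minimal sketch, not the translator's actual parser: `ImageParams`, `parseImageName`, and `formatImageName` are hypothetical names, the field order is assumed to follow the parameter list given earlier in the thread (sample type, dimension, depth, arrayed, multisampled, sampled, format, access kind), and it assumes the sampled type name itself contains no underscore.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical holder for the eight image parameters.
struct ImageParams {
  std::string sampledType;
  int dim, depth, arrayed, multisampled, sampled, format, access;
};

// Parses names of the form "spirv.Image._<type>_<seven integers>".
// Returns false if the name does not have that shape.
bool parseImageName(const std::string &name, ImageParams &out) {
  const std::string prefix = "spirv.Image._";
  if (name.compare(0, prefix.size(), prefix) != 0)
    return false;

  // Split the suffix on '_' into the type name plus seven integers.
  std::vector<std::string> parts;
  std::stringstream ss(name.substr(prefix.size()));
  std::string part;
  while (std::getline(ss, part, '_'))
    parts.push_back(part);
  if (parts.size() != 8 || parts[0].empty())
    return false;

  int *fields[] = {&out.dim,     &out.depth,  &out.arrayed, &out.multisampled,
                   &out.sampled, &out.format, &out.access};
  for (int i = 0; i < 7; ++i) {
    const std::string &p = parts[i + 1];
    // Accept only short runs of decimal digits so std::stoi cannot throw.
    if (p.empty() || p.size() > 9 ||
        p.find_first_not_of("0123456789") != std::string::npos)
      return false;
    *fields[i] = std::stoi(p);
  }
  out.sampledType = parts[0];
  return true;
}

// Inverse of parseImageName: encodes the parameters back into one string.
std::string formatImageName(const ImageParams &p) {
  assert(!p.sampledType.empty() && "sampled type must be named");
  std::ostringstream os;
  os << "spirv.Image._" << p.sampledType << '_' << p.dim << '_' << p.depth
     << '_' << p.arrayed << '_' << p.multisampled << '_' << p.sampled << '_'
     << p.format << '_' << p.access;
  return os.str();
}
```

A parameterized opaque type would carry these eight values structurally, so neither direction of this string round trip would be needed.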

The second thing to point out is that many of the custom SPIR-V instructions are represented as calls to unknown functions. For example, the OpSampledImage instruction is modeled in LLVM as a call to __spirv_SampledImage. These are generally morally equivalent to LLVM intrinsics, but they’re not actually implemented as LLVM intrinsics. It is for these functions in particular where I would like to have the elementtype attribute added.
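
Recognizing which builtin a call refers to is the easy part of dealing with these mangled names. The sketch below extracts the unqualified function name from a mangled name of the simple form `_Z<length><name><parameter encoding>`. It is only an illustration under stated assumptions: `simpleItaniumName` is a hypothetical helper, it deliberately ignores nested names, templates, and substitutions, and it makes no attempt to decode the parameter types that follow the name, which is where the real difficulty lies.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns the unqualified function name for manglings of the simple form
// "_Z<decimal length><name>...", or an empty string for anything else
// (for example nested names starting with "_ZN").
std::string simpleItaniumName(const std::string &mangled) {
  if (mangled.size() < 3 || mangled[0] != '_' || mangled[1] != 'Z')
    return "";

  // Read the decimal length prefix that precedes the name.
  std::size_t pos = 2;
  std::size_t len = 0;
  while (pos < mangled.size() &&
         std::isdigit(static_cast<unsigned char>(mangled[pos]))) {
    len = len * 10 + static_cast<std::size_t>(mangled[pos] - '0');
    if (len > mangled.size())
      return ""; // Length cannot fit in the string; also avoids overflow.
    ++pos;
  }
  if (len == 0 || pos + len > mangled.size())
    return "";
  return mangled.substr(pos, len);
}
```

Applied to the two declarations in the example above, this yields `__spirv_SampledImage` and `__spirv_ImageSampleExplicitLod_Rfloat4`; everything after those names is the parameter encoding that a full demangler would have to interpret.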

My idea of what this would look like with opaque types, stealing @nhaehnle’s syntax here (I’m not wedded to any particular syntax), is as follows:

%image2d = type "spirv.Image"(type float, i32 1, i32 1, i32 0, i32 0, i32 0, i32 0, i32 0)
%sampler = type "spirv.Sampler"
%sampled_image2d = type "spirv.SampledImage"(type %image2d)

define spir_func void @test_sampler(%image2d %srcimg.coerce, %sampler %s.coerce) {
  %1 = tail call spir_func %sampled_image2d @_Z20__spirv_SampledImagePU3AS1K34__spirv_Image__float_1_1_0_0_0_0_0PU3AS1K15__spirv_Sampler(%image2d %srcimg.coerce, %sampler %s.coerce) #1
  %2 = tail call spir_func <4 x float> @_Z38__spirv_ImageSampleExplicitLod_Rfloat4PU3AS120__spirv_SampledImageDv4_iif(%sampled_image2d %1, <4 x i32> zeroinitializer, i32 2, float 1.000000e+00) #1
  ret void
}

declare spir_func %sampled_image2d @_Z20__spirv_SampledImagePU3AS1K34__spirv_Image__float_1_1_0_0_0_0_0PU3AS1K15__spirv_Sampler(%image2d, %sampler)

declare spir_func <4 x float> @_Z38__spirv_ImageSampleExplicitLod_Rfloat4PU3AS120__spirv_SampledImageDv4_iif(%sampled_image2d, <4 x i32>, i32, float)

The ability to parameterize opaque types is useful here (especially with regards to the SampledImage type!), but it’s not really necessary, especially if the implementation complexity of supporting type or integer parameters is too high. In practice, it’s possible to encode everything as integers; even the first (type) parameter of spirv.Image would be drawn from a small, enumerable set of possible values.

From your description, it sounds like representing these as named struct types passed by value (rather than by pointer) would work (in conjunction with some custom ABI for passing them), though probably I misunderstood the requirements here.

Using named struct types in lieu of opaque types is not feasible, I believe. If the struct type is opaque, that eliminates the ability to use them in the cases where they are most critical. If the struct type is not opaque, I worry that LLVM optimizations are all too likely to fail to preserve the struct type. My take on LLVM is that a struct name is semantically meaningless, and I am highly wary of a design that requires relying on transformation passes preserving LLVM struct names. One of the ideas behind proposing an opaque type is to generate a type where the name of the type is semantically meaningful. Additionally, if (as you suggest) struct types go away in a few years, then we’d need to come right back to this question at that point anyways to get these types passed down to the ultimate backend. Better not to rely on a path whose future is itself shaky, IMHO.

While the example I gave doesn’t demonstrate it, I think there are some other necessary features for an opaque type:

%sample = type "backend.Opaque"()
; It would be an error to include the following line:
; %sample2 = type "backend.Opaque"()
; That is, the string that matters is the "backend.Opaque" (with arguments), not the %sample/%sample2

@global_var = global %sample zeroinitializer ; Declare global variables?

declare i32 @llvm.backend.intrinsic(%sample) ; Needs to be a parameter
declare float @llvm.backend.intrinsic2.tbackendOpaque(%sample) ; Don't forget mangling for types in intrinsics!
define void @foo(i1 %cond) {
  %addr = alloca %sample ; Support allocas for opaque types, because that's how frontends avoid SSA
  %var = load %sample, ptr %addr ; load/store falls out from above
  %result = select i1 %cond, %sample %var, %sample zeroinitializer ; Can use in select/phi, other dataflow instructions
  %meow = call i32 @llvm.backend.intrinsic(%sample %result) ; or calls
  ; From other examples, allowing bitcast is probably necessary. But some backends might not be able to codegen all bitcasts. [2]
  ret void
}

The main alternative to opaque types that I think is feasible would be using non-integral address spaces. Using address spaces has the advantage of not needing to modify the LLVM type system, but it also carries with it the baggage of assuming that the value is in fact a pointer and is usable in general pointer contexts. This notably includes the ability to use it in pointer arithmetic (i.e., geps), load/store to it, even construct a vector of weird address space types. On top of that, the existing address space infrastructure (for example, in the data layout string) tends to assume there are “not many” address spaces, and packing opaque types as address spaces rapidly exhausts the space.

Rather like address spaces, my idea is that opaque types are semantically defined by a target, and a target-independent optimization or analysis pass is limited only to being able to reason about the dataflow of types. Aside from dropping any connotations of acting like a pointer, opaque types also allow types to be identified by a string (and potentially even richer) representation, which would make linking multiple LLVM modules far easier to do without having to encode target-dependent assumptions about how types work. A typeid-like construct (if I’m following @asb’s description correctly) runs the risk that one module makes typeid(1234) represent a different opaque type (say the %sampler type in my example) than typeid(1234) in a different module (say the %image2d type), as what type it “really” corresponds to is conveyed only by module metadata.
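To make the typeid hazard concrete, here is a minimal standalone C++ sketch (not LLVM code). `IdModule`, `conflictingIds`, and the two factory functions are hypothetical names introduced only for illustration; the strings "sampler" and "image2d" stand in for the %sampler and %image2d types from the example.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a module that identifies opaque types by integer
// ID. What each ID "really" means lives only in per-module metadata.
struct IdModule {
  std::map<unsigned, std::string> Meaning;
};

// Returns the IDs that both modules define with different meanings. Linking
// such modules without any remapping would silently merge unrelated types.
std::vector<unsigned> conflictingIds(const IdModule &A, const IdModule &B) {
  std::vector<unsigned> Conflicts;
  for (const auto &Entry : A.Meaning) {
    auto It = B.Meaning.find(Entry.first);
    if (It != B.Meaning.end() && It->second != Entry.second)
      Conflicts.push_back(Entry.first);
  }
  return Conflicts;
}

// Illustrative module in which typeid(1234) means the sampler type.
IdModule makeSamplerModule() {
  IdModule M;
  M.Meaning[1234] = "sampler";
  return M;
}

// Illustrative module in which typeid(1234) means the 2D image type.
IdModule makeImageModule() {
  IdModule M;
  M.Meaning[1234] = "image2d";
  return M;
}
```

Here `conflictingIds(makeSamplerModule(), makeImageModule())` reports ID 1234. If the identity of a type is its string, the same two modules can be merged without consulting any side table.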

The biggest downside of this approach that I see is that it provides far less interface for a backend to communicate information, such as the size of an opaque type, to target-independent passes. As I’ve said before, the main existing interface for passing this information (datalayout) isn’t really set up for the potentially large number of types that SPIR-V would need. I did a bitfield to pack all of the possible types for SPIR-V and ended up needing 25 bits [3], and that’s excluding one data type (a matrix type) I couldn’t fit in because I didn’t have tight bounds on its numeric fields for rows/columns.
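Footnote [3] notes that a field with 5 possible values costs 3 bits in a bitfield. The standalone C++ sketch below works through that arithmetic and compares it with a tighter mixed-radix packing. It does not reproduce the actual SPIR-V field list: the function names and the choice of exactly two 5-valued fields are assumptions made for illustration.

```cpp
#include <cstdint>
#include <vector>

// Number of bits needed to distinguish `Count` distinct values.
unsigned bitsFor(std::uint64_t Count) {
  unsigned Bits = 0;
  while ((std::uint64_t{1} << Bits) < Count)
    ++Bits;
  return Bits;
}

// Width of a bitfield layout: every field is rounded up to whole bits.
unsigned bitfieldWidth(const std::vector<std::uint64_t> &Counts) {
  unsigned Total = 0;
  for (std::uint64_t C : Counts)
    Total += bitsFor(C);
  return Total;
}

// Width of a mixed-radix layout: the fields are digits of a single number,
// so only the product of the cardinalities is rounded up.
unsigned mixedRadixWidth(const std::vector<std::uint64_t> &Counts) {
  std::uint64_t Product = 1;
  for (std::uint64_t C : Counts)
    Product *= C;
  return bitsFor(Product);
}

// Mixed-radix encoding of two fields that each take a value in 0..4.
unsigned encodePair(unsigned A, unsigned B) { return A * 5 + B; }
unsigned decodeFirst(unsigned V) { return V / 5; }
unsigned decodeSecond(unsigned V) { return V % 5; }
```

For two 5-valued fields, `bitfieldWidth({5, 5})` is 6 while `mixedRadixWidth({5, 5})` is 5, which is the kind of saving that tighter packing would have to find to squeeze into an address space number.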

If you’re seeing problematic ptrtoint instructions being introduced, I’d encourage you to figure out what the root cause for these is, so we can possibly fix it. We really don’t want ptrtoint to be introduced as part of transforms for other reasons as well (provenance-related issues), so if we can avoid that, it would be great. There are some known places where this can happen (e.g. memcpy converted to load+store of integer), but it would be interesting to know which one is responsible here and what the relation to opaque pointers is (because I don’t think they should have any direct impact on this).

I didn’t try too hard to track it further down, but I did identify SROA as the pass that did it. I suspect it comes about because SROA sees one use as a ptr and another as an i64 (generated, I guess, from a store (load ptr) being canonicalized to i64, but this is purely a guess, with no effort expended to identify the source).

Your proposal would then result in it becoming ptr elementtype(ptr). But that still isn’t good enough.

As strange as it sounds, ptr elementtype(ptr) is probably good enough, especially if opaque types are no longer represented as pointers. In my testing so far on working on kernels, the biggest issues with pointer element types are around the use of what are effectively intrinsics being represented as functions and not LLVM intrinsics. For example, one of the functions in question might be __spirv_AtomicLoad, which is going to act much like load atomic. I need to know the type of the pointer in this function to generate the correct type of the operation. But just as load ptr, ptr %addr is perfectly fine to codegen as if it were load i8*, i8** %addr, so too could I see a __spirv_AtomicLoad(ptr %addr elementtype(ptr)) as __spirv_AtomicLoad(i8** %addr).

My working assumption in the entire opaque pointer conversion process has been that only a single level of indirection is actually necessary. While this assumption hasn’t been entirely borne out, ring-fencing the double indirection to only those operands directly used in memory instructions has so far proven sufficient. I still need to do some more experimentation to confirm this fully, and this is partly stymied by my inexperience in the clang frontend preventing me from figuring out how to hack it to generate elementtype on pointer-valued function operands for all functions.

However, the LLVM type system can diverge further in the future – for example, what if we were to drop struct types at some point (not going to go into details here, but this is more viable than it sounds)? Or maybe we want to drop function types and replace them with some non-type-based ABI info (that also integrates ABI-affecting attributes, rather than just types)?

I believe a loss of struct types [4] would not be fatal to backends like SPIR-V, although you might get some grumbling from the people who work on accelerator backends at the loss of specificity in type information. Dropping function types would likely be more fatal, and not just for backends like SPIR-V that have relatively high-level type information, but likely for any backend that compiles to another IR format, such as PTX or WASM. For such cases, you’d probably still want some Type-like construct that can represent a pair of argument and result type vectors [5], and common infrastructure for mapping the ABI-based function type to the type-based ABI. Note that something like this is what I’m proposing in the final part of the RFC.

[1] If you’re interested in understanding the SPIR-V output better, the specification of SPIR-V can be found at SPIR-V Specification. Searching for “OpFoo” tends to be the fastest way to get to an explanation of any particular instruction.

[2] Bitcasts are a thorny topic. I see that x86_mmx and x86_amx explicitly permit bitcasts today, which suggests that opaque types should allow bitcasts. But I wouldn’t want, for example, SROA to automatically generate a bitcast if someone does something like type-pun via a union. In such a case, I’d prefer SROA to instead retain the IR as a store/load pair, as if someone tried to use a union to bitcast a double to a pointer (cf. Compiler Explorer).

[3] The LLVM LangRef states that address spaces have 24 bits. So, practically, this means representing these as address spaces requires even tighter bitpacking than a bitfield (there are a couple of fields that have 5 possible values, which requires 3 bits of space to represent).

[4] The one case where struct types are still likely to be absolutely necessary is representing instructions or intrinsics that return multiple values, but I guess you’re assuming that LLVM truly supports multiple return values in that case.

[5] Note that today’s FunctionType only represents a result type vector indirectly, via a struct type.

Thanks for the further information on SPIR-V representation. I want to try to take a step back to characterise what I think we would need to achieve with opaque types for the SPIR-V/DXIL/Wasm (and more?) use-cases, to check we see the problem in the same way.

With LLVM’s current type system, you take your arbitrarily rich and complex frontend types, and convert them to LLVM types. This inherently loses information including the identity of the frontend types. A key thing the LLVM type system is trying to do is to provide enough facilities to encode the memory layout of any lowered types. For most cases this is sufficient, but we encounter problems when the compilation target is itself a typed IR which requires the identity of types specified in the frontend to be maintained throughout compilation to produce correctly typed output.

Trying to extend LLVM’s type system to support arbitrarily complex external type systems is a non-starter, so what is the minimum we can add? I think the key thing we need to support is maintaining type identity, which the opaque type proposal provides. A type is defined, LLVM may know very little about it (in the wasm case at least, the memory layout is completely opaque), but it does maintain its identity, guarantees that type information won’t be lost, that it won’t be cast to a different type, and other values won’t be cast to it. In some cases, additional information about these external types may enable more optimisations (you mention it would be useful in your use cases to access information such as size of types - is there other information you’d need to access in target-independent passes that might be essential to correctness or reasonable performance?).

In terms of encoding that type identity, I’d mentioned the integer ID as I’d found it a useful starting point for prototyping by using non-integral address space IDs. I think regardless of whether you’re using integer IDs or strings, you’ll still need the ability to have target-specific logic when linking different LLVM modules in order to ensure LLVM can correctly maintain type identity for external type systems. This logic would either need to rewrite typeids or type strings as appropriate: consider nominal types in different modules for instance. Possibly the opaque type proposal could be extended to support this (identifying imported/exported types etc) in the general case, but providing the minimal primitive and letting target-specific logic handle target-specific details for combining types between modules feels like it might be a better starting point.
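As a rough illustration of what such linking logic might look like for integer IDs, the standalone C++ sketch below remaps a source module's IDs into a destination module's numbering. Everything in it is hypothetical: `TypeTable`, `remapTypeIds`, and `demoRemap` are invented names, and matching types by name is only a placeholder for whatever target-specific rule decides that two types are the same.

```cpp
#include <algorithm>
#include <map>
#include <string>

// Hypothetical linker-side table: type name -> ID in the destination module.
using TypeTable = std::map<std::string, unsigned>;

// Maps every ID used by the source module to an ID that is valid in the
// destination module. Types the destination already knows reuse its ID;
// unknown types get a fresh ID and are added to the destination table.
std::map<unsigned, unsigned>
remapTypeIds(TypeTable &Dest, const std::map<unsigned, std::string> &Src) {
  unsigned NextId = 0;
  for (const auto &Entry : Dest)
    NextId = std::max(NextId, Entry.second + 1);

  std::map<unsigned, unsigned> Remap;
  for (const auto &Entry : Src) {
    auto It = Dest.find(Entry.second);
    if (It == Dest.end())
      It = Dest.emplace(Entry.second, NextId++).first;
    Remap[Entry.first] = It->second;
  }
  return Remap;
}

// Destination uses 1234 for "sampler"; the source uses 1234 for "image2d"
// and 7 for "sampler". The IDs and names are illustrative only.
std::map<unsigned, unsigned> demoRemap() {
  TypeTable Dest = {{"sampler", 1234}};
  std::map<unsigned, std::string> Src = {{7, "sampler"}, {1234, "image2d"}};
  return remapTypeIds(Dest, Src);
}
```

In the demo, the source's ID 7 is rewritten to the destination's existing 1234, and the source's clashing 1234 is moved to a fresh ID. String-identified types would need the same "are these the same type?" decision, but no renumbering step.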

Does anything above differ drastically from your own thinking?

Thanks for the detailed explanation! I think I understand the general need for opaque types now. It stands to reason that targets (especially non-CPU targets) may support additional types that are not part of the LLVM type system, and we should provide some kind of extension mechanism for such cases, with backend-specific lowering.

I think one bit that’s not completely clear to me is why the existing support for opaque types doesn’t already cover this. I initially thought that passing opaque types as function arguments or return values might be illegal, but apparently we do allow this kind of code:

%test = type opaque
define %test @foo(%test %arg) {
  ret %test %arg 
}

Being able to pass such a type as a function argument/return seems like it should be sufficient, as long as all operations on it are implemented using intrinsics. (Which seems like it would be necessary anyway, as there’s probably no common set of requirements or supported operations between different targets and use-cases.)

Is the concern here that given %t1 = type opaque and %t2 = type opaque, LLVM might decide to replace one of them with the other, because they are “equivalent”? It’s my understanding that this is not legal, and specifically the reason why we have a distinction between “identified” and “literal” structs, where only the latter are structurally uniqued.

Any particular reason why these aren’t treated as actual intrinsics, along the lines of llvm.spirv.sampled.image()? Note that LLVM considers anything in the llvm. namespace as an intrinsic, it’s not necessary for it to be part of the TableGen intrinsic infrastructure. I believe using an llvm. name would allow you to use elementtype attributes.

It mostly does. The thorniest question to me is how to preserve type parameters. @jcranmer showed how they can be encoded in the type name, but that puts us in the awkward situation of sometimes having to parse those type names to extract relevant information.

Hmm, that’s interesting. So you get an IntrinsicInst, but getIntrinsicID() doesn’t work properly, which is a bit strange. Could we perhaps add an extension here that allows registration of more intrinsic names and corresponding IDs?

Opaque struct types appear to be usable in more circumstances than I was expecting, but still not enough:

opt: test.ll:8:18: error: Cannot allocate unsized type
  %meow = alloca %test
                 ^
opt: test.ll:10:17: error: loading unsized types is not allowed
  %value = load %test, ptr %meow
                ^

As I mentioned previously, the frontend (principally Clang) is going to want to stick local variables in alloca to avoid having to do the SSA legwork itself, and being unable to use these types in alloca, load, and store makes them insufficient. Additionally, there are definitely people who want to store some of these in structs, although that is somewhat less of an issue.

Any particular reason why these aren’t treated as actual intrinsics, along the lines of llvm.spirv.sampled.image()? Note that LLVM considers anything in the llvm. namespace as an intrinsic, it’s not necessary for it to be part of the TableGen intrinsic infrastructure. I believe using an llvm. name would allow you to use elementtype attributes.

I did not originate this code myself, but I suspect the initial reason for not using actual intrinsics was to avoid having to modify LLVM to identify intrinsics. The original authors were probably unaware of the llvm.-but-not-registered being counted as an intrinsic, but as @nhaehnle points out, an IntrinsicInst with a not-quite-working getIntrinsicID() is weird. There is also a level of discomfort in actually calling functions with periods in their name–a lot of these functions are used by defining the list of functions as a C header file that is automatically included, and defining a C function named __spirv_SampledImage is far easier than llvm.spirv.SampledImage.

I see. So in summary, this is my understanding of the requirements:

  1. The type should be opaque and thus not bitcastable.
  2. The type should be identified, i.e. we need to distinguish different opaque types by name or at least ID.
  3. The type should be sized and thus usable in load/store/alloca.
  4. Ideally, the type should allow attaching additional information (such as type parameters or constants), but this is not strictly required.

The first two requirements are satisfied by current opaque types, but the 3rd one is not, as all opaque types are currently unsized.

Overall, this seems like a pretty straightforward extension.

Clang supports an asm attribute that can be used to link a C function declaration to an intrinsic. This functionality is explicitly unstable, so I wouldn’t necessarily encourage its use, but it provides an easy way to emit intrinsics as long as the header can be tightly coupled to the compiler version.
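A minimal sketch of the asm label mechanism, assuming an ELF target such as Linux where C symbol names are emitted without a prefix. To stay self-contained it binds a declaration to an ordinary function rather than to an intrinsic; `backend_impl` and `frontend_name` are hypothetical names.

```cpp
// Ordinary definition standing in for the symbol the header wants to reach.
// extern "C" keeps the emitted symbol name exactly "backend_impl".
extern "C" int backend_impl(int x) { return x + 1; }

// The asm label tells the compiler that calls to frontend_name should be
// emitted as references to the symbol "backend_impl", not to the mangled
// C++ name of frontend_name.
int frontend_name(int x) __asm__("backend_impl");

// For the intrinsic case described above, the label would instead name an
// intrinsic, for example (untested illustration, Clang-only, unstable):
//   float my_sqrt(float) __asm__("llvm.sqrt.f32");
```

Under the stated assumption, a call such as `frontend_name(41)` resolves to `backend_impl` at link time. A header of C-friendly names like __spirv_SampledImage could use the same trick to avoid spelling a dotted llvm. name in source code.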