Problem using llvm:Type.getPointerAddressSpace

Really sorry if this is a dumb/n00b question!

I’m trying to debug this code…

    void addPointerParameter(llvm::Type *storageType) {

…it’s currently crashing when passed this type…

(lldb) e storageType->dump()
%T4main5Stats33_19C37605F80094DED36B2D41C8D0773FLLV = type <{ %Ts5UInt8V, %Ts5UInt8V }>

( %Ts5UInt8V is itself a simple struct containing an i8)
I would expect getPointerAddressSpace() to always work for all types and I sort of assumed it would read the datalayout then return the appropriate address space.

The code called seems to be from llvm/include/llvm/IR/DerivedTypes.h

unsigned Type::getPointerAddressSpace() const {
  return cast<PointerType>(getScalarType())->getAddressSpace();

In this case, getScalarType() leaves the type unchanged and cast<PointerType> crashes, presumably because this is a struct, not a pointer?

I’m wondering if I’m holding this wrong, or just have old (20 month old) code or (most likely) if I’ve fundamentally misunderstood something!?

Thanks for any advice or help anyone can give.


Genrally, in the Type class, getXYZProperty functions cast the type to an XYZ and then call getProperty, which is quite in line with what you’re seeing here.

Machines that have default address-spaces than 0 seem to do it for allocations of different purpose (alloca, global, and “program” from DataLayout), not by the type that’s being allocated there. So I don’t think Type has a function to get a good address-space for it.

The address space is a property of pointer types / vector of pointer types only. It doesn’t make sense to query it on a struct or anything else

Yeah. I get pointer address spaces are only meaningful with regard to a pointer. I think of them as “part” of a pointer.

I realised a bit more background might be helpful…

The platform/target I’m working on is AVR microcontrollers, which have two pointer address spaces, space 0 for pointers to objects, structs, etc. which are all stored in the microcontroller RAM.

Address space 1 on the llvm AVR backend is for function pointers, because program memory is stored in flash memory, which is completely separate from RAM.

I inherited some third party code and I’m trying to fix it to work with AVR.

The code I inherited was…

void addPointerParameter(llvm::Type *storageType) { ParamIRTypes.push_back(storageType->getPointerTo()); }

Which worked in 99% of cases, because it defaulted to address space 0 for the pointer, however this was breaking in my one case where the storageType was a function pointer. So rather stupidly I just added a call to get the pointer address space, assuming it would come with sensible solutions for any type somehow.

But the function isn’t called getPointerAddressSpaceForPointerTo()

So I see my mistake.

That’s sort of the function I need!

Really all I want is to modify getPointerTo to be aware of address spaces I think? At the moment it’s taking an address space as a parameter and that seems like bad or at least strange design. Why would the call site want to produce different types of pointer in these cases?

It feels like some internal logic should be able to say “you are trying to get a pointer to a function type… function types live in address space 1 on this architecture according to the data layout, so I’ll return a pointer in address space 1” or “you are trying to return a pointer to a simple struct, those live in address space 0 on this architecture according to the data layout, so I’ll return a pointer in address space 0”

Does that make sense?

This is specific to your narrow usecase. Other users don’t necessarily use address spaces in a simple mapping of specific kinds of values to specific address spaces. getPointerTo is really to deal with the pointee element type, which in the world of opaque pointers is pretty obsolete.

So you’re saying there’s (probably) no way I can fix this legacy code I inherited then? That’s probably useful info anyway… stop me wasting time on it!!

I don’t think you mean that, I think you mean storageType was a function type and so the result was a function pointer. You could easily adapt your code to check if storageType is a function type and pass the program address space to getPointerTo instead of using the implicit default of 0.

Because values of the same type can live in different address spaces. Sometimes these address spaces may even alias.

This looks like Swift’s IRGen to me…

Exactly right, thanks for clarifying. And I think you’re right again. That seems like the right sort of approach… I’m trying to adapt the code to work generally. But your idea has really cleared it up for me, thanks! Now I’m thinking the best approach is a new target specific virtual function that will work out the address space that would be needed if creating a pointer to the storageType. By default, it will return 0 and on the AVR Target, it will first examine the Type to see if it’s a Function and return the program/function address space in that case. I’ve just got to work out where/how to add this virtual function. Until now I’ve spent all my llvm time in lib/Target/AVR, so I’m quite new to the “wider” llvm world!

That’s really interesting! I really love learning about llvm. The most fascinating thing is the breadth of platforms and interesting uses of it. :smiley:

My job doesn’t usually ‘allow’ me to spend too much time learning llvm and fun stuff like this. Great when it does!

Spot on! I write the Swift for Arduino compiler, that builds Swift onto Arduino Uno boards, compatibles and other AVR microcontrollers. We have our own fork. We upstream occasionally, when it’s 100% a real bug-fix but mostly it’s just quirky stuff like this for us… as @arsenm said… pretty niche stuff! We started with a super limited language and have slowly and carefully been adding features into it from the main language. It seems this particular code is needed for the function pointers used inside closures to work correctly! (but I digress… this is the llvm forum, not the swift forum!) :slight_smile:

Sorry to ask for hand holding, I’m a bit new to wider llvm code!

I’m looking at this fairly simple patch, basically restoring the old IR code in most cases, then hard coding program address space when it’s a function type.

    /// Add a pointer to the given type as the next parameter.
    void addPointerParameter(llvm::Type *storageType) {
      if (isa<llvm::FunctionType>(storageType)) {
        // this is a temporary hack, we should read the program data space
        // from the target data layout
      } else {

Does that look OK or will it blow up on me?

I had planned to add a virtual method to llvm::Type like unsigned getPointerAddressSpaceOfPointerTo(), provide a base implementation that returns 0, then override it in llvm::FunctionType to return the program address space. But I hadn’t realised that llvm::Type and its subclasses don’t seem to have any virtual methods currently. So when I added one, loads of warnings appeared about “not defining a virtual destructor in a virtual class”, which made me think I was disturbing the structure of current llvm IR code too much with this hack, so I went back to the above, simpler approach. Does that sound OK?

I’m wary of relying on the pointee type to determine an address space. In any case this isn’t a common / generic enough use case to push into the base classes

Yeah, the more I think of it, the more I’m sure this doesn’t warrant any change in llvm. I think the change has to be confined to the calling code (the swift compiler’s IRGen sub unit). The one improvement I’ll make when time permits is at least in that spot, using the data layout rather than hard coded constants.