Is address space 1 reserved?

On the review for , commented that:

I’m not aware of any such restriction, and I know of several LLVM based systems that use address space 1 for something other than that.

-Owen


First I’ve heard of it…

The only address spaces with special meanings I know of are:

  • 0 (the normal address space, null is not dereferenceable)
  • 256 - TLS, GS relative addressing
  • 257 - FS relative addressing

I didn’t even know 256/257 had special meanings. I thought they were only used by x86. It would be good to clarify them too, just in case other targets ever want to use them.

Thanks,
Pete

Sorry, let me clarify. To my knowledge, 256/257 are only reserved on x86.

I was repeating something that Nick told me a while ago; he mentioned it again on the list: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150105/251080.html

Nick asked for it to get documented but it looks like there is disagreement as to whether or not it really does (or perhaps should) exist.

Sorry, let me clarify. To my knowledge, 256/257 are only reserved on x86.

Ah cool. Thanks.

Pete

I’m not aware of any such restriction, and I know of several LLVM based systems that use address space 1 for something other than that.

-Owen

Yes, this would be a problem for us. We use 1 for a normal address space where 0 is invalid. However, we also have a problem where some other address spaces do want 0 to be a valid address, which just sort of don’t work correctly now.

-Matt

If you have an example with a null in a non-0 address space being mishandled, please file a bug. We’ll fix them as we find them.

I'm not aware of any such restriction, and I know of several LLVM based
systems that use address space 1 for something other than that.

Oof. It was discussed when the patches to add addrspace were being
considered, and this is why we should've written it down.

It would be nice to have an addrspace that does mean "same as addrspace(0)
except that null may be dereferenceable", and to attach that to
-fno-delete-null-pointer-checks. Any ideas for what that addrspace should
be?

If you have an example with a null in a non-0 address space being mishandled, please file a bug. We’ll fix them as we find them.

I think the problems aren’t so much that accessing 0 doesn’t work (although I imagine there are problems with that), but expectations of comparison with null. The main problem I’m aware of is comparisons with null pointers. The first global object in addrspace(3) will have the address of 0, so if a user does if (x != NULL), it will not behave as expected. For C I think this is supposed to be fixed by changing the value of NULL to -1, but I don’t think that is currently supported. That is also complicated because the null value is different for different address spaces, and I think the actual null pointer value must be 0 for C++. It doesn’t really turn up often in real code so I don’t think anybody has really spent time thinking about how to properly solve this.

-Matt
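The comparison problem described above can be sketched in plain C++ by modelling a pointer into a small address space as a bare offset. This is only an illustration: `LocalPtr`, `firstObject`, and both null encodings are hypothetical names made up for the sketch, not anything in LLVM or clang.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: a pointer into a separately addressed memory
// (like the addrspace(3) case above) is just an offset from its base.
struct LocalPtr {
    std::uint32_t offset;
};

// Null encoded as offset 0, i.e. "null means address 0".
constexpr LocalPtr kNullZero{0};
// Alternative encoding: all-ones, the "-1" idea mentioned above.
constexpr LocalPtr kNullAllOnes{0xFFFFFFFFu};

// Compare a pointer against whichever null encoding is in use.
constexpr bool isNull(LocalPtr p, LocalPtr nullValue) {
    return p.offset == nullValue.offset;
}

// Assumption for the sketch: the first object placed in this
// address space lands at offset 0.
constexpr LocalPtr firstObject() {
    return LocalPtr{0};
}
```

With the zero encoding, `isNull(firstObject(), kNullZero)` is true even though the object is valid, which is the `if (x != NULL)` surprise. With the all-ones encoding the same object is no longer mistaken for null.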

Just to make sure I’m interpreting this right: the problem is essentially that we hard code “null” to mean address 0 in all address spaces? If we allowed the numeric value of null to be configurable per address space, would that resolve the issue at the LLVM IR level? Solving the frontend/language spec problem seems out of scope for LLVM, though probably not for clang. Can you point me to a usage of C++ with non-zero address spaces? I’d be curious to know what’s happening in this space.

It is very hard in principle to reserve an address space for target independent use once there are already clients defining their own mappings, because you risk breaking their existing bitcode files.

In practice, you could probably get away with it if you pick a stupidly high address space number.

-Owen

Actually, we had a similar discussion a while ago about this: http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-August/064624.html

In the link I gave, I proposed using global metadata to describe address spaces. It’s useful, for example, to know that an address space always points to constant memory, i.e., the CL model.

I think later in the conversation we also thought about defining the relationships between address spaces in a similar way to TBAA on types. Then you could do address space AA.

Pete

From what I can tell, that flag combines two semantics:

  1) An object can live at null; null is dereferenceable.
  2) Null checks should not be eliminated based on a previous dereference of that location.

I suspect most usage is for the second, probably for security mitigation in legacy programs. Using an address space seems like a great solution for a use case which actually requires both, but I’d be tempted to try something else for security mitigation. I wonder how far we could get performance-wise by null checking every access and then optimizing the checks away? This is strictly stronger.

Returning from that tangent, I’m not arguing against the reservation of such an address space. Since we seem to have precedent for larger numbers meaning reserved things, why don’t we document the following?

  • 0 - current meaning, default
  • 1-127 - reserved for custom extensions, no defined special semantics in upstream
  • 128 - like address space zero, but null is dereferenceable
  • 129-255 - reserved for future generic usage
  • 256+ - reserved for generic target specific address spaces

(This is strictly a strawman proposal; I have no attachment to this suggestion.)
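To make the strawman ranges concrete, here is a minimal sketch that maps an address space number to the category proposed above. The enum and function names are hypothetical, and the ranges simply restate the strawman; none of this is an agreed or documented LLVM convention.

```cpp
#include <cassert>

// Categories taken from the strawman numbering above (names are made up).
enum class AddrSpaceClass {
    Default,              // 0
    CustomExtension,      // 1-127
    NullDereferenceable,  // 128
    ReservedGeneric,      // 129-255
    TargetSpecific        // 256 and up
};

// Classify an address space number according to the strawman ranges.
inline AddrSpaceClass classifyAddressSpace(unsigned as) {
    if (as == 0)   return AddrSpaceClass::Default;
    if (as <= 127) return AddrSpaceClass::CustomExtension;
    if (as == 128) return AddrSpaceClass::NullDereferenceable;
    if (as <= 255) return AddrSpaceClass::ReservedGeneric;
    return AddrSpaceClass::TargetSpecific;
}
```

Under this scheme the existing out-of-tree uses of address space 1 fall in the custom range and keep their meaning, while the x86 values 256 and 257 stay in the target specific range.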

Just to make sure I’m interpreting this right: the problem is essentially that we hard code “null” to mean address 0 in all address spaces? If we allowed the numeric value of null to be configurable per address space, would that resolve the issue at the LLVM IR level?

Yes, it would. I’ve always imagined this to be a large undertaking, though.

Solving the frontend/language spec problem seems out of scope for LLVM, though probably not for clang. Can you point me to a usage of C++ with non-zero address spaces? I’d be curious to know what’s happening in this space.

There’s an AMD static C++ extension language, and a Khronos draft for an OpenCL C++ kernel language which applies the same sort of restrictions and address spaces to C++ as OpenCL C has. Last time I looked at this I remember that C allows a non-zero null pointer value, but 0 must be implicitly converted to the correct null pointer value and the NULL macro will expand to this integer value. I don’t think the OpenCL C spec touches the issue of different NULL values for different address spaces, but it does explicitly allow different sized pointers for each. I am less clear on what C++ requires, but C++11 4.10 says “A null pointer constant is an integral constant expression (5.19) prvalue of integer type that evaluates to zero or a prvalue of type std::nullptr_t.”

I’d agree on the scope, but it also seems fairly straightforward. If this becomes a serious issue, this seems like a workable approach.

Solving the frontend/language spec problem seems out of scope for LLVM,
though probably not for clang. Can you point me to a usage of C++ with
non-zero address spaces? I'd be curious to know what's happening in this
space.

I know of GPU-targeting compilers that use C++ as a frontend language. I
unfortunately don't have anything to point you to though.

-- Sean Silva

I’m a bit hesitant* to do this with metadata. At least to start with, these seem like backend specific properties. Why not introduce some hooks into Target or Subtarget with the appropriate queries?

* Reasons for hesitancy:

  • Not sure these are purely optimizations - is dropping always legal?
  • How do we merge such things in LTO?
  • Forward serialization? It might be better to define the properties better than design a reasonable scheme.

I’m a bit hesitant* to do this with metadata. At least to start with, these seem like backend specific properties. Why not introduce some hooks into Target or Subtarget with the appropriate queries?

* Reasons for hesitancy:

  • Not sure these are purely optimizations - is dropping always legal?

It would be global metadata, so it can’t be dropped (or just isn’t right now, so it’s OK anyway).

  • How do we merge such things in LTO?

That’s a good point.

  • Forward serialization? It might be better to define the properties better than design a reasonable scheme.

As is that.

I think at the time I proposed metadata we didn’t have TTI or anything else similar. I would be happy to say that things like null ptr deref are defined only for address space 0, and all other address spaces can only be optimised if TTI supports it. This means no TTI would default to not optimizing anything other than address space 0 which I think is good.

You could also move all of the checks to TTI and define that NoTTI gives an answer for address space 0 and ignores all others. Then you can just query TTI everywhere instead of special casing address space 0 everywhere.

Pete
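The "query a hook everywhere, with a default that only answers for address space 0" idea might look roughly like the sketch below. `AddressSpaceQueries`, `mayAssumeNullIsInvalid`, and `ExampleGpuQueries` are hypothetical stand-ins written for this illustration; they are not the actual TTI interface, and the example target's choices (1 behaves like 0, 3 can hold an object at address 0) are assumptions borrowed from the cases described earlier in the thread.

```cpp
#include <cassert>

// Hypothetical stand-in for a target hook; not an actual LLVM interface.
class AddressSpaceQueries {
public:
    virtual ~AddressSpaceQueries() = default;

    // True only if the optimizer may assume that null is never a valid
    // address to dereference in address space `as`.
    virtual bool mayAssumeNullIsInvalid(unsigned as) const {
        // Default ("no target info"): answer for address space 0 only,
        // and stay conservative for everything else.
        return as == 0;
    }
};

// Assumed example target: address space 1 behaves like 0, while
// address space 3 can hold a valid object at address 0.
class ExampleGpuQueries : public AddressSpaceQueries {
public:
    bool mayAssumeNullIsInvalid(unsigned as) const override {
        return as == 0 || as == 1;
    }
};

// An optimization asks the hook instead of special-casing address space 0.
inline bool canDropNullCheckAfterDeref(const AddressSpaceQueries &q,
                                       unsigned as) {
    return q.mayAssumeNullIsInvalid(as);
}
```

With the default object, only address space 0 is ever optimized. A target that opts in can widen that set without the optimizer hard coding any address space numbers.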

Either approach sounds fine; I have no opinion. Volunteers? :)