32bit pointers on a (pure) 64bit architecture

Hi,

I am trying to get llvm working for an architecture that has 64bit registers, but 32bit addresses.

Because of that, I want the pointers to also be 32bit, although they will live in a 64 bit register.

On the frontend, I do not encounter any issues, but when I provide a

“p:32:32:32”

DataLayout specification to the backend, things get ugly:

  • SelectionDAG is producing a mix of 64bit and 32bit nodes, and it seems that a number of
    necessary legalizations/promotions are missing.

After adding support for ‘PromoteIntRes_GlobalAddress(…)’, I get a failure, because a ‘store’ node now has a mix of operands:

  • some of the operands were already ‘i64’ from the beginning,

  • others were ‘i32’ (due to the 32bit pointers) and have been promoted to ‘i64’

Because of the latter, the ‘PromoteIntOp_STORE’ is called. This uses ‘GetPromotedInteger’ to access the operands.

But, GetPromotedInteger fails if the operand was not promoted. So it fails on the operands of the ‘store’ that were already legal (i64).
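
To make the failure mode concrete, here is a minimal toy model of it. This is not LLVM code: ToyVT, ToyValue, ToyLegalizer and getPromotedOrSelf are hypothetical names invented for this sketch, and the tolerant lookup only shows one conceivable direction, not a confirmed fix.

```cpp
#include <cassert>
#include <map>

// Toy stand-ins for value types and DAG values; none of these are LLVM types.
enum class ToyVT { i32, i64 };
struct ToyValue { int id; ToyVT vt; };

class ToyLegalizer {
  std::map<int, ToyValue> promoted_;  // original node id -> promoted value
  int next_id_ = 1000;

public:
  // Assumption for this sketch: only i64 is a legal type.
  static bool isLegal(ToyVT vt) { return vt == ToyVT::i64; }

  // Record that an illegal i32 value has been promoted to i64.
  ToyValue promote(const ToyValue &v) {
    ToyValue p{next_id_++, ToyVT::i64};
    promoted_[v.id] = p;
    return p;
  }

  // Strict lookup: returns nullptr when the value was never promoted.
  // This models the lookup that fails on operands that were i64 all along.
  const ToyValue *findPromoted(const ToyValue &v) const {
    auto it = promoted_.find(v.id);
    return it == promoted_.end() ? nullptr : &it->second;
  }

  // Hypothetical tolerant lookup: operands that are already legal pass
  // through unchanged, only illegal ones go through the promotion map.
  ToyValue getPromotedOrSelf(const ToyValue &v) const {
    if (isLegal(v.vt))
      return v;
    const ToyValue *p = findPromoted(v);
    assert(p && "illegal operand was never promoted");
    return *p;
  }
};
```

In this model, a store with one operand that was i64 from the start and one promoted i32 operand trips the strict lookup on the first operand, while the tolerant lookup handles both.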

What would be the recommended way to fix this?

(Or did I miss something, and should 32bit pointers on a 64bit architecture be done in a completely different way?)

Greetings,

Jeroen Dobbelaere

From: "Jeroen Dobbelaere" <jeroen.dobbelaere@gmail.com>
To: LLVMdev@cs.uiuc.edu
Sent: Thursday, April 3, 2014 5:31:13 PM
Subject: [LLVMdev] 32bit pointers on a (pure) 64bit architecture


I don't understand why you're doing this. If the pointers live in 64-bit registers, why don't you just consider them to be 64-bit pointers? The fact that the upper 32 bits will always be zero does not seem like something you'd need to worry about (although there are certainly some optimizations that can be done later). Frontend issues, like the fact that ptrdiff_t is only 32 bits, seem like an independent concern.

-Hal

How are you implementing GlobalAddress? R600/SI has both 32-bit and 64-bit pointers, and (sort of) 64-bit registers if you want to see how it handles it.

Hi Hal,


The main reason to do this is to use less space for a pointer. This is
then reflected in the size of structs that contain pointers, etc.

As the layout of a structure is based on the DataLayout string, both the
DataLayout of clang and that of the backend need to be kept in sync:
- When I provide 32bit pointers to clang, but 64bit pointers to the
backend, the frontend and middle end do indeed use 32bit
pointers. But from the moment that a pointer member needs to be accessed
(by the backend), the offset is wrong (as the backend suddenly assumes
64bit pointers).

(As in 'struct Foo { int bar; struct Foo* next; struct Foo* prev; };')
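
The offset mismatch for this struct can be worked out with a small sketch. It assumes a 4-byte int, pointers aligned to their own size, and a simple natural-alignment layout rule; Field, fieldOffsets and fooOffsets are hypothetical helpers written for this illustration, not part of clang or LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A field described by its size and alignment, both in bytes.
struct Field {
  std::size_t size;
  std::size_t align;
};

// Place each field at the next multiple of its alignment and
// return the resulting byte offsets.
std::vector<std::size_t> fieldOffsets(const std::vector<Field> &fields) {
  std::vector<std::size_t> offsets;
  std::size_t offset = 0;
  for (const Field &f : fields) {
    assert(f.align != 0 && "alignment must be non-zero");
    offset = (offset + f.align - 1) / f.align * f.align;  // round up
    offsets.push_back(offset);
    offset += f.size;
  }
  return offsets;
}

// Offsets of bar, next and prev in
//   struct Foo { int bar; struct Foo* next; struct Foo* prev; };
// for a given pointer size (assumed: int is 4 bytes, pointer
// alignment equals pointer size).
std::vector<std::size_t> fooOffsets(std::size_t ptrBytes) {
  return fieldOffsets({{4, 4}, {ptrBytes, ptrBytes}, {ptrBytes, ptrBytes}});
}
```

Under these assumptions, fooOffsets(4) gives offsets 0, 4, 8 and fooOffsets(8) gives 0, 8, 16, so a frontend using 4-byte pointers and a backend using 8-byte pointers disagree on where 'next' and 'prev' live.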

So, that's why I believe the right way to handle this is by specifying
"p:32:32:32" in both datalayout strings.
The fact that, because of this, llvm is producing a selection DAG that
cannot be handled by legalization seems more like a bug to me, and I would
like to find out what the recommended way is to fix this.
[...]

Greetings,

Jeroen Dobbelaere

Presumably pointers are also only 32 bits in memory. I'm not quite sure what the issue here is though, because the DataLayout string just defines in-memory representations. In particular, it *doesn't* differentiate between the size and the range of a pointer[1].

When you get to the back end, you can just do zero extension of pointer loads. This is slightly tricky, because SelectionDAG is very keen on throwing away the fact that something is a pointer, but the same type legalisation that works for unsigned 32-bit integers should apply to pointers...
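
As a plain C++ illustration of what zero extension of a pointer load means (not backend code; loadPointerZext is a hypothetical helper name, and the 4-byte in-memory pointer is the assumption from this thread):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Read a pointer that occupies 32 bits in memory and widen it to the
// 64-bit register width. The upper 32 bits of the result are always zero.
std::uint64_t loadPointerZext(const unsigned char *slot) {
  assert(slot != nullptr);
  std::uint32_t narrow;
  std::memcpy(&narrow, slot, sizeof narrow);  // 32-bit in-memory value
  return static_cast<std::uint64_t>(narrow);  // unsigned widening = zext
}
```

Because the source type is unsigned, the conversion never copies the top bit of the 32-bit value into the upper half, which is the behaviour wanted for addresses.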

David

[1] I have some patches that do this, for fat pointer support.

From: "Jeroen Dobbelaere" <jeroen.dobbelaere@gmail.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: LLVMdev@cs.uiuc.edu
Sent: Friday, April 4, 2014 3:54:08 AM
Subject: Re: [LLVMdev] 32bit pointers on a (pure) 64bit architecture

> So, that's why I believe the right way to handle this is by
> specifying "p:32:32:32" in both datalayout strings.

Agreed; but what does your *ISelLowering::getPointerTy() function look like? Did you override the default implementation? I wonder if returning MVT::i64 from this function will make things work for you.

-Hal

Hi Hal,

From: "Jeroen Dobbelaere" <jeroen.dobbelaere@gmail.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: LLVMdev@cs.uiuc.edu
Sent: Friday, April 4, 2014 10:07:07 AM
Subject: Re: [LLVMdev] 32bit pointers on a (pure) 64bit architecture


I actually missed implementing this one. But, depending on the
testcase that I use, implementing it either has no effect or makes
things worse:

I now get assertion errors complaining of different value sizes,
coming from the difference between the datalayout (ptrsize: i32) and
the pointer type (size overruled to i64), in
SelectionDAG::InferPtrAlignment.

I also found out that this method is not called at all (for my
(failing) tests) during the initial selection DAG buildup.

Note:
- most of my testing is done on llvm-3.3, but I verified that the
same issues are there for llvm-3.4

- I did implement:
virtual MVT getScalarShiftAmountTy(EVT LHSTy) const { return MVT::i64; }

although I have no good idea why the default for that is mapped to
the pointer type...

In the following sample:
- if full 64bit pointers are used, XXXi32 will be 'i64' and
everything legalizes/compiles fine
- if 32bit pointers are used, XXXi32 will be 'i32' and legalization
fails.

*** IR Dump After Module Verifier ***
; Function Attrs: nounwind
define void @foo(i32 %lhs) #0 {
entry:
store i32 %lhs, i32* @g_bar, align 4, !tbaa !0
ret void
}
Computing probabilities for entry

Initial selection DAG: BB#0 'foo:entry'
SelectionDAG has 9 nodes:
0xbb93e8c: ch = EntryToken [ORD=1]
0xbbb8428: XXXi32 = Constant<0>
0xbb93e8c: <multiple use>
0xbb93e8c: <multiple use>
0xbbb8208: i64 = Register %vreg0 [ORD=1]
0xbbb8290: i64,ch = CopyFromReg 0xbb93e8c, 0xbbb8208 [ORD=1]
0xbbb8318: i32 = truncate 0xbbb8290 [ORD=1]
0xbbb83a0: XXXi32 = GlobalAddress<i32* @g_bar> 0 [ORD=1]
0xbbb84b0: XXXi32 = undef [ORD=1]
0xbbb8538: ch = store 0xbb93e8c, 0xbbb8318, 0xbbb83a0, 0xbbb84b0<ST4[@g_bar](tbaa=!"int")> [ORD=1]
0xbbb85c0: ch = PDISD::NODE_RET_FLAG 0xbbb8538
-----

Is this difference indeed what we should expect at this level?
- should we teach the selection DAG builder to use the first larger
or equal valid type (i64) here?

- or should we make sure that the legalization phase accepts the mix
of types that is produced in that case?

Is i32 a legal type? In the PowerPC backend, in 64-bit mode, we have 32-bit subregisters of the 64-bit registers, and both i32 and i64 are legal types. I suspect that if you do something similar, things will work for you.

-Hal

On this architecture, i32 is _not_ a legal type. I can take a look at that,
but I was trying to avoid it :(.

Greetings,

From: "Jeroen Dobbelaere" <jeroen.dobbelaere@gmail.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: LLVMdev@cs.uiuc.edu
Sent: Friday, April 4, 2014 10:37:13 AM
Subject: Re: [LLVMdev] 32bit pointers on a (pure) 64bit architecture


Fair enough. I think that, although it will involve some extraneous messiness in your register definitions, it should not actually be that bad. You'll want to set most of the operation actions to Promote in your *ISelLowering.cpp file, promoting them to i64. There are probably some hidden assumptions that ADD is always available for all legal types, but that's not a huge problem because, like AND, OR, etc., ADD on the 32-bit 'sub-registers' can be done with a pattern using the 64-bit instruction (again, the PowerPC backend does exactly this).
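
The reason a 64-bit ADD can stand in for a 32-bit one is that the low 32 bits of a sum depend only on the low 32 bits of the operands. A minimal sketch in plain C++ (add32Via64 is a hypothetical name for this illustration, not something from the PowerPC backend):

```cpp
#include <cstdint>

// Model an i32 add carried out with the 64-bit add instruction.
// The operands sit in 64-bit registers whose upper halves may hold
// anything; only the low 32 bits of the result are kept.
constexpr std::uint32_t add32Via64(std::uint64_t a, std::uint64_t b) {
  return static_cast<std::uint32_t>(a + b);  // truncate to the sub-register
}
```

Whatever the upper halves contain, the truncated result matches a native 32-bit wrapping add of the low halves. The same holds for AND and OR, but not for operations such as right shifts or division, where upper bits can affect the low result.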

All of this, however, does come with some advantages. Because the DAGCombiner and later passes will "understand" the 32-bit nature of the pointer values, they will be able to fold away some masking operations without you needing to do anything special.

-Hal

PNaCl does exactly this: all its targets are le32, including x86-64. Intel also had a patch (which didn’t make it into LLVM) which added x32 to LLVM.

You may want to look at both of these.