Allowing virtual registers after register allocation

Hi all,
Virtual ISAs such as WebAssembly and NVPTX use infinite virtual register sets instead of traditional physical registers. PrologEpilogInserter is run after register allocation and asserts that all virtuals have been allocated, but doesn’t otherwise depend on this if scavenging is not needed. We’d like to use the target-independent PEI code for WebAssembly, so we’re proposing a TargetRegisterInfo hook for targets to indicate that they use virtual registers in this way (currently called usesVirtualRegstersAfterRegAlloc(), other suggestions welcome). The code is at http://reviews.llvm.org/D15394 and an example of the intended use for WebAssembly is at http://reviews.llvm.org/D15344 .

The actual change to PrologEpilogInserter itself is quite minimal, but we thought we’d ask a wider audience for feedback since it’s a target-independent change. For WebAssembly we would implement prolog/epilog insertion and FrameIndex elimination but most of the rest of the PEI code (dealing with callee-saved registers, scavenging) does nothing.
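
A minimal sketch of how such a hook could gate the "no virtual registers left" check, written as a toy C++ model rather than against LLVM's real classes. The types `ToyRegisterInfo` and `ToyVirtualISARegisterInfo`, the function `canRunPrologEpilogInsertion`, and the spelled-out hook name are assumptions made for illustration; only the convention that virtual register numbers carry the top bit of a 32-bit value mirrors LLVM.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model only. LLVM tags virtual register numbers by setting the top bit
// of a 32-bit register number; that convention is mirrored here.
constexpr uint32_t VirtualRegFlag = 1u << 31;

inline bool isVirtualRegister(uint32_t Reg) {
  return (Reg & VirtualRegFlag) != 0;
}

// Hypothetical stand-in for TargetRegisterInfo carrying the proposed hook.
struct ToyRegisterInfo {
  virtual ~ToyRegisterInfo() = default;
  // Default: a conventional target has no virtual registers after RA.
  virtual bool usesVirtualRegistersAfterRegAlloc() const { return false; }
};

// Hypothetical virtual-ISA target that opts in.
struct ToyVirtualISARegisterInfo : ToyRegisterInfo {
  bool usesVirtualRegistersAfterRegAlloc() const override { return true; }
};

// Returns true if a PEI-like pass may run on a function whose operands use
// the register numbers in Regs.
bool canRunPrologEpilogInsertion(const ToyRegisterInfo &TRI,
                                 const std::vector<uint32_t> &Regs) {
  if (TRI.usesVirtualRegistersAfterRegAlloc())
    return true; // target opted in, so the check is skipped
  for (uint32_t Reg : Regs)
    if (isVirtualRegister(Reg))
      return false; // a conventional target must have allocated everything
  return true;
}
```

With a conventional target, any leftover virtual register fails the check; with the opted-in target, the same register list is accepted.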

For reference, the NVPTX backend currently disables the PrologEpilogInserter pass but has its own pass, which is just a copy (inevitably slightly out of date) of PEI with the irrelevant bits deleted; it could probably be updated to use this mechanism too.

Any comments?
Thanks,
-Derek

From: "Derek Schuff via llvm-dev" <llvm-dev@lists.llvm.org>
To: llvm-dev@lists.llvm.org
Sent: Wednesday, December 9, 2015 4:31:31 PM
Subject: [llvm-dev] Allowing virtual registers after register allocation

Hi all,
Virtual ISAs such as WebAssembly and NVPTX use infinite virtual
register sets instead of traditional physical registers.
PrologEpilogInserter is run after register allocation and asserts
that all virtuals have been allocated but doesn't otherwise depend
on this if scavenging is not needed. We'd like to use the
target-independent PEI code for WebAssembly, so we're proposing a
TargetRegisterInfo hook for targets to indicate that they use
virtual registers in this way (currently called
usesVirtualRegstersAfterRegAlloc(), other suggestions welcome). The
code is at http://reviews.llvm.org/D15394 and an example of the
intended use for WebAssembly is at http://reviews.llvm.org/D15344 .

I think this makes sense, and generally speaking, I think it will be good for us to have better support for VM targets without fixed-sized register sets.

Bikeshedding: usesVirtualRegstersAfterRegAlloc() - Are you actually "allocating" virtual registers, or just using the ones that the infrastructure already provides? Is the answer the same for the NVPTX backend? Maybe something like: targetLacksPhysicalRegissters() would be better?

The actual change to PrologEpilogInserter itself is quite minimal,
but we thought we'd ask a wider audience for feedback since it's a
target-independent change. For WebAssembly we would implement
prolog/epilog insertion and FrameIndex elimination but most of the
rest of the PEI code (dealing with callee-saved registers,
scavenging) does nothing.

For other reference the NVPTX backend currently disables the
PrologEpilogInserter pass but has its own pass which is just a copy
(inevitably slightly out-of-date) of PEI with the irrelevant bits
just deleted; it could probably be updated to use this mechanism
too.

That sounds like a good improvement.

-Hal

> Bikeshedding: usesVirtualRegstersAfterRegAlloc() - Are you actually “allocating” virtual registers, or just using the ones that the infrastructure already provides?

Not exactly; the actual register allocation does nothing (i.e. WebAssemblyPassConfig::createTargetRegisterAllocator() returns nullptr) and we just use the regular infrastructure virtual registers. However we do run a custom register coloring pass which reduces the total number of virtual registers used.
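
To illustrate how a coloring pass can shrink the number of virtual registers without assigning physical ones, here is a generic greedy interval-coloring sketch. It is not the WebAssembly backend's actual pass; `LiveRange` and `colorRegisters` are hypothetical names, and live ranges are simplified to single half-open intervals.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified live range of one virtual register: half-open [Start, End).
struct LiveRange {
  unsigned Start;
  unsigned End;
};

// For each input register, returns the index of the merged register
// ("color") it can share with other registers whose ranges do not overlap.
std::vector<unsigned> colorRegisters(const std::vector<LiveRange> &Ranges) {
  // Visit registers in order of increasing start point.
  std::vector<std::size_t> Order(Ranges.size());
  for (std::size_t I = 0; I < Order.size(); ++I)
    Order[I] = I;
  std::sort(Order.begin(), Order.end(), [&](std::size_t A, std::size_t B) {
    return Ranges[A].Start < Ranges[B].Start;
  });

  std::vector<unsigned> ColorEnd; // end of the last range placed in each color
  std::vector<unsigned> Color(Ranges.size());
  for (std::size_t Idx : Order) {
    // Pick the first color whose last range has already ended.
    std::size_t C = 0;
    while (C < ColorEnd.size() && ColorEnd[C] > Ranges[Idx].Start)
      ++C;
    if (C == ColorEnd.size())
      ColorEnd.push_back(0); // no reusable color, open a new one
    ColorEnd[C] = Ranges[Idx].End;
    Color[Idx] = static_cast<unsigned>(C);
  }
  return Color;
}
```

For the ranges [0,4), [1,3), [4,6) and [3,5), this assigns colors 0, 1, 0, 1, so four virtual registers collapse to two. No register is ever spilled; if nothing can be shared, a new color is simply opened.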

> Is the answer the same for the NVPTX backend?

Yes (at least, they have a null TargetRegisterAllocator too).

> Maybe something like: targetLacksPhysicalRegissters() would be better?

Maybe. We actually do have “physical” registers called SP and FP (returned by TargetRegisterInfo::getFrameRegister() and used by some default ISel lowerings and by FrameIndex elimination) but of course they aren’t really physical registers either.

Hi,

I would actually go the other direction, i.e., stick to physical registers but with an infinite number.
The rationale is that after register allocation we broke all the nice properties of the pre-alloc virtual registers. For instance, the existing liveness algorithm cannot be used on those virtual registers. On the other hand, all the infrastructure we have in place for physical registers would be suited.

Cheers,
Q.

(modulo supporting a dynamic number of physical registers)

From: "Quentin Colombet" <qcolombet@apple.com>
To: "Derek Schuff" <dschuff@google.com>
Cc: "Hal Finkel" <hfinkel@anl.gov>, llvm-dev@lists.llvm.org
Sent: Wednesday, December 9, 2015 6:14:33 PM
Subject: Re: [llvm-dev] Allowing virtual registers after register allocation

Hi,

I would actually go the other direction, i.e., stick to physical
registers but with an infinite number.
The rationale is that after register allocation we broke all the nice
properties of the pre-alloc virtual registers. For instance, the
existing liveness algorithm cannot be used on those virtual
registers.

Why? Is this just related to the fact that we've dropped them out of SSA form?

On the other hand, all the infrastructure we have in
place for physical registers would be suited.

(modulo supporting a dynamic number of physical registers)

But there is lots of code that assumes that it can iterate over all physical registers in some class. My thought had been that you don't want to introduce infinite physical register sets because this assumption of enumerability is broken (as is the assumption that the size does not dynamically change). Thoughts?

-Hal

The post-RA code may not be free from assumptions that all virtual registers are gone. For example, such code may not handle subregisters, since PhysReg:Sub can always be collapsed to another physical register. For virtual registers, there is no such guarantee.

I think it would be a lot clearer if we introduced infinite register classes, with all the properties of physical registers (except those obviously related to finiteness). Having virtual registers after RA sounds like a huge hack.

-Krzysztof

From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf Of
Krzysztof Parzyszek via llvm-dev
Sent: Thursday, December 10, 2015 9:47 AM
To: llvm-dev@lists.llvm.org
Subject: Re: [llvm-dev] Allowing virtual registers after register allocation

The post-RA code may not be free from assumptions that all virtual
registers are gone. For example, such code may not handle subregisters,
since PhysReg:Sub can always be collapsed to another physical register.
For virtual registers, there is no such guarantee.

I think it would be a lot clearer if we introduced infinite register
classes, with all the properties of physical registers (except those
obviously related to finiteness). Having virtual registers after RA
sounds like a huge hack.

I definitely agree that having virtual regs after RA sounds like a hack.

But I also don't know why it would be desirable to introduce infinite register classes. The WebAsm folks are already saying that they would like to do register allocation to target a fixed/limited number (might be large though) of "virtual registers". So, instead of calling these virtual registers, why not call them physical registers, and have a fixed number of them, that corresponds to the number that is desired to allocate to. Or, you could have the number of registers to use be run-time selectable in some manner by just having the physical register set be larger than is ever planned to be used by that particular CG, and having run-time controls to restrict the set of allocatable physical registers.
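
The second option above (an oversized physical register set restricted at run time) can be sketched with a toy register file. The bound `MaxRegs`, the type `ToyRegisterFile`, and its members are hypothetical and only illustrate the idea of a reserved set hiding registers from the allocator.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical upper bound on the physical register set of the toy target.
constexpr std::size_t MaxRegs = 256;

// Toy register file: every register exists statically, but a run-time limit
// marks the upper part as reserved so it is never handed out.
struct ToyRegisterFile {
  std::bitset<MaxRegs> Reserved;

  // Reserve every register numbered Limit or higher.
  explicit ToyRegisterFile(std::size_t Limit) {
    for (std::size_t R = Limit; R < MaxRegs; ++R)
      Reserved.set(R);
  }

  // Registers an allocator would be allowed to assign.
  std::vector<unsigned> allocatable() const {
    std::vector<unsigned> Out;
    for (std::size_t R = 0; R < MaxRegs; ++R)
      if (!Reserved.test(R))
        Out.push_back(static_cast<unsigned>(R));
    return Out;
  }
};
```

Constructing `ToyRegisterFile(32)` leaves registers 0 through 31 allocatable while the remaining 224 stay reserved; the trade-off is that the maximum is fixed when the target is defined.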

Kevin Smith

The rationale is that after register allocation we broke all the nice
properties of the pre-alloc virtual registers. For instance, the
existing liveness algorithm cannot be used on those virtual
registers.

Why? Is this just related to the fact that we’ve dropped them out of SSA form?

The rough answer is yes.
The more precise answer is that virtual registers are supported out of SSA form, but if you want to use the liveness infrastructure for them (I mean the LiveInterval class and such), you need to maintain the SSA form behind the scenes with the VNIs and other related objects. This is usually complicated and error-prone. Moreover, recomputing that information from scratch out of SSA form is not supported and is basically the same as reconstructing SSA (plus implicit defs and fun for subregisters), and that does not seem worth it to me.

But there is lots of code that assumes that it can iterate over all physical registers in some class. My thought had been that you don’t want to introduce infinite physical register sets because this assumption of enumerability is broken (as is the assumption that the size does not dynamically change). Thoughts?

That is a good point and I imagine that the solution of such problems depends on what the related algorithms are trying to do.
My thought for enumeration, which again may not be applicable in every case, was that we stick to the number of physical registers we currently need. The fact that the set is infinite means that you can make it grow as much as you need, but at any given time it has a finite size.
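
A toy sketch of that idea: a register set that can always grow, yet is finite and enumerable at any given moment. `GrowableRegSet` and its members are hypothetical and do not correspond to an existing LLVM class.

```cpp
#include <cassert>
#include <vector>

// Hypothetical growable "physical" register set.
class GrowableRegSet {
  unsigned NumRegs = 0; // registers created so far

public:
  // Create a fresh register and return its number; the set grows by one.
  unsigned createReg() { return NumRegs++; }

  // The set is conceptually unbounded but has a finite size right now.
  unsigned size() const { return NumRegs; }

  // Enumerate exactly the registers that exist at this moment.
  std::vector<unsigned> enumerate() const {
    std::vector<unsigned> Regs;
    Regs.reserve(NumRegs);
    for (unsigned R = 0; R < NumRegs; ++R)
      Regs.push_back(R);
    return Regs;
  }
};
```

Code that iterates over "all registers" still works on a snapshot, but any cached result becomes stale as soon as `createReg()` is called again, which is the size-stability assumption raised in the question above.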

Honestly, I think that we could get the proper answer only when we see the actual problems.
Moreover, if we want to stick to virtual registers, what is the point in trying to run post-RA passes anyway?

In other words, do we actually have to go in either direction: support “infinite” phys regs or add virtual reg support in post-RA passes?

I am tempted to think no, we don’t, but I don’t know the use cases.
What post-RA passes would we want to run with virtual regs?

Cheers,
-Quentin

The immediate one that precipitated this mail was PrologEpilogInserter.
However currently the only other pass we have disabled in WebAssemblyTargetMachine is MachineCopyPropagation.
Several passes (post-RA MachineLICM, StackSlotColoring) already only run if RA runs.
Everything else is running today. Currently that’s ShrinkWrap, BranchFolder, ExpandPostRAPseudos, PostRAScheduler, GCMachineCodeAnalysis, MachineBlockPlacement, FuncletLayout, and StackMapLiveness. All of these run after our register coloring pass.

I don’t know about the other passes, but I don’t think it makes sense to teach PrologEpilogInserter to work on virtual registers, since part of its job is to get rid of any virtual registers created when lowering the frame.
I.e., that would indirectly mean that we would need to teach the scavenger how to recycle virtual registers!

The bottom line is, it feels wrong to me.

Cheers,
Q.

From: "Quentin Colombet" <qcolombet@apple.com>
To: "Derek Schuff" <dschuff@google.com>
Cc: "Hal Finkel" <hfinkel@anl.gov>, llvm-dev@lists.llvm.org
Sent: Thursday, December 10, 2015 1:11:19 PM
Subject: Re: [llvm-dev] Allowing virtual registers after register allocation

I don’t know for the other passes, but I don’t think it makes sense
to teach PrologEpilogInserter to work on virtual registers, since
part of its job is to get rid of any virtual registers created when
lowering the frame.
I.e., that would indirectly mean that we would need to teach the
scavenger how to recycle virtual registers!

I think this is exactly the part of PEI that they disable.

-Hal

Yes; see http://reviews.llvm.org/D15394
If the target has no callee-saved registers and you disable scavenging, you’re basically left with… prolog/epilog insertion and FrameIndex elimination.

I can’t shake the feeling that this is a hack.
After RA we shouldn’t have virtual registers anymore. I would even be in favor of a change in the verifier to complain about that.

My concern, aside from the semantic aspect, is that we will have such patches all over the place and that the coverage will be very low.

Q.

From: "Kevin B Smith via llvm-dev" <llvm-dev@lists.llvm.org>
To: "Krzysztof Parzyszek" <kparzysz@codeaurora.org>, llvm-dev@lists.llvm.org
Sent: Thursday, December 10, 2015 12:04:49 PM
Subject: Re: [llvm-dev] Allowing virtual registers after register allocation

I definitely agree that having virtual regs after RA sounds like a
hack.

But I also don't know why it would be desirable to introduce infinite
register classes. The WebAsm folks are already saying that they
would like to do register allocation to target a fixed/limited
number (might be large though) of "virtual registers". So, instead
of calling these virtual registers, why not call them physical
registers, and have a fixed number of them, that corresponds to the
number that is desired to allocate to. Or, you could have the
number of registers to use be run-time selectable in some manner by
just having the physical register set be larger than is ever planned
to be used by that particular CG, and having run-time controls to
restrict the set of allocatable physical registers.

You're right that we don't really want infinite register classes, but rather, we want "expandable" ones. Making extra-large register classes that are restricted by having most of the registers in the reserved set, however, seems just as much a hack (and a worse one in many ways).

I'd certainly not object to some kind of dynamically-sized "physical" register class concept.

-Hal

From: Hal Finkel [mailto:hfinkel@anl.gov]
Sent: Thursday, December 10, 2015 12:11 PM
To: Smith, Kevin B <kevin.b.smith@intel.com>
Cc: Krzysztof Parzyszek <kparzysz@codeaurora.org>; llvm-dev@lists.llvm.org
Subject: Re: [llvm-dev] Allowing virtual registers after register allocation

You're right that we don't really want infinite register classes, but rather, we
want "expandable" ones. Making extra-large register classes that are
restricted by having most of the registers in the reserved set, however,
seems just as much a hack (and a worse one in many ways).

Whether it's a hack or not depends on the sizes in question. Existing X86 already has this property for 64 bit: there are registers which simply don't exist unless the target arch is 64 bit. If WebASM folks are thinking of allocating down to something like 32 or 64 registers, with maybe a maximum of 128 or 256, then making some portion of these reserved when a tighter allocation is wanted (only coloring to 16 or 32) seems completely doable (and natural) using all existing infrastructure, with nothing special needed. If getting into significantly larger numbers, then I can see where this might be considered a hack. But unless you are talking about multi-thousands, it does raise the question of what the extra generality is worth compared to the engineering effort to design, implement and support it.

From: "Kevin B Smith" <kevin.b.smith@intel.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Krzysztof Parzyszek" <kparzysz@codeaurora.org>, llvm-dev@lists.llvm.org
Sent: Thursday, December 10, 2015 2:32:36 PM
Subject: RE: [llvm-dev] Allowing virtual registers after register allocation

Whether it’s a hack or not depends on the sizes in question. Existing
X86 already has this property for 64 bit, there are registers which
simply don't exist
unless the target arch is 64 bit. If WebASM folks are thinking of
allocating down to something like 32 or 64 registers, with maybe a
maximum of 128 or 256, then
making some portion of this reserved when a tighter allocation (only
coloring to 16 or 32) seems completely doable (and natural) using
all existing infrastructure, with nothing
special needed.

No argument from me on this point; however, I don't know whether a relatively small fixed number is acceptable. What does seem to be the case is that they need some kind of register-use cost function which makes the use of each new register increasingly expensive, and/or the ability to dynamically change the number of registers that are reserved at any given time. The former is probably better.

If getting into significantly larger numbers, then I
can see where this might be considered a hack. But unless you are
talking about multi-thousands,
it does beg the question about what the extra generality is worth
compared to the engineering effort to design, implement and support
it.

This is exactly why I was in favor of reusing the existing infrastructure for virtual registers.

-Hal

> No argument from me on this point, however, whether or not a
> relatively-small fixed number is acceptable I don't know. What does seem to
> be the case, however, is that they need some kind of register use cost
> function which makes the use of each new register increasingly expensive
> and/or the ability to dynamically change the number of registers that are
> reserved at any given time. The former is probably better.

A relatively-small fixed number is indeed not acceptable. We have a virtual
ISA which is general-purpose and is not inherently afraid of having
thousands of registers and perhaps more, if that's what the program
actually needs. We do use coloring to reduce the number as we can, but
coloring can't always fix everything without spilling.

And whether we ever have LLVM spill, as opposed to just using more
registers, is a decision we'd like to make based on the needs of the
platform, not based on any limitations of LLVM. Right now, the working
assumption is that LLVM should not spill.

> If getting into significantly larger numbers, then I
> can see where this might be considered a hack. But unless you are
> talking about multi-thousands,
> it does beg the question about what the extra generality is worth
> compared to the engineering effort to design, implement and support
> it.

> This is exactly why I was in favor of reusing the existing infrastructure
> for virtual registers.

The virtual register infrastructure in LLVM turns out to be a very close
fit for our needs. The main alternative is effectively to take LLVM's
physical register concept and evolve it in the direction of being more like
its virtual register concept in several significant ways. It's not clear
that LLVM should really want two different concepts with so much
similarity, but also numerous subtle differences.

Dan

> Whether it’s a hack or not depends on the sizes in question. Existing
> X86 already has this property for 64 bit, there are registers which
> simply don't exist
> unless the target arch is 64 bit. If WebASM folks are thinking of
> allocating down to something like 32 or 64 registers, with maybe a
> maximum of 128 or 256, then
> making some portion of this reserved when a tighter allocation (only
> coloring to 16 or 32) seems completely doable (and natural) using
> all existing infrastructure, with nothing
> special needed.

We aren't talking about having 32 or 64 virtual registers. Most functions
should have just a few, but we are talking about having as many as the
compiled code needs: the VM will spill what's needed to a shadow stack
that's not user-accessible. This has interesting security properties, lets
the VM do this as optimally as it sees fit for the target ISA, and would
otherwise require the LLVM backend to emit an alloca which we then
translate to a heap allocation to the user-accessible "stack" (which lives
in their heap).

Put another way: it seems sensible for a virtual ISA to have virtual
registers ;-)

I think Derek's proposal is sensible in that it doesn't have much cost to
the LLVM code base, and NVPTX shows precedent for working around that
limitation. We'd like virtual ISAs to be supported as first-class targets;
that has a small cost in LLVM's generality but should help remove hacks in
other virtual ISA implementations.

To say this first: This whole discussion about using virtregs until emit or having growable physregs is hard to argue without actually having experience trying to go either way.

Problems when using virtregs throughout the backend until emit time:
- The MC layer is using MCPhysReg (which is a uint16_t) and would need retrofitting to support virtregs
- VirtRegs are assumed to have a definition; physregs can appear "out of thin air" in some situations, like function parameters or exception objects appearing in a register when going to a landingpad.
- VirtRegs are assumed to be interchangeable: replacing vreg5 with vreg42 shouldn't affect the program semantics (given they both have the same register class and we have no other defs/uses of vreg42). If you use virtregs for parameter passing this won't be true anymore
- regmask clobbers only affect physregs
(- You cannot reuse the existing regalloc infrastructure, but IMO that's not a good idea anyway for virtual ISAs)
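
The first point in the list above can be made concrete with a small sketch. It assumes LLVM's convention that a virtual register number is a 32-bit value with the top bit set; `ToyMCPhysReg`, `VirtRegBit`, and the helper functions are toy names chosen here, not LLVM's declarations.

```cpp
#include <cassert>
#include <cstdint>

// The thread states that MCPhysReg is a uint16_t; this alias mirrors that.
using ToyMCPhysReg = uint16_t;

// Assumed convention: virtual register numbers carry the top bit of 32 bits.
constexpr uint32_t VirtRegBit = 1u << 31;

constexpr uint32_t indexToVirtReg(uint32_t Index) { return Index | VirtRegBit; }
constexpr bool hasVirtualBit(uint32_t Reg) { return (Reg & VirtRegBit) != 0; }

// True if Reg keeps its value after being stored in the 16-bit MC-layer type.
constexpr bool survivesNarrowing(uint32_t Reg) {
  return static_cast<uint32_t>(static_cast<ToyMCPhysReg>(Reg)) == Reg;
}
```

A small physical register number such as 42 round-trips through the 16-bit type, while `indexToVirtReg(5)` comes back as plain 5: the virtual flag is lost, so the MC layer could not tell the two kinds of register apart without being retrofitted.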

Problems when allowing the dynamic creation of physregs:
- The current assumption of all registers being known at tablegen time means that we probably need bigger changes to support dynamically growing physreg lists, and it may take a while until we have flushed out all places that relied on a fixed-register-number assumption.
- You probably do not want to compute/modify some information like register class subsets/supersets. However as far as I can see we do not need subregister support for the virtual ISA usecase and may be fine just not allowing the combination of subregs with dynamic physreg creation.

Non-Issues:
- Liveness calculation should work as well with virtregs as with physregs

All in all it seems to me like using virtregs until emission time may take less engineering effort to get to a point where it is 95% working, but will be a pain to maintain in the long term because we suddenly have physreg-like semantics on virtregs for some targets (but not for "normal" ones).

- Matthias