[ptx] Propose a register class naming convention change

Hi,

The current register class names have a confusing 'R' prefix (my
bad), such as the first 'R' of RRegu32 (for unsigned 32-bit
registers).

I propose a 'Reg' + type name naming convention for register classes, such as:
  Regu16, Regu32, Regf32, Regf64
With one exception for predicate registers (capitalized first letter of 'pred'):
  RegPred

Since predicate registers are special in that they can't be passed
as arguments or loaded from/stored to memory, I think a small
exception to the naming convention is okay for them.
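
For illustration, the proposed rule can be written down as a tiny helper. This is only a sketch of the naming rule described above; the function name `reg_class_name` is hypothetical and is not part of the PTX back-end.

```python
def reg_class_name(type_name: str) -> str:
    """Return the proposed register class name for a PTX type name."""
    # Predicate registers are the one exception: 'pred' is capitalized.
    if type_name == "pred":
        return "RegPred"
    # Every other class is simply 'Reg' followed by the type name.
    return "Reg" + type_name


names = [reg_class_name(t) for t in ("u16", "u32", "f32", "f64", "pred")]
print(names)
```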

What do you think?

If there are no objections, I will start making the change.

Regards,
Che-Liang

Seems fine to me. And just curious, what does the first 'R' of RRegu32
mean?

Regards,
chenwj

That's fine with me. Unless there's a particular reason for it I would suggest perhaps changing the immediate syntax as well to swap it round, so it would be Immi32, Immi64, Immf32, etc. It doesn't bother me that much the way it currently is, but when there are lots of operations taking a register and an immediate, representing them in the same way might be a little more consistent?

Personally, I think I also might prefer an underscore to make it more readable for new users (Reg_u32, Reg_pred, Imm_i32, Imm_f32, etc). That's maybe just my own preference, so feel free to do it as you've suggested!

Dan

I’ve been considering the way registers are represented in the PTX back-end quite a bit lately, and I think we need to reconsider how we handle them. As is, we assume a fixed set of typed and sized registers, which is more or less what the LLVM code generation framework expects. However, PTX is really a special-case target in that the register space is “infinite” and not really typed (yes, PTX allows register types, but I do not believe they are mandatory). The infinite nature of the register space gives us a few problems:

  1. We are currently constrained by the number of registers we specify in PTXRegisterInfo.td
  2. The LLVM register allocators are not really solving the right problem
  3. We miss opportunities for register re-use

I’m sure there are more, but those are the ones I am thinking of now.

To solve (1) (and (3) to some degree), I propose we get rid of register types and instead use .b{16, 32, 64} and .pred as our register classes. I cannot think of a case where specifying a register class (u32, f32, etc.) is required. In fact, manually modifying my own PTX code to always use .b* registers has not affected anything. This would both simplify the back-end and allow the LLVM register allocator to re-use registers across different data types (may or may not be a win depending on how good the ptxas register allocator is).
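
As a rough sketch of what this collapse looks like, the snippet below maps typed register classes to the untyped, size-only classes proposed above. The helper name `untyped_class` and the list of input types are assumptions made for illustration; nothing here is taken from the actual back-end.

```python
import re


def untyped_class(ptx_type: str) -> str:
    """Map a typed register type such as 'u32' or 'f32' to its '.b' class."""
    # Predicate registers keep their own class.
    if ptx_type == "pred":
        return ".pred"
    # Only the bit width survives; the u/s/f/b type letter is dropped.
    match = re.fullmatch(r"[usfb](16|32|64)", ptx_type)
    if match is None:
        raise ValueError(f"unsupported register type: {ptx_type}")
    return ".b" + match.group(1)


typed = ["u16", "u32", "u64", "f32", "f64", "pred"]
classes = sorted({untyped_class(t) for t in typed})
print(classes)
```

With this mapping, u32 and f32 values land in the same class, which is what would let an allocator reuse one register for both.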

Solving (2) seems to be a much more difficult problem. The current implementation of register allocation assumes a fixed register space, and allocates registers as best as it can while introducing spill code when it has to. For PTX, the problem is a bit different. Instead, we should assume an infinite register space and minimize the number of registers required without introducing spill code. It is the responsibility of ptxas to do the final register allocation and spill code creation. I see two potential solutions to this:

  1. Keep the current fixed register space and emit spill code that really just adds an additional register and copies data between registers for spills
  2. Implement a new register allocation strategy that ties into the existing infrastructure to satisfy our requirements

Solution (1) seems the easiest to implement, but I worry that ptxas may not be able to interpret what is really happening. I believe doing PTX-level register allocation is at least partially responsible for the speed-ups I have observed when comparing against nvcc-generated code. That leaves (2) as the preferred method, but I do not know enough about the inner workings of the LLVM register allocators to properly assess how difficult this would be.
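
To make the goal of solution (2) concrete, here is a minimal sketch of an allocator that assumes an unbounded register space: it reuses a register once its live interval has ended and creates a new register instead of spilling. This is a generic greedy interval-based scheme written for illustration, not the LLVM allocator and not a proposed implementation; the function name and the `(start, end)` interval representation are assumptions.

```python
import heapq


def allocate_without_spills(intervals):
    """Assign a register index to each (start, end) live interval.

    The end point is exclusive. Registers are reused once their interval
    has ended; when none is free, a new register is created, so no spill
    code is ever needed.
    """
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    active = []  # min-heap of (end, register) for intervals still live
    free = []    # min-heap of registers available for reuse
    assignment = [None] * len(intervals)
    next_reg = 0
    for i in order:
        start, end = intervals[i]
        # Release registers whose intervals ended at or before this start.
        while active and active[0][0] <= start:
            _, reg = heapq.heappop(active)
            heapq.heappush(free, reg)
        if free:
            reg = heapq.heappop(free)
        else:
            # Unbounded register space: grow the pool rather than spill.
            reg = next_reg
            next_reg += 1
        assignment[i] = reg
        heapq.heappush(active, (end, reg))
    return assignment, next_reg


assignment, count = allocate_without_spills([(0, 4), (1, 3), (3, 6), (4, 7)])
print(assignment, count)
```

In this example four live intervals fit into two registers, because at most two intervals overlap at any point.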

Any thoughts?

By the way, I’m perfectly okay with the name change :)

Justin Holewinski wrote:

To solve (1) (and (3) to some degree), I propose we get rid of register types and instead use .b{16, 32, 64} and .pred as our register classes. I cannot think of a case where specifying a register class (u32, f32, etc.) is required. In fact, manually modifying my own PTX code to always use .b* registers has not affected anything. This would both simplify the back-end and allow the LLVM register allocator to re-use registers across different data types (may or may not be a win depending on how good the ptxas register allocator is).

Yep, definitely let’s do this! I tried to do something similar before, but didn’t realise the operand types and register types didn’t have to match. That’ll surely improve our register reuse. The only minor disadvantage I can see is that the resulting PTX will be a little cryptic to debug, but that’s not an issue.

As for the register allocation, I’m not familiar enough to be able to comment on the feasibility either, but the second option sounds like the preferred one.

Dan

2011/5/13 Dan Bailey <drb@dneg.com>

Yep, definitely let’s do this! I tried to do something similar before, but didn’t realise the operand types and register types didn’t have to match. That’ll surely improve our register reuse. The only minor disadvantage I can see is that the resulting PTX will be a little cryptic to debug, but that’s not an issue.

I’m probably going to get started on this over the weekend. Che-Liang, since I will be re-writing most of the register code anyway, I’ll go ahead and change to the new naming convention.

Hey,

Justin, if you are going to rewrite most of the register code, please
also change the naming convention. Thanks a lot.

I have thought about the register allocation problem. The status quo is
merely a work-around that works. I agree with you that we will probably
have to write a new register allocator (or a new RegisterInfo class) in
the long run.

Regards,
Che-Liang