Regarding the effort to make use of the i32-series instructions, see D107658 "[RISCV] Teach isel to select ADDW/SUBW/MULW/SLLIW when only the lower 32-bits are used", as well as other target DAG combines, such as the ones that combine any_ext nodes to make use of ADDW/SUBW/…
I think these efforts are needed because i32 was, quite naturally, treated as an illegal type on the 64-bit RISC-V target from the start, and that decision keeps requiring follow-up patches.
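To illustrate why ADDW can replace ADD when only the lower 32 bits of the result are used, here is a minimal sketch that models the two instructions on plain 64-bit integers. The function names (`add64`, `addw`, `low32`) are made up for this example and are not LLVM or RISC-V identifiers; the model only captures the arithmetic, not instruction selection.

```cpp
#include <cstdint>

// Model of RV64 ADD: plain 64-bit wraparound addition.
constexpr uint64_t add64(uint64_t a, uint64_t b) { return a + b; }

// Model of RV64 ADDW: add the low 32 bits of both inputs, then
// sign-extend bit 31 of the 32-bit sum into the upper 32 bits.
constexpr uint64_t addw(uint64_t a, uint64_t b) {
  uint32_t low = static_cast<uint32_t>(a) + static_cast<uint32_t>(b);
  // (x ^ 2^31) - 2^31 sign-extends a 32-bit value using only
  // well-defined unsigned arithmetic.
  return (uint64_t{low} ^ 0x80000000ull) - 0x80000000ull;
}

// The part of a register that an i32 user actually observes.
constexpr uint32_t low32(uint64_t reg) { return static_cast<uint32_t>(reg); }

// The two results differ only in the upper 32 bits.
static_assert(add64(0x7FFFFFFFull, 1) == 0x0000000080000000ull, "ADD");
static_assert(addw(0x7FFFFFFFull, 1) == 0xFFFFFFFF80000000ull, "ADDW");
static_assert(low32(add64(0x7FFFFFFFull, 1)) == low32(addw(0x7FFFFFFFull, 1)),
              "low halves agree");
```

In this model, ADDW also ignores whatever is in the upper halves of its inputs, so a user that reads only the low 32 bits cannot tell the two instructions apart.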
Is this really a good way to handle the i32 type on the 64-bit RISC-V target?
Could we instead do what PowerPC does and make both i32 and i64 legal during DAG selection?
I’ve given this some thought recently. PowerPC, unlike RV64, has different register classes for 32-bit and 64-bit values, and it has a full set of instructions for both register classes. RV64 only has 64-bit AND/OR/XOR/compares, and the ABI requires 32-bit values to be passed sign extended.
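As a small sanity check on why 64-bit AND/OR/XOR and compares are enough when 32-bit values are kept sign extended, the following sketch models registers as `uint64_t`. The helper names are invented for this example; it is an arithmetic illustration, not backend code.

```cpp
#include <cstdint>

// Sign-extend a 32-bit value into a 64-bit "register" using only
// well-defined unsigned arithmetic.
constexpr uint64_t sext32(uint32_t x) {
  return (uint64_t{x} ^ 0x80000000ull) - 0x80000000ull;
}

// 64-bit logical ops on sign-extended inputs give the sign-extended
// 32-bit result, because every upper bit is a copy of bit 31.
constexpr bool logicOpsPreserveSext(uint32_t a, uint32_t b) {
  uint64_t ra = sext32(a), rb = sext32(b);
  return (ra & rb) == sext32(a & b) &&
         (ra | rb) == sext32(a | b) &&
         (ra ^ rb) == sext32(a ^ b);
}

// A 64-bit unsigned compare on sign-extended inputs agrees with the
// 32-bit unsigned compare of the original values.
constexpr bool unsignedCompareAgrees(uint32_t a, uint32_t b) {
  return (sext32(a) < sext32(b)) == (a < b);
}

static_assert(logicOpsPreserveSext(0x80000001u, 0x7FFFFFFFu), "logic ops");
static_assert(unsignedCompareAgrees(0x80000000u, 1u), "unsigned compare");
```

The same does not hold for addition or shifts, whose 64-bit results can leave the upper bits out of sync with bit 31, which is where the W instructions come in.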
If we make i32 legal, then all values between basic blocks will be i32 instead of i64. We currently try to insert sext_inreg when we type legalize so that we can make use of SelectionDAGBuilder’s ability to propagate known sign bits across basic blocks using AssertSExt nodes. If we start using i32 across basic blocks we lose this and will need a MachineIR pass to clean it up. Granted we’re not perfect today and maybe should have such a MachineIR pass anyway.
I have thought about adopting the approach of Mips64 where i32 is legal but the upper bits of the register are always sign extended. I believe their 32-bit instructions explicitly check this in hardware. This could simplify some of the cross basic block issues because now everything is always sign extended. It does require some tricky things like an i64->i32 truncate must be emitted as a sext.w in order to enforce the sign extension rule. These might be unnecessary depending what the using instructions are, but again SelectionDAG’s single basic block limitation makes this difficult to see. So we would again probably need a MachineIR pass to do cleanup.
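The point about truncation can be sketched with a toy register model. Under the assumption that i32 values must always sit sign extended in a 64-bit register, an i64->i32 truncate that simply reuses the register can break a later 64-bit compare. The names `sextw`, `truncNaive`, `truncSext`, and `eq64` are hypothetical helpers for this example.

```cpp
#include <cstdint>

// Model of sext.w: keep the low 32 bits and sign-extend bit 31.
constexpr uint64_t sextw(uint64_t reg) {
  return ((reg & 0xFFFFFFFFull) ^ 0x80000000ull) - 0x80000000ull;
}

// Truncate by reusing the register unchanged (upper bits left as-is).
constexpr uint64_t truncNaive(uint64_t reg) { return reg; }

// Truncate by emitting sext.w, which restores the sign-extension rule.
constexpr uint64_t truncSext(uint64_t reg) { return sextw(reg); }

// Model of a full 64-bit equality compare on two registers.
constexpr bool eq64(uint64_t a, uint64_t b) { return a == b; }

// An i64 whose low 32 bits are 5, and an i32 constant 5 held sign extended.
constexpr uint64_t wide = 0x0000000100000005ull;
constexpr uint64_t five = 5;

// As i32 values both are 5, but only the sext.w form compares equal.
static_assert(!eq64(truncNaive(wide), five), "stale upper bits leak through");
static_assert(eq64(truncSext(wide), five), "sext.w restores the invariant");
```

If every user of the truncated value were a W instruction, the sext.w would be redundant, which is the kind of cleanup that needs a view across basic blocks.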
I don’t know what the right answer is.
I prefer what Mips64 does. First, if I understand correctly, the RV spec requires i32 values in a GPR to be kept sign-extended at all times, for every instruction. I guess that requirement speeds up or simplifies hardware implementations, which can assume that the inputs of every instruction are sign-extended. So I think emitting an i64->i32 truncate as a sext.w is not tricky but normal. Second, could we just use GPR for both i32 and i64, without a GPR32 subregister class? Then no subregisters would need to be created in the td file.
For now, there are several separate efforts around the codebase that handle this issue, instead of it being handled in one single place or pass. Custom instructions in the RISC-V target are an important part of this: I am working on custom arithmetic instructions that also run into the W-version issue. Since this change would have a big effect, could anyone else give some advice and comments?
It's definitely a pain point for the RISC-V backend, though I would
highlight that making i32 a legal type and therefore duplicating all
instruction definitions for RV32 and RV64 has its own drawbacks
(repetition, possibility of surprising codegen differences for 32 vs
64-bit due to missing the duplicated instructions in instruction
patterns or elsewhere in the backend). You could of course argue that
those issues may be easier to debug and reason about than some of the
hassles with *W instructions.
Please see <https://lists.llvm.org/pipermail/llvm-dev/2018-October/126690.html>
for the initial RFC and discussion on this.
I think ideally we would be able to maintain a single set of
parameterised instruction definitions (and I'd be keen to discuss any
ideas on making this easier to work with), but obviously if the
current implementation approach is causing more problems than it
solves we should be pragmatic.
I think what you mentioned is the XLenVT infrastructure, which avoids duplicating the same non-W instruction for RV32 and RV64. Would it be possible to keep the XLenVT infrastructure for non-W instructions, so that there is only one definition for both RV32 and RV64, plus one definition per W instruction that applies only to RV64?
I am reviving this email thread because I recently found a related issue.
After the earlier discussion, I no longer question the design that makes the i32 type illegal in LLVM codegen (the backend). But does the layout string need to be consistent with it? Currently the layout string is “e-m:e-p:64:64-i64:64-i128:128-n64-S128” for the 64-bit arch. Because it states that the only native integer type is i64 (-n64-), it affects the result of the predicate “DL.isLegalInteger”, which is used across many middle-end passes. This blocks some optimizations such as LSR. Specifically, around line 157 of IVUsers.cpp:
  // LSR is not APInt clean, do not touch integers bigger than 64-bits.
  // Also avoid creating IVs of non-native types. For example, we don't want a
  // 64-bit IV in 32-bit code just because the loop has one 64-bit cast.
  uint64_t Width = SE->getTypeSizeInBits(I->getType());
  if (Width > 64 || !DL.isLegalInteger(Width))
    return false;
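To make the effect of the -n64- field concrete, here is a toy model of how the native-width part of a layout string feeds a check like the one quoted above. `ToyLayout` and `rejectedByWidthCheck` are made up for this sketch and are not LLVM's `DataLayout` class; the parser only understands the `n<width>[:<width>...]` field, and the `n32:64` string is a hypothetical variant, not the current RISC-V layout.

```cpp
#include <cctype>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for the native-integer part of a layout string.
struct ToyLayout {
  std::vector<unsigned> NativeWidths;

  explicit ToyLayout(const std::string &Spec) {
    std::istringstream Fields(Spec);
    std::string Field;
    // Layout fields are separated by '-'; only "n<digits>..." is parsed.
    while (std::getline(Fields, Field, '-')) {
      if (Field.size() < 2 || Field[0] != 'n' ||
          !std::isdigit(static_cast<unsigned char>(Field[1])))
        continue;
      std::istringstream Widths(Field.substr(1));
      std::string W;
      // Native widths within the field are separated by ':'.
      while (std::getline(Widths, W, ':'))
        NativeWidths.push_back(static_cast<unsigned>(std::stoul(W)));
    }
  }

  // A width is "legal" here only if it is listed as native.
  bool isLegalInteger(uint64_t Width) const {
    for (unsigned W : NativeWidths)
      if (W == Width)
        return true;
    return false;
  }
};

// Same condition as the quoted check: true means the value is skipped.
bool rejectedByWidthCheck(const ToyLayout &DL, uint64_t Width) {
  return Width > 64 || !DL.isLegalInteger(Width);
}
```

With the current string, `rejectedByWidthCheck(ToyLayout("e-m:e-p:64:64-i64:64-i128:128-n64-S128"), 32)` is true in this model, while the same call with `n32:64` in place of `n64` is false.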
I have drafted some ways to solve this issue. Could we just use n32:64 in the layout string without regressing other passes (optimizations) that use this information? Or do we need some API like