[RFC] fix tablegen for HwMode

Background

Recently, I encountered a problem: I need to have two different encodings for the same instruction. The main reason for doing this is that many of our subtargets share this instruction, and on the newly added subtargets, we also need to use this instruction, but the encoding has changed compared to the old one. The reason I don’t define a new instruction specifically for this new subtarget is that we’ve already done a lot of optimization for this instruction, and if we define a new instruction, we would have to reapply these optimizations to the new instruction, which would create a significant amount of work.

I tried searching for information to solve this problem and found that the framework supports the EncodingByHwMode feature, which allows an instruction to be automatically selected with different encodings based on different subtargets during the MC phase. But during the use, I encountered a lot of problems, and these problems were not only encountered by me (as discussed in my previous discussion:pre-discussion). It seems that EncodingByHwMode has a significant defect and cannot be used properly.

I have tried to fix the EncodingByHwMode feature, and it seems to be working fine so far. In this RFC, I will describe the design of my fix, focusing on the four relevant defects which I have identified (respectively, a limitation in MCCodeEmitter, redundancy in GenDisassemblerTables, the conflict with RegInfoByHwMode, and the lack of support for DefaultMode)

1. MCCodeEmitter Limitation

When we encode instructions in MC phase, we will call fuction ‘getBinaryCodeForInstr’ to get the encoding for current instruction. ‘getBinaryCodeForInstr’ will get the base encoding from ‘InstBits’ table for this instruction, and then parse the instruction’s operations to fill in the encoding bits.

When EncodingByHwMode is enabled (As long as one instruction is set with the EncodingByHwMode attribute), ‘InstBits’ table will be replaced by ‘InstBits_HwModeA’ and ‘InstBits_HwModeB’(Assuming there are only two types of hwmodes A/B), and ‘InstBits’ has changed from a const table pointer to an unassined pointer. At the begining of function ‘getBinaryCodeForInstr’, we will get the HwMode of current subtarget, and then let ‘InstBits’ point to the table to which current hwmode belongs. Finally we get the base encoding from the table to which ‘InstBits’ point and fill in the encoding bits by parseing the instruction’s operations. This logic works fine for those subtargets who are given hwmode attribute, but for the remaining subtargets who are not given hwmode attribute, this logic will trigger ‘llvm_unreachable(unrecognized hwmodes)’.

My solution to this problem is to sink the logic of obtaining hwmode of subtargets into each instruction level. I will generate a default table in addition to the hwmode tables, and all instructions will get the base encoding from the default table firstly. If this instruction is set with the EncodingByHwMode attribute, then I will check the hwmode of current subtarget(It is obvious that the subtarget using this instruction must have been assigned the hwmode attribute), and get the real base encoding from the hwmode table.

2. Excessive duplication in GenDisassemblerTables

When we generate DisassemblerTables, the framework will traverse all instructions to get the encodings. The encodings with the same DecoderNamespace will finally be collectively placed in a DecoderTable.

After EncodingByHwMode is enabled(assuming there are two hwmodes A/B), all DecoderTables will be generated in A and B versions, corresponding to hwmode A and hwmode B respectively(eg, decoderTableArm → decoderTableArm_HwModeA/decoderTableArm_HwModeB). Among these tables, only the tables containing the instructions who are assigned the EncodingByHwMode attribute need to be generated in A and B versions, while the other tables do not need to be generated at all, since they are exactly the same on different HwModes.

When the framework traverses all instructions, we now collect the hwmodes and DecoderNamespaces of the instructions and store them in a map (in which the keys are namespcaces and the values are vectors of hwmodes). Then when generating the DecoderTables, we look up the DecoderNamespace of the current instruction in the map, and check whether there are valid hwmodes. If so, DecoderTables with different hwmodes versions will be generated. This will avoid the unnecessary tables duplications.

3. Conflict with RegInfo/ValueTypeByHwMode

Hwmode is used on three modules: ValueTypeByHwMode, RegInfoByHwMode and EncodingByHwMode. By now, a subtarget can only be assigned one hwmode. If multiple hwmodes are assigned to a subtarget, only one of them will be retained, and the rest will be discarded. This means that if I add a hwmode attribute to a subtarget to control RegInfoByHwMode, then if I want to enable EncodingByHwMode to this subtarget, I must use the same hwmode as RegInfoByHwMode. For example, Hexagon defined two hwmodes, Hvx128 and Hvx64, to control the ValueType and RegInfo. One day, they suddenly wanted to enable EncodingByHwMode for a subtarget, and the thing that hwmodes cannot coexist required them to use Hvx128 or Hvx64 to control EncodingByHwMode for that subtarget. This can be very strange, as encoding and reginfo are in two dimensions but use the same hwmode.

In order to make hwmode more flexible to use and facilitate the addition of more hwmode related features in the future, I changed the storage and usage of hwmode to a bitwise format. A subtarget can have multiple hwmodes to manage valueTypeByHwMode, RegInfoByHwMode and EncodingByHwMode separately, but for the same module such as EncodingByHwMode, only one hwmode can be set. When parsing the hwmode in each module, bitwise AND operation is used to obtain the hwmode of the current module, so that the hwmode can coexist independently. Of course, you can also use one hwmode to control the three modules: ValueTypeByHwMode, RegInfoByHwMode, and EncodingByHwMode. However, as I said, you cannot use two hwmodes to control RegInfoByHwMode.

Overall, I have configured the storage and parsing of hwmode to be related to bit operations, which allows a single hwmode to control multiple hwmode-related modules, as well as multiple hwmodes to individually control different hwmode-related modules, but multiple hwmodes cannot be used to control one hwmode-related module.

4. Lack of support for DefaultMode

I fixed the issue where DefaultMode could not be used for EncodingByHwMode. The modification for this part is relatively simple, it merely involved adding some judgment logic pertaining to DefaultMode.

Summary

I have resolved the above four issues concerning EncodingByHwMode. Among them, the third point, which involves changing the parsing of hwmode into bit-related operations, might be somewhat contentious. However, the other adjustments are essential, without them, EncodingByHwMode would not function properly.

In the latest main branch, I noticed someone submitted a patch that fixed an issue with the ‘IndexOfInstruction’ logic under DecoderEmitter, but the problem I mentioned above still persists. Therefore, I want to address it through this patch.

Please share more of your thoughts, thank you!

pull requset : [TableGen] fix tlbgen for EncodingByHwMode by superZWT123 · Pull Request #84906 · llvm/llvm-project · GitHub

In the AMDGPU backend, we have the same issue of instructions with different encodings on different subtargets. We solve it by creating multiple versions of the instructions. There is ‘pseudo’ version that is used throughout the backend from ISel until very late. This avoids having to repeat logic for each new encoding. When we convert the MachineInstr into MCInst, we convert the pseduo instructions into the ‘Real’ instructions for the subtarget, which have different encodings. This approach is definitely flexible, but perhaps there is a benefit to a solution using HwMode in terms of performance or ease of use.

This should always work for everyone. The “default” HwMode is internally represented by 0, so SomeTable[HwMode] should just be SomeTable for those targets that don’t use HwModes. If you’re seeing assertions, there is a bug somewhere.

All of that is by design. The HwMode was meant to enumerate hardware configurations, so for each specific HwMode you get everything that this HwMode implies. If you want to have independent parameters for register info and for encodings, you either need to invent a different parameter for the encodings, or define x*y different HwModes.

There have been attempts to utilize HwMode the way you describe, so maybe this whole HwMode thing should be re-engineered to support what you’re trying to do.

When I’ve thought about this problem before, I’ve felt that what’s really needed is a notion of HwMode sets. Then you can define all your different HwModes but use a set of non-overlapping sets (the global set of HwMode sets can and likely will overlap though) for each register/value/instruction. That way you can have A1B1, A1B2, A2B1 and A2B2 as your HwModes, but define A1 as the set A1B1+A1B2, for example. If you have HwModeSet I think that solves the real problem, and I would imagine is rather easier to implement than ripping out HwMode entirely and replacing it with something new that includes some idea of orthogonal attributes in it intrinsically.

That’s essentially equivalent to having several parameters, and treating HwMode as a combination of these. There would be a RegisterInfoMode, EncodingMode, etc, and HwMode would be the aggregate with all of these parameters.

Since these parameters would control different things, the existing code wouldn’t really need to be changed that much. It’s mostly a matter of renaming HwMode to the correct parameter, plus some updates in the TableGen handling of it.

I’m not even sure if each such parameter would need to be associated with subtarget features—as long as a subtarget has it fixed it could be based on any information that is available when the subtarget is being created.

Expanding pseudo instructions is definitely a flexible way to solve encoding conflicts, but this requires us to design this instruction as a pseudo instruction from the beginning. If for instruction that has been defined for a long time, changing it to the Pseudo form to solve encoding conflicts would be very troublesome in my opinion. And HwMode can solve this problem, it can be compatible with old definitions without making too many changes.
Thanks!

Hi kparzysz, for your first point, what I considered in the design was that in most scenarios, we only added HwModes selection to a few encoding conflicting instructions, and it is not necessary to perform HwModes detection on all instructions. And the HwModes we assigned to those encoding conficts instructions may be different. Sinking down the HwMode selection logic to per instruction level can better select HwMode, and the logic is clearer in my opinion. For specific details, please refer to my modifications to the testcase HwModeEncodeDecode3. td.

Regarding the second point, I personally prefer to handle the access to hwmode in a bitset manner, as it would make future additions more convenient for us. What I am considering is, if new features for hwmode are introduced in the future, using these features according to the old logic will inevitably require modifications to the previous hwmode configurations, which I see as unnecessary. In fact, for different features under HwMode control (RegInfo/Value Type/Encoding), their processing logic is independent and does not require a complete overhaul. What we need to change is only the access to hwmode, allowing them to obtain their respective hwmode information. In my current backend, I have adapted my changes and defined two subfeatures (including hwmode information) for a subtarget, controlling RegInfo and Encoding respectively, and they can function independently as normal. This change has been operational in our project for two months now, and everything seems to be running smoothly. A significant risk is the current change, which only allows the use of up to 32 hwmode types (due to the length of unsigned data types). In the future, std:: bitset can be used to increase the length of hwmode storage. Actually, I don’t really want HwModes to be abused. Excessive HwModes will inevitably lead to an increase in decoder tables (although my patch has reduced many unnecessary table duplications).
Thank you for the suggestion.

I think the key is how to let different features under HwMode control to obtain their respective hwmode information. My current patch actually allows us to define a subfeature set for storing hwmode information and assign it to specific subtargets, achieving the ‘HwMode sets’ you mentioned.