This RFC proposes autogenerating SDNode descriptions from *.td files. This includes:
- SDNode enumeration (MyTargetISD::NodeType in MyTargetISelLowering.h)
- SDNode names (MyTargetLowering::getTargetNodeName() in MyTargetISelLowering.cpp)
- SDNode verification functionality (MyTargetISelLowering::verifyTargetSDNode(); before this patch it was implemented only for AArch64, and only for a few nodes)
- Generic and target-specific SDNode properties.
TL;DR The (stacked) patch is here: [RFC] TableGen-erate SDNode descriptions by s-barannikov · Pull Request #119709 · llvm/llvm-project · GitHub
Motivation
The benefits of autogenerating the descriptions are the usual:
- Avoiding code duplication (points 1 and 2 above).
- Generating code that is hard to write by hand (3) or difficult to maintain (4).
Also, the added node verification functionality proved to be useful in catching bugs (that may or may not be critical). I have already fixed a number of the discovered issues (see my recent patches), but most of them are left for target maintainers to resolve.
C++ changes
The patch introduces the SDNodeInfo class, which encapsulates all information about target-specific SDNodes. It is initialized with autogenerated arrays and provides a few accessors for querying basic node properties: the node name, the number of results / operands, and whether the node has a memory operand or strict floating-point semantics. It also provides a verifyTargetNode method that validates a given node against its tblgen description.
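To make the shape of such a class concrete, here is a minimal, self-contained sketch of a table-driven node-info class. It is not the code from the patch: ToyNodeInfo, NodeDesc, ToyNode, the flag names, the opcode base 1000 and the example table are all hypothetical, and only fixed operand counts are checked.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical property bits; the names are illustrative only.
enum NodeFlag : uint32_t {
  HasChain = 1u << 0,
  HasMemOperand = 1u << 1,
  IsStrictFP = 1u << 2,
};

// One row of the description table (would be autogenerated in the proposal).
struct NodeDesc {
  std::string Name;
  unsigned NumResults;
  unsigned NumOperands;
  uint32_t Flags;
};

// A toy stand-in for a DAG node: just an opcode and its value counts.
struct ToyNode {
  unsigned Opcode;
  unsigned NumResults;
  unsigned NumOperands;
};

class ToyNodeInfo {
  unsigned FirstOpcode;        // opcode of the first table row
  std::vector<NodeDesc> Descs; // indexed by Opcode - FirstOpcode

  const NodeDesc &getDesc(unsigned Opcode) const {
    assert(hasDesc(Opcode) && "opcode has no description");
    return Descs[Opcode - FirstOpcode];
  }

public:
  ToyNodeInfo(unsigned First, std::vector<NodeDesc> D)
      : FirstOpcode(First), Descs(std::move(D)) {}

  bool hasDesc(unsigned Opcode) const {
    return Opcode >= FirstOpcode && Opcode - FirstOpcode < Descs.size();
  }
  const std::string &getName(unsigned Opcode) const {
    return getDesc(Opcode).Name;
  }
  bool hasMemOperand(unsigned Opcode) const {
    return (getDesc(Opcode).Flags & HasMemOperand) != 0;
  }
  bool isStrictFP(unsigned Opcode) const {
    return (getDesc(Opcode).Flags & IsStrictFP) != 0;
  }

  // Returns true if the node matches its description. Only the result and
  // operand counts are compared in this sketch.
  bool verifyNode(const ToyNode &N) const {
    if (!hasDesc(N.Opcode))
      return false;
    const NodeDesc &D = getDesc(N.Opcode);
    return N.NumResults == D.NumResults && N.NumOperands == D.NumOperands;
  }
};

// Hand-written example table with made-up nodes.
inline ToyNodeInfo makeExampleInfo() {
  return ToyNodeInfo(1000, {
      {"MyISD::CALL", 1, 2, HasChain},
      {"MyISD::LOAD_PAIR", 3, 2, HasChain | HasMemOperand},
      {"MyISD::STRICT_FADD", 2, 3, HasChain | IsStrictFP},
  });
}
```

With this table, makeExampleInfo().hasMemOperand(1001) is true, and a ToyNode with opcode 1002 but only one result fails verifyNode because the table expects two.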
The class is modeled after MCInstrInfo / TargetGenInstrInfo, so the idea should be familiar.
Instances of the class are currently managed by target implementations of SelectionDAGGenTargetInfo. I considered adding a new factory method to TargetSubtargetInfo instead, but that required more changes. Another alternative would be to put everything in TargetLowering, but that class is already overloaded. SelectionDAGGenTargetInfo also serves as a proxy and extension point for targets that haven’t fully migrated to autogenerated descriptions.
TableGen changes
SDNode now has Flags and TSFlags fields similar to those found in Instruction. This allows targets to attach properties to nodes (e.g. whether the node has strict floating-point semantics or has a “passthru” operand). Before this patch, one had to write switches over the node enum or carefully order the enum members so that, for example, memory opcodes have values not less than FIRST_TARGET_MEMORY_OPCODE.
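The difference between the two approaches can be sketched in a few lines. This is a toy illustration, not LLVM code: the enum members, the opcode base 1000, the MemOperandFlag bit and the NodeFlags table are hypothetical, and FIRST_TARGET_MEMORY_OPCODE is modeled here as a plain enum member.

```cpp
#include <cassert>
#include <cstdint>

namespace old_style {
// Properties are encoded in the enum order: every member placed at or after
// FIRST_TARGET_MEMORY_OPCODE is assumed to be a memory opcode.
enum NodeType : unsigned {
  CALL = 1000,
  RET,
  FIRST_TARGET_MEMORY_OPCODE,
  LOAD_PAIR = FIRST_TARGET_MEMORY_OPCODE,
  STORE_PAIR,
};

inline bool isMemoryOpcode(unsigned Opc) {
  return Opc >= FIRST_TARGET_MEMORY_OPCODE;
}
} // namespace old_style

namespace flag_style {
// Members can be listed in any order (alphabetical here); properties live in
// a parallel table indexed by opcode.
enum NodeType : unsigned { CALL = 1000, LOAD_PAIR, RET, STORE_PAIR };

constexpr uint32_t MemOperandFlag = 1u << 0; // hypothetical flag bit

constexpr uint32_t NodeFlags[] = {
    0,              // CALL
    MemOperandFlag, // LOAD_PAIR
    0,              // RET
    MemOperandFlag, // STORE_PAIR
};

inline bool isMemoryOpcode(unsigned Opc) {
  assert(Opc >= CALL && Opc <= STORE_PAIR && "not a target opcode");
  return (NodeFlags[Opc - CALL] & MemOperandFlag) != 0;
}
} // namespace flag_style
```

In the first variant, inserting a new non-memory node after LOAD_PAIR silently changes the answer; in the second, the answer depends only on the flag recorded for that node.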
The new TableGen backend is simplistic. It collects all derived definitions of SDNode, filters them by target ISD namespace, and emits enum/arrays for the selected nodes.
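The collect / filter / emit flow can be sketched as follows. This is a simplified model written in plain C++ rather than against the TableGen API: ToyRecord, filterByNamespace, emitEnum and the GenNodeType enum name are all hypothetical, and the records are built by hand instead of being read from *.td files.

```cpp
#include <cassert>
#include <string>
#include <vector>

// A toy stand-in for a parsed SDNode definition.
struct ToyRecord {
  std::string Opcode; // fully qualified, e.g. "MyISD::CALL"
  unsigned NumResults;
  unsigned NumOperands;
};

// Keep only the records whose opcode lives in the requested namespace.
inline std::vector<ToyRecord>
filterByNamespace(const std::vector<ToyRecord> &All, const std::string &NS) {
  const std::string Prefix = NS + "::";
  std::vector<ToyRecord> Selected;
  for (const ToyRecord &R : All)
    if (R.Opcode.compare(0, Prefix.size(), Prefix) == 0)
      Selected.push_back(R);
  return Selected;
}

// Emit the text of an enum with one member per selected node.
// Expects records that already passed filterByNamespace for the same NS.
inline std::string emitEnum(const std::vector<ToyRecord> &Nodes,
                            const std::string &NS) {
  std::string Out = "namespace " + NS + " {\nenum GenNodeType : unsigned {\n";
  for (const ToyRecord &R : Nodes) {
    assert(R.Opcode.size() > NS.size() + 2 && "record was not filtered");
    Out += "  " + R.Opcode.substr(NS.size() + 2) + ",\n";
  }
  Out += "};\n} // namespace " + NS + "\n";
  return Out;
}
```

Given records for MyISD::CALL, ISD::ADD and MyISD::RET, filtering by "MyISD" keeps the first and the last, and emitEnum produces an enum with the members CALL and RET. The arrays of names and properties would be emitted from the same filtered list in the same way.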
Limitations and future work
- Obviously, TableGen cannot generate information for nodes not found in *.td files. Half of the targets have nodes (usually, a few) that are selected to target instructions by custom C++ code and thus don’t require tblgen descriptions as there are no tblgen Patterns. Targets are now encouraged to add tblgen descriptions for such nodes. This would avoid the need for overriding SelectionDAGGenTargetInfo methods and would allow the nodes to be verified.
- Some targets have more than one SDNode description for the same enum name. These SDNodes are sometimes incompatible (e.g. have different operand type constraints). Such nodes still get a generated enum member, but no information is generated for verifyTargetNode.
- Even though operand type constraints information is generated, it is currently not used by verifyTargetNode. Experiments have shown that type constraints are far more often violated than other invariants (the number of operands/results, chain/glue operands), so I excluded this functionality from the final draft. It wasn’t very clean anyway.
- The generated node names have the same “style” for all backends, which is “MyISD::OP_NAME”. Some targets may wish to use a different style, e.g. “my::op_name”. If there is a demand for this functionality, it can easily be implemented by adding a command line option to the tblgen backend.
- It would be great to generate descriptions for the generic SDNodes as well, but they are even more affected by the above issues, so this is left as future work.
Summary of the draft PR
- The first commit implements SelectionDAGTargetInfo for targets that still use the base implementation.
- The second commit virtualizes isTargetStrictFPOpcode / isTargetMemoryOpcode so that targets no longer need to care about the order of ISD enum members. (This change should be good by itself.)
- The third commit adds a new TableGen backend and the SDNodeInfo class. It also makes SelectionDAG use methods of this class before falling back to the old methods.
- The rest of the commits are one per backend. They are intended to show the general idea, and I have no plans to get them ready for commit. Specifically, I didn’t move comments from the node enumeration to *.td files. It is tedious work, and I will leave it to target maintainers (who could just as well reject the whole idea of autogenerating node descriptions).
The patch is huge, but you can pick your favorite target and see whether the changes meet your expectations. I’d also suggest looking at the RISC-V backend, as I’ve put a little more effort into it.