TableGen: Subtargets with different encodings

We have two instruction sets which are essentially asm-compatible but result in two very different encodings. Right now we have separate targets for them, but for various reasons it might be convenient to make them two subtargets of a broader target.

Is there any support in TableGen for varying the encoding based on subtarget, similar to how one can have asm syntax variants? After fumbling around a bit I don’t see how to do it but thought I would ask to double-check.


There’s EncodingByHwMode (can’t see anything in-tree using this), or you can have two instruction definitions with different names, each predicated on the subtarget in use (see ZEXT_H_RV32/64 in the RISC-V backend for example).
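To make the first option concrete, here is a toy C++ model of the idea: one logical instruction, several encodings, and a mode that picks between them. This is only a sketch of the concept, not TableGen syntax and not generated LLVM code. The `HwMode` enum, the `AddInst` struct, and both bit layouts are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical hardware modes standing in for the two subtargets.
enum class HwMode { Default, AltEncoding };

// One logical instruction (a made-up three-register ADD). Its operands are
// the same regardless of which encoding is in effect.
struct AddInst {
  uint8_t rd;
  uint8_t rs1;
  uint8_t rs2;
};

// Made-up layout for the default mode: opcode 0x01 in bits 31..24,
// then rd, rs1, rs2 in descending byte positions.
inline uint32_t encodeDefault(const AddInst &i) {
  return (0x01u << 24) | (uint32_t(i.rd & 0x1F) << 16) |
         (uint32_t(i.rs1 & 0x1F) << 8) | uint32_t(i.rs2 & 0x1F);
}

// Made-up layout for the alternate mode: opcode 0x7F in bits 7..0,
// with the register fields in the opposite order.
inline uint32_t encodeAlt(const AddInst &i) {
  return 0x7Fu | (uint32_t(i.rs2 & 0x1F) << 8) |
         (uint32_t(i.rs1 & 0x1F) << 16) | (uint32_t(i.rd & 0x1F) << 24);
}

// Callers only ever deal with AddInst; the mode selects the encoding.
inline uint32_t encode(const AddInst &i, HwMode mode) {
  switch (mode) {
  case HwMode::AltEncoding:
    return encodeAlt(i);
  case HwMode::Default:
    break;
  }
  return encodeDefault(i);
}
```

The point of the model is that everything upstream of `encode` sees a single instruction name, which is the property the two-definitions approach gives up.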

Yeah, I considered naming instructions differently but that defeats a lot of the gain from using subtargets in the first place, in that you have to remember to check both names when doing anything with some instruction.

Didn’t know about EncodingByHwMode, I’ll look into that. Thanks!


@dag Could you let me know your findings in this thread, if it is not too much trouble? I would like to support your encodings in the refactored disassembler backend.

You could also take a look at how the AMDGPU backend does things. There are many instructions whose encodings changed between certain hardware generations. The high-level approach is that codegen mostly uses “pseudo” instructions that are named like you’d expect; but then there are also “real” instructions that have e.g. a _ci suffix indicating that they represent the real encoded instruction as introduced by the CI generation. The translation from pseudo instructions to real instructions happens very late, just before the final binary is emitted.
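A rough C++ sketch of that pseudo-to-real step, assuming a simple lookup table keyed on (pseudo opcode, generation). This is a toy model and not the AMDGPU backend's actual code; the opcode names, the `Generation` enum, and `lowerToReal` are all hypothetical.

```cpp
#include <cassert>
#include <optional>

// Hypothetical hardware generations; only "CI" comes from the description above.
enum class Generation { Base, CI };

// Pseudo opcodes: what codegen works with, one name per logical instruction.
enum class PseudoOp { FOO, BAR };

// Real opcodes: one per distinct encoding. The suffix names the generation
// that introduced that encoding.
enum class RealOp { FOO_base, FOO_ci, BAR_base };

struct Mapping {
  PseudoOp pseudo;
  Generation gen;
  RealOp real;
};

// Made-up table. FOO changed encoding in CI; BAR did not, so both
// generations map to the same real opcode.
constexpr Mapping kTable[] = {
    {PseudoOp::FOO, Generation::Base, RealOp::FOO_base},
    {PseudoOp::FOO, Generation::CI, RealOp::FOO_ci},
    {PseudoOp::BAR, Generation::Base, RealOp::BAR_base},
    {PseudoOp::BAR, Generation::CI, RealOp::BAR_base},
};

// Run late, just before emission: pick the real opcode for the pseudo on
// the generation being targeted. Returns nullopt if no entry exists.
inline std::optional<RealOp> lowerToReal(PseudoOp op, Generation gen) {
  for (const Mapping &m : kTable)
    if (m.pseudo == op && m.gen == gen)
      return m.real;
  return std::nullopt;
}
```

With this shape, passes that inspect or transform instructions only ever check `PseudoOp::FOO`, which avoids the "remember to check both names" problem; the per-generation names matter only at the emission boundary.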

Unfortunately I probably won’t get to attempting this for quite some time, if ever. We have a lot more pressing things to do.