I have a question about RISCVCommonTableGen

I opened the CMakeLists.txt file in llvm/llvm/lib/Target/RISCV:



tablegen(LLVM RISCVGenAsmMatcher.inc -gen-asm-matcher)
tablegen(LLVM RISCVGenAsmWriter.inc -gen-asm-writer)
tablegen(LLVM RISCVGenCompressInstEmitter.inc -gen-compress-inst-emitter)
tablegen(LLVM RISCVGenDAGISel.inc -gen-dag-isel)
tablegen(LLVM RISCVGenDisassemblerTables.inc -gen-disassembler)
tablegen(LLVM RISCVGenGlobalISel.inc -gen-global-isel)
tablegen(LLVM RISCVGenInstrInfo.inc -gen-instr-info)
tablegen(LLVM RISCVGenMCCodeEmitter.inc -gen-emitter)
tablegen(LLVM RISCVGenMCPseudoLowering.inc -gen-pseudo-lowering)
tablegen(LLVM RISCVGenRegisterBank.inc -gen-register-bank)
tablegen(LLVM RISCVGenRegisterInfo.inc -gen-register-info)
tablegen(LLVM RISCVGenSearchableTables.inc -gen-searchable-tables)
tablegen(LLVM RISCVGenSubtargetInfo.inc -gen-subtarget)


My problem is that I can’t find where RISCVCommonTableGen is used. I want to change RISCV.td to extend the RISC-V instructions, but I don’t want to modify the LLVM source code. Instead, I want to include RISCV.td in my own code, regenerate the TableGen files, and then use the generated files. However, I’m not sure where the generated TableGen files are used; I searched in VS Code and couldn’t find them. If anyone can help me, I would be grateful.

TableGen generates lots of C++ header-like files in build/lib/Target/RISCV/RISCVGen*.inc, and they get included across the RISC-V backend (grep for “RISCVGen” to pick up most of them).
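To make the grep suggestion a bit more concrete, here is a minimal sketch that walks a source tree and reports which .cpp/.h files include which RISCVGen*.inc file. The function name find_generated_includes and the choice to scan only .cpp and .h files are my own assumptions for illustration; nothing here is part of LLVM.

```python
import re
import sys
from pathlib import Path

# Matches lines such as: #include "RISCVGenInstrInfo.inc"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"(RISCVGen\w+\.inc)"', re.MULTILINE)


def find_generated_includes(root):
    """Map each RISCVGen*.inc name to the sorted list of files that include it.

    `root` is a directory to scan (for example llvm/lib/Target/RISCV).
    Paths in the result are relative to `root`, with forward slashes.
    """
    root = Path(root)
    hits = {}
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in {".cpp", ".h"}:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for name in INCLUDE_RE.findall(text):
            hits.setdefault(name, []).append(path.relative_to(root).as_posix())
    return {name: sorted(paths) for name, paths in sorted(hits.items())}


if __name__ == "__main__":
    # Usage: python find_includes.py path/to/llvm/lib/Target/RISCV
    for inc, users in find_generated_includes(sys.argv[1]).items():
        print(inc)
        for user in users:
            print("   ", user)
```

Running it over the RISC-V target directory should show that the generated files are pulled in from many places in the backend, which is why swapping them out from outside the tree is hard.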

It would be almost impossible to add something from outside the LLVM tree in a way that doesn’t conflict with the existing definitions. There are literal hard-coded arrays of all known instructions in there that everything else will be considering canonical.

So realistically you’re probably going to have to modify the source.

Oh! But the LLVM version will keep changing, and I don’t want to extend the instructions on one fixed version. Is there any other way to do this? Thanks!

As far as I know there’s no way to do that at the moment.

The closest feature I know is that some assembler dialects accept something like .inst 0x12345678 to directly encode an instruction into the stream if you know the exact bits you want. It’s a very niche feature.
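If you go the raw-encoding route, you have to work out the bits yourself. As a rough sketch, assuming your custom instruction uses the standard RISC-V R-type layout (funct7, rs2, rs1, funct3, rd, opcode from high bits to low), the word can be computed like this. The helper names encode_r_type and as_word_directive are hypothetical, and emitting the result with .word is just one option; RISC-V assemblers also have an .insn directive for this purpose, but check your toolchain's documentation for the exact syntax it accepts.

```python
def encode_r_type(opcode, rd, funct3, rs1, rs2, funct7):
    """Pack R-type fields into a 32-bit RISC-V instruction word."""
    # (name, value, bit width, shift of the field's lowest bit)
    fields = [
        ("opcode", opcode, 7, 0),
        ("rd", rd, 5, 7),
        ("funct3", funct3, 3, 12),
        ("rs1", rs1, 5, 15),
        ("rs2", rs2, 5, 20),
        ("funct7", funct7, 7, 25),
    ]
    word = 0
    for name, value, width, shift in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
    return word


def as_word_directive(word):
    """Format a 32-bit word as an assembler data directive."""
    return f".word 0x{word:08x}"


if __name__ == "__main__":
    # Sanity check against a known instruction: add x1, x2, x3
    print(as_word_directive(encode_r_type(0x33, rd=1, funct3=0, rs1=2, rs2=3, funct7=0)))
    # prints ".word 0x003100b3"

    # A made-up instruction in the custom-0 opcode space (0x0b), using x10, x11, x12
    print(as_word_directive(encode_r_type(0x0B, rd=10, funct3=0, rs1=11, rs2=12, funct7=0)))
```

This only gets the bytes into the instruction stream. The compiler still knows nothing about the instruction, so there is no instruction selection, scheduling, or disassembly support for it.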

Doing it properly would involve distributing hooks throughout the backends to check a second set of side-tables that might be provided by the user, which is a big undertaking. In reality, I think most people either add the instructions to OSS LLVM or fork it.