I would like to ask for feedback on a refactoring of the TableGen backends.
The Capstone project (a lightweight disassembler) needed customized output from the TableGen backends. The emitted code had to be in C, and some of it had to be altered.
To achieve this I separated the syntax output from the code generation logic in some backends.
Now I would like to ask for your feedback and, if possible, get it upstreamed. Of course, I am happy to make further changes if requested.
You can find the patch in this review:
The code emitted by TableGen backends is a useful resource for non-LLVM tools.
(Capstone makes heavy use of this code to disassemble opcodes without the need to implement a complete disassembler on its own).
The problem is that the backends can only emit C++ code, and the user has no control over which parts are emitted. Altering the output is not possible either.
For example, there is no option for the user to emit only arrays and not the functions and enums.
Emitting the code in a language other than C++ is also not possible, although this would be useful for projects that need the code in Java, C, or another language.
Writing a new backend for these use cases is not necessarily an option if the code needed is exactly what a specific backend already emits. Such a “new” backend would simply duplicate the original one, and updating the duplicate with each LLVM release is a heavy maintenance burden.
Adding an abstraction layer between the code generation and the syntax output would be helpful here.
For example, if the backend emits a specific enum X, it generates the information and calls a Printer::emitEnumX(<enum_data>) method. This method prints the syntax to the output stream.
If the syntax output must be altered, it is only necessary to override the emitEnumX method of the Printer class (for an implementation see the review above).
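
To make the idea concrete, here is a minimal, self-contained sketch of such a layer. It is not the code from the patch: only the names Printer and emitEnumX come from the description above, while EnumData, PrinterC, and runBackend are hypothetical names made up for illustration. It also uses std::ostream instead of LLVM's raw_ostream so that it compiles on its own.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for what the generation logic computed.
struct EnumData {
  std::string Name;
  std::vector<std::string> Values;
};

// Default printer: emits C++ syntax.
class Printer {
public:
  explicit Printer(std::ostream &OS) : OS(OS) {}
  virtual ~Printer() = default;

  virtual void emitEnumX(const EnumData &Data) {
    OS << "enum class " << Data.Name << " {\n";
    for (const std::string &V : Data.Values)
      OS << "  " << V << ",\n";
    OS << "};\n";
  }

protected:
  std::ostream &OS;
};

// Hypothetical derived printer: overrides only the syntax and emits C.
class PrinterC : public Printer {
public:
  using Printer::Printer;

  void emitEnumX(const EnumData &Data) override {
    OS << "typedef enum {\n";
    // C has no scoped enums, so prefix each value with the enum name.
    for (const std::string &V : Data.Values)
      OS << "  " << Data.Name << "_" << V << ",\n";
    OS << "} " << Data.Name << ";\n";
  }
};

// Stand-in for the backend's generation logic. It knows nothing about
// the output language and only hands the data to the printer.
void runBackend(Printer &P) {
  EnumData Data{"X", {"A", "B"}};
  assert(!Data.Name.empty() && "an enum needs a name");
  P.emitEnumX(Data);
}
```

With this split, the generation logic in runBackend stays identical for both outputs. Passing a Printer yields the C++ enum, passing a PrinterC yields the C typedef, and a tool that does not want the enum at all could override emitEnumX with an empty body.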