I’m trying to enable the Hexagon LLVM assembly parser. It seems like a lot of work has been done to make this parsing straightforward.
Hexagon assembly does not follow the “Mnemonic Rx Rx …” format that is expected by the assembly parsing infrastructure, represented by:
StringRef Mnemonic = ((ARMOperand*)Operands)->getToken();
This mnemonic-location assumption applies both to the AsmMatcherEmitter TableGen backend and to the .inc file it produces, where MatchInstructionImpl is the entry point through which the assembly input is parsed.
However, Hexagon assembly uses a more readable, expression-like syntax, such as r1 = r2, or if(r1) r2 = mem(#immediate), which does not start with a mnemonic. This makes taking advantage of the existing LLVM code difficult.
Currently, I see two options.
One is to preformat the assembly string(s) obtained from the td files so that they match the format that the TableGen backend expects, and also preformat the assembly input so that it can be matched against the preformatted assembly strings.
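To make the preformatting idea concrete, here is a rough sketch of what such a rewrite could look like. Everything in it is an assumption for illustration only: the helper names (isRegister, toMnemonicFirst), the "_" placeholder, and the rule that only rN registers and #-prefixed immediates count as operands are all made up, and none of it is existing LLVM code. It collapses the syntax into a synthetic "mnemonic" and moves the operands after it. It does not handle things like negative immediates.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: true if Tok looks like a register such as "r1" or "r12".
static bool isRegister(const std::string &Tok) {
  if (Tok.size() < 2 || Tok[0] != 'r')
    return false;
  for (size_t I = 1; I < Tok.size(); ++I)
    if (!std::isdigit(static_cast<unsigned char>(Tok[I])))
      return false;
  return true;
}

// Hypothetical rewrite: replace each operand with '_' to build a synthetic
// mnemonic from the remaining syntax, then append the operands in order.
std::string toMnemonicFirst(const std::string &Asm) {
  std::string Skeleton;              // syntax with operands replaced by '_'
  std::vector<std::string> Operands; // registers and immediates, in order
  size_t I = 0;
  while (I < Asm.size()) {
    unsigned char C = static_cast<unsigned char>(Asm[I]);
    if (std::isspace(C)) { // whitespace is dropped from the skeleton
      ++I;
      continue;
    }
    if (std::isalnum(C) || C == '#') {
      // Scan one word: an optional leading '#' followed by alphanumerics.
      size_t Start = I++;
      while (I < Asm.size() &&
             std::isalnum(static_cast<unsigned char>(Asm[I])))
        ++I;
      std::string Tok = Asm.substr(Start, I - Start);
      if (isRegister(Tok) || Tok[0] == '#') {
        Operands.push_back(Tok);
        Skeleton += '_';
      } else {
        Skeleton += Tok; // keywords such as "if" or "mem" stay in the skeleton
      }
      continue;
    }
    Skeleton += Asm[I++]; // punctuation such as '=', '(' and ')'
  }
  std::string Out = Skeleton;
  for (const std::string &Op : Operands) {
    Out += ' ';
    Out += Op;
  }
  return Out;
}
```

With this scheme, "r1 = r2" would become "_=_ r1 r2" and "if(r1) r2 = mem(#8)" would become "if(_)_=mem(_) r1 r2 #8". The same rewrite would have to be applied to both the td assembly strings and the assembly input for the two to match.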
The other is to write a whole new TableGen backend that doesn’t rely on the mnemonic-location assumption, in the hope of someday merging it with the current AsmMatcherEmitter.
I am leaning toward the latter. The first option seems like it will create many more problems in the long run. Any thoughts, comments, or recommended directions are appreciated.