I’m trying to parse an x86 assembly file (.s) and obtain an AST from it.
Looking through the tools, I found that llvm-mc does roughly what I want: it takes a .s file and assembles it, generating an object file. Reading the implementation, I found that it uses the AsmParser class to parse the .s file through its Run method, which calls parseStatement, which in turn ends up calling the parseInstruction method of the target parser (X86AsmParser in this case).
I noticed that during this process it keeps the information about the parsed instruction in the ParseStatementInfo struct, more specifically in a vector of MCParsedAsmOperand called Operands, and that it emits the instruction right after parsing it. No AST is created.
Does the MC layer use an AST for parsing assembly, or does it simply parse and emit instruction by instruction?
Is there any way to access the ParseStatementInfo struct data after each instruction is parsed?
Do you have any documentation about MCParsedAsmOperand?
Thanks.
The latter. The instructions are parsed and then emitted. There is no need for an AST, because semantic analysis is not necessary.
The documentation in the header file MCParsedAsmOperand.h is quite good. I am not aware of other documentation.
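
To illustrate the parse-then-emit flow, here is a minimal toy model in C++. It is not LLVM code: ParsedInst, Streamer, CollectingStreamer, parseStatement and run are all hypothetical names invented for this sketch, and the operand handling is deliberately naive. The point is only that each statement is parsed and handed to a streamer immediately, and that a streamer which records what it receives gives you a flat list of instructions without any AST. In LLVM the analogous hook would, as far as I know, be a custom MCStreamer subclass, which receives the matched MCInst rather than the MCParsedAsmOperand vector; please check the headers before relying on that.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one parsed instruction: mnemonic plus operand text.
struct ParsedInst {
    std::string mnemonic;
    std::vector<std::string> operands;
};

// Hypothetical stand-in for a streamer: it is handed each instruction
// as soon as that instruction has been parsed.
struct Streamer {
    virtual ~Streamer() = default;
    virtual void emitInstruction(const ParsedInst &inst) = 0;
};

// A streamer that records instructions instead of encoding them.
struct CollectingStreamer : Streamer {
    std::vector<ParsedInst> insts;
    void emitInstruction(const ParsedInst &inst) override {
        insts.push_back(inst);
    }
};

// Parse one statement of the form "mnemonic op1, op2".
// Operands are split on commas and trimmed; nothing else is interpreted.
ParsedInst parseStatement(const std::string &line) {
    std::istringstream in(line);
    ParsedInst inst;
    in >> inst.mnemonic;
    std::string op;
    while (std::getline(in, op, ',')) {
        const auto first = op.find_first_not_of(" \t");
        const auto last = op.find_last_not_of(" \t");
        if (first != std::string::npos)
            inst.operands.push_back(op.substr(first, last - first + 1));
    }
    return inst;
}

// Parse and emit statement by statement; the parser itself keeps
// no state from one statement to the next.
void run(std::istream &src, Streamer &out) {
    std::string line;
    while (std::getline(src, line)) {
        if (line.find_first_not_of(" \t") == std::string::npos)
            continue;  // skip blank lines
        out.emitInstruction(parseStatement(line));
    }
}
```

Calling run with a std::istringstream over the assembly text and a CollectingStreamer leaves the instructions, in source order, in the streamer's insts vector, which can then be inspected after parsing finishes.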