Hi everyone, we’ve been prototyping an optimizing assembler for Hexagon for the purpose of updating legacy assembly for new architectures, packet rules, and instruction latencies. It seems like others would be interested in using this and we’re looking for any related feedback: has it been attempted before, who’s interested, or any general suggestions.
We’re using the MachineFunctionInitializer created to support MIR in order to process the MC and construct MachineFunctions.
Currently the workflow sketch is:
- Use flags and code from llvm-mc to assemble a file into an MC representation.
- Use a target-specific MachineFunctionInitializer to convert MC → MI and write the contents out to a MachineFunction.
- Use flags and code from llc to run the MI through compiler passes for re-emission.
We’d need to either attach it to an existing tool or create a new one and pick somewhere in the pass pipeline to start running passes.
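To make the shape of the three steps concrete, here is a toy model in Python. It does not use any LLVM API: `MCInst`, `MachineInstr`, `MachineFunction`, `assemble`, `lower`, and `remove_self_moves` are all hypothetical stand-ins, the input syntax is a simplified "op dst, src, ..." form rather than real Hexagon assembly, and the rule that the first operand is the only def is an assumption made purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MCInst:
    """Stand-in for an assembler-level instruction: opcode plus flat operands."""
    opcode: str
    operands: List[str]


@dataclass
class MachineInstr:
    """Stand-in for a compiler-level instruction with explicit defs and uses."""
    opcode: str
    defs: List[str]
    uses: List[str]


@dataclass
class MachineFunction:
    name: str
    instrs: List[MachineInstr] = field(default_factory=list)


def assemble(text: str) -> List[MCInst]:
    """Step 1: parse 'op a, b, c' lines into MCInst records."""
    insts = []
    for line in text.splitlines():
        line = line.split("//", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        opcode, _, rest = line.partition(" ")
        operands = [o.strip() for o in rest.split(",") if o.strip()]
        insts.append(MCInst(opcode, operands))
    return insts


def lower(name: str, insts: List[MCInst]) -> MachineFunction:
    """Step 2: MC -> MI. Assumes (for this sketch only) operand 0 is the sole def."""
    mf = MachineFunction(name)
    for inst in insts:
        mf.instrs.append(
            MachineInstr(inst.opcode, inst.operands[:1], inst.operands[1:]))
    return mf


def remove_self_moves(mf: MachineFunction) -> MachineFunction:
    """Example 'pass' for step 3: delete moves of a register onto itself."""
    mf.instrs = [mi for mi in mf.instrs
                 if not (mi.opcode == "mov" and mi.defs == mi.uses)]
    return mf


def run_pipeline(text: str,
                 passes: List[Callable[[MachineFunction], MachineFunction]]
                 ) -> MachineFunction:
    """Chain the three steps: assemble, lower, then run each pass in order."""
    mf = lower("f", assemble(text))
    for p in passes:
        mf = p(mf)
    return mf


def emit(mf: MachineFunction) -> str:
    """Re-emit the function in the same toy syntax it was read in."""
    return "\n".join(f"{mi.opcode} {', '.join(mi.defs + mi.uses)}"
                     for mi in mf.instrs)
```

The open question above, where in the pass pipeline to start, corresponds to what goes into the `passes` list; in this sketch that choice is left entirely to the caller.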
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
Hi Colin,
The AMDGPU assembler would definitely benefit from this; it sounds very interesting. A year ago I tried to implement an MC->MI conversion but stopped at some point, so I never ran into the other potential difficulties there, such as building BBs.
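For reference, the basic-block step mentioned here is usually done by starting a new block at every label and after every branch. A minimal Python sketch of that idea follows, assuming a flat list of already-tokenized lines; `split_basic_blocks` and the `BRANCH_OPCODES` set are hypothetical and not taken from any LLVM target.

```python
from typing import List, Tuple

# Hypothetical terminator opcodes for this sketch; a real target would
# query its instruction descriptions instead of a hard-coded set.
BRANCH_OPCODES = {"jump", "ret"}


def split_basic_blocks(lines: List[str]) -> List[Tuple[str, List[str]]]:
    """Split a flat instruction list into (label, instructions) blocks.

    A new block starts at every 'name:' label and after every branch.
    """
    blocks: List[Tuple[str, List[str]]] = []
    label, body, named = "entry", [], False
    counter = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            # Close the current block if it has content or an explicit label,
            # so that back-to-back labels each keep their own (empty) block.
            if body or named:
                blocks.append((label, body))
            label, body, named = line[:-1], [], True
            continue
        body.append(line)
        if line.split()[0] in BRANCH_OPCODES:
            # A branch ends the block; the fall-through gets a generated name.
            blocks.append((label, body))
            counter += 1
            label, body, named = f"bb{counter}", [], False
    if body or named:
        blocks.append((label, body))
    return blocks
```

This leaves out the harder parts, such as computing successor edges, indirect branches, and (for Hexagon) branches inside packets.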
Valery
Hi Colin,
I am definitely interested!
The way I see this happening is by incrementally changing the parser of the MIR format. Basically, I’d like the parser to get smarter and smarter, to the point where it can understand assembly mnemonics and build the MachineFunction. The rest of the infrastructure would stay the same.
For different reasons, Matthias already improved the parser to avoid specifying easily computable information. The idea is to continue in that direction.
My 2-c.
Cheers,
-Quentin
I'm not sure that this is possible. The MIR format was meant to represent the program on the MachineInstr level and is more or less the same for all targets. The "optimizing assembler" would take the .s file as its input, the structure of which will differ from one architecture to the next. Not only that, but the format of an individual assembly instruction may be nontrivial to parse. For example, Hexagon instructions don't follow the typical "mnemonic op, op, ..." format, even though the MIR representation for Hexagon does.
-Krzysztof
Sure. That being said, the MIR parser could invoke the target MC parser for the target-specific syntax. I admit the line is blurry between an approach that does asm -> MC -> MI and one that does asm -> MIR with such a parser.
The reason I still think it is doable is that MIR already does "mnemonic parsing" for MI opcodes, and it does not seem that far-fetched to do the same directly on asm mnemonics.
We might want a target-specific asm to "MIR asm" kind of converter (like transforming "op a, b, c" into "a = op b, c") if that’s really too complicated.
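A minimal sketch of that kind of converter, in Python, for the simple case only. The function name `to_mir_like` is hypothetical, and the sketch assumes the first operand is always the single destination, which is not true of stores, compares, or instructions with several results; a real converter would need per-opcode operand information from the target.

```python
def to_mir_like(line: str) -> str:
    """Rewrite 'op a, b, c' as 'a = op b, c'.

    Assumption for this sketch: operand 0 is the only destination.
    Instructions with no operands are passed through unchanged.
    """
    opcode, _, rest = line.strip().partition(" ")
    operands = [o.strip() for o in rest.split(",") if o.strip()]
    if not operands:
        return opcode
    dest, sources = operands[0], operands[1:]
    # rstrip() removes the trailing space left when there are no sources.
    return f"{dest} = {opcode} {', '.join(sources)}".rstrip()
```

For example, `to_mir_like("op a, b,c")` returns `"a = op b, c"`.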
The bottom line is that, given how much logic those two things seem to share, I think we need to give it more thought before we rule out that they can be the same tool.
With this explanation it looks like they could certainly be the same tool. My original understanding about using MIR did not include the use of asm parsers and so it seemed that there would be a lot of duplicated effort.
-Krzysztof