Hazard recognition using MCInst

Dear All,

I am following a flow that generates object files (.o) from input assembly (.s) files.
The input .s file is given to the AsmParser, which creates an MCInst for each instruction after matching its opcode.
These MCInsts are passed to an MCStreamer and finally emitted to an object file by the target's code emitter.

Can hazard recognition be done on the list of MCInsts that I get after parsing the .s file?

The hazard recognition available in LLVM uses a ‘scheduling DAG of machine instructions’ and alias analysis data to check dependencies.
In my case, do I need to create a scheduling DAG from MCInst? And is it even possible to create a scheduling DAG this way?

I have read about RevGen and other discussions on creating LLVM IR from object code.
As I understand it, they suggest using a ‘code dictionary’ to convert to LLVM machine instructions and then creating a CFG from the information available.
There are limitations to this because the IR is incomplete, as pointed out in the paper “A compiler level intermediate representation based binary analysis and rewriting system”,
mentioned in a discussion here (http://stackoverflow.com/questions/6981810/translation-of-machinecode-into-llvm-ir-disassembly-reassembly-of-x86-64-x86).

I am not sure whether creating a scheduling DAG or some other intermediate representation on top of MCInst for hazard recognition makes sense.
Any suggestions on the right approach?

Regards,
Pankaj

I personally think the MachineInstruction IR is the right representation the moment you want to work with data dependencies and control flow, i.e. it doesn’t make any sense to add these features on top of MCInst. It’s just unfortunate and accidental that you need to link in all of CodeGen to use MachineInstrs.

That said, the hazard checker doesn’t need this. It just looks up MC-level information and populates a table. The current interface works on SUnits and MachineInstrs because that’s what the scheduler uses. But you could easily roll your own version that works directly from MCInst.

-Andy

Thanks for the valuable suggestion. I have a further query.

Following your input, I tried creating a scoreboard hazard recognizer that works on MCInst.
The hazard recognizer functions such as EmitInstruction and getHazardType now work on MCInstrDesc instead of SUnit.

This hazard recognizer has additional scoreboards (apart from the reserved and required scoreboards) to track the availability of the register file’s read and write ports, which an FU needs in order to execute an operation that has register operands.

These additional scoreboards are updated when ports are available and must be reserved for an FU to execute an operation.
Thus, I use the scoreboards as tables that are updated for each MCInst (or MCInstrDesc).
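
A minimal sketch of what such a port-tracking table could look like, written as standalone C++ rather than against the LLVM classes. Everything here is an assumption for illustration: the names PortDemand and PortScoreboard are hypothetical, and the per-opcode port counts and latency would in practice have to be derived from the target's MCInstrDesc or itinerary data. Reads are charged to the issue cycle and writes to the write-back cycle, which is why an instruction's latency matters even when no scheduling is done.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-opcode port demand. In a real tool this would be looked
// up from target description data for the opcode; here it is filled by hand.
struct PortDemand {
  unsigned ReadPorts;  // register-file read ports used in the issue cycle
  unsigned WritePorts; // register-file write ports used at write-back
  unsigned Latency;    // cycles from issue to write-back
};

// Hypothetical scoreboard: one counter per future cycle, kept in a ring buffer.
class PortScoreboard {
  unsigned MaxRead;
  unsigned MaxWrite;
  std::vector<unsigned> ReadUse;  // read ports reserved per cycle
  std::vector<unsigned> WriteUse; // write ports reserved per cycle
  std::size_t Head = 0;           // ring-buffer index of the current cycle

  std::size_t slot(std::size_t Offset) const {
    return (Head + Offset) % ReadUse.size();
  }

public:
  PortScoreboard(unsigned MaxR, unsigned MaxW, std::size_t Depth)
      : MaxRead(MaxR), MaxWrite(MaxW), ReadUse(Depth, 0), WriteUse(Depth, 0) {
    assert(Depth > 0 && "scoreboard needs at least one cycle");
  }

  // True if issuing D in the current cycle would oversubscribe a port.
  bool hasHazard(const PortDemand &D) const {
    assert(D.Latency < ReadUse.size() && "scoreboard too shallow");
    return ReadUse[slot(0)] + D.ReadPorts > MaxRead ||
           WriteUse[slot(D.Latency)] + D.WritePorts > MaxWrite;
  }

  // Reserve the ports D needs; the caller checks hasHazard() first.
  void emit(const PortDemand &D) {
    assert(!hasHazard(D) && "emitting into a port hazard");
    ReadUse[slot(0)] += D.ReadPorts;
    WriteUse[slot(D.Latency)] += D.WritePorts;
  }

  // Move to the next cycle and recycle the slot that just expired.
  void advanceCycle() {
    ReadUse[Head] = 0;
    WriteUse[Head] = 0;
    Head = (Head + 1) % ReadUse.size();
  }
};
```

With two read ports and one write port, a latency-2 operation issued in cycle 0 and a latency-1 operation issued in cycle 1 both write back in cycle 2, so the second one is reported as a write-port hazard even though its read ports are free.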

Here I assume that the .s input has been scheduled by the programmer, so I need not worry about the correctness of the schedule.

I am concerned only with register port availability.

I am considering using this hazard recognizer to check whether ports are available for a given MCInst to execute on an FU.

I am not clear what pitfalls I may run into with this approach. Or should this approach work well?

Do I need to worry about the ‘forward execution latency’ of the next instruction if it is greater than 1, even though I am not dealing with scheduling?

Regards,
Pankaj

Your approach to creating an MC API sounds fine.
If some ports are reserved and the machine stalls because of latency, then those ports may be free by the time the next instructions issue.
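
To illustrate the point about stalls, here is a small standalone C++ sketch under assumed names (stallCyclesNeeded and its parameters are hypothetical, not part of any LLVM API). Reserved[i] holds the number of ports already reserved i cycles from now. Each stall cycle drops the front entry, so reservations that blocked the instruction move closer to the present and eventually expire.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Returns how many cycles the machine must stall before an instruction that
// needs `Needed` ports at `Offset` cycles after issue can be issued.
// Cycles beyond the end of `Reserved` are treated as having no reservations.
unsigned stallCyclesNeeded(std::deque<unsigned> Reserved,
                           unsigned PortsPerCycle, unsigned Needed,
                           std::size_t Offset) {
  assert(Needed <= PortsPerCycle && "demand can never be satisfied");
  unsigned Stalls = 0;
  while (Offset < Reserved.size() &&
         Reserved[Offset] + Needed > PortsPerCycle) {
    Reserved.pop_front(); // one cycle passes; the oldest reservation expires
    ++Stalls;
  }
  return Stalls;
}
```

For example, with one port per cycle and reservations {1, 1, 0}, an instruction that needs its port one cycle after issue has to wait a single cycle, after which the reservation it collided with has moved out of its way.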

-Andy