[GSoC 2016] Need more info on Add a MachineModulePass

Hello,

It may be too late to start thinking about this project, but I think this is a particularly useful feature for LLVM. One quick use I can think of is implementing inter-procedural register allocation (for research purposes).

I have started looking at the code for MachineFunctionPass. As far as I can tell, a MachineModule class is not currently available (the project work would include adding it), so I am trying to work out the details needed to first create a MachineModule class that holds references to the required information. I am also trying to figure out what a MachineModule class should be composed of (by analogy with the Module class used for IR passes).

After that, I think the next step is to extend ModulePass so that it can run optimizations with enough information available.

Am I going in the right direction?
Please provide some pointers.

Sincerely,

Hi Vivek,

Hello,

It may be too late to start thinking about this project, but I think this is a particularly useful feature for LLVM.

+1. I’d like to use something like this feature in GlobalISel. The idea is that as soon as we have lowered the LLVM IR of the whole module to MachineInstr, the LLVM IR should be deallocatable.
In other words, the MachineModule/MachineFunctions should contain enough information that we do not have to keep the LLVM IR around.

One of the main challenges is the alias analysis information, which is tied to the LLVM IR but may be used in a MachineFunctionPass.

One quick use I can think of is implementing inter-procedural register allocation (for research purposes).

I have started looking at the code for MachineFunctionPass. As far as I can tell, a MachineModule class is not currently available (the project work would include adding it), so I am trying to work out the details needed to first create a MachineModule class that holds references to the required information. I am also trying to figure out what a MachineModule class should be composed of (by analogy with the Module class used for IR passes).

At least for GlobalISel, we would need a way to create and get the global variables of a Module, but lowered to MachineInstr (or MC) level.

After that, I think the next step is to extend ModulePass so that it can run optimizations with enough information available.

+1

Am I going in the right direction?

Yes, at least, it makes sense to me.

Please provide some pointers.

Other than gathering feedback on what is needed and trying, I unfortunately cannot offer anything else.

Cheers,
-Quentin

*Vivek Pandya*

Looking at the current implementation of the Module class, it seems that
the Module class maintains linked lists of GlobalVariables, Functions,
GlobalAliases ... and meta information about the module such as the file
name, module ID, target triple, DataLayout, etc. Currently, for a
MachineFunctionPass, MachineFunctionAnalysis creates the MachineFunction
object. The MachineFunction constructor takes the Function, the
TargetMachine, etc. as arguments and creates MachineBasicBlocks, and thus
provides access to MachineInstr.

Now I have a very rough idea: inherit from ModulePass and use the
information available in the Module class to create a MachineModule class.
The MachineModule class should use MachineFunctionAnalysis to create a
MachineFunction for each Function F in the Module, and it should link
these MachineFunction objects together following the original linked
list of Function objects. Similarly, we need to get an MC-lowered
representation for each feature of the Module class where that is
possible, and we should also preserve some IR analyses that are useful at
a later stage.
Then we provide various methods to use this information (access to
GlobalVariables, etc.), similar to the Module class. Once the MachineModule
class is ready, we can use it to execute user-specified operations on each
function or to provide users with iterators over the MachineFunction list.

I think I need to look at how LLVM IR is lowered to MachineIR to understand
this better.
Please provide some more info or analysis so that I can build a reasonable
and promising work plan for this.

Sincerely,
Vivek
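
A minimal sketch of the container described in the plan above, written as standalone C++ rather than against the LLVM headers. Every type here (ToyFunction, ToyModule, ToyMachineFunction, ToyMachineModule) is a hypothetical stand-in invented for illustration; none of them is an existing LLVM class, and the real lowering work done by instruction selection is left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for llvm::Function and llvm::Module; NOT the real classes.
struct ToyFunction {
  std::string Name;
};

struct ToyModule {
  std::string TargetTriple;
  std::vector<ToyFunction> Functions;       // kept in module order
  std::vector<std::string> GlobalVariables; // names only, for illustration
};

// Toy stand-in for llvm::MachineFunction.
struct ToyMachineFunction {
  std::string Name;
  std::vector<std::string> Instrs; // placeholder for MachineInstrs
};

// Hypothetical MachineModule: owns one machine function per IR function,
// in the same order as the IR module's function list, and copies the
// module-level data it needs instead of pointing back at the IR module.
class ToyMachineModule {
  std::string TargetTriple;
  std::vector<std::string> Globals;
  std::vector<std::unique_ptr<ToyMachineFunction>> MFs;

public:
  explicit ToyMachineModule(const ToyModule &M)
      : TargetTriple(M.TargetTriple), Globals(M.GlobalVariables) {
    for (const ToyFunction &F : M.Functions) {
      auto MF = std::make_unique<ToyMachineFunction>();
      MF->Name = F.Name; // real code would run instruction selection here
      MFs.push_back(std::move(MF));
    }
  }

  const std::string &getTargetTriple() const { return TargetTriple; }
  const std::vector<std::string> &globals() const { return Globals; }
  std::size_t size() const { return MFs.size(); }

  ToyMachineFunction &getFunction(std::size_t I) {
    assert(I < MFs.size() && "function index out of range");
    return *MFs[I];
  }

  // Iteration yields std::unique_ptr<ToyMachineFunction>& elements.
  auto begin() { return MFs.begin(); }
  auto end() { return MFs.end(); }
};
```

A module-level pass could then walk all machine functions with `for (auto &MF : MM)`, which is the kind of whole-program view a per-function pass does not get.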

Hi Vivek,

This part is the job of instruction selection. The machine instructions after ISel are pretty much standalone, modulo some back links to the LLVM IR for memory addresses and such. Those same back links are used to query the alias analysis.
The link from the MachineFunction to the Function is there to provide access to Function attributes and other utilities. I believe we could transfer the ownership of those things as part of the lowering.
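
To make the ownership-transfer idea concrete, here is a small standalone C++ sketch. ToyIRFunction, ToyLoweredFunction, and lowerAndTakeOwnership are hypothetical names made up for this example and are not LLVM APIs; the assumption is simply that lowering consumes the IR-level object and moves the data the backend still needs into the machine-level object.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an IR function and its attributes (not llvm::Function).
struct ToyIRFunction {
  std::string Name;
  std::vector<std::string> Attributes; // e.g. "noinline", "optsize"
};

// Toy machine-level function that owns everything it needs, so it keeps
// no pointer back to the IR function.
struct ToyLoweredFunction {
  std::string Name;
  std::vector<std::string> Attributes;

  bool hasAttribute(const std::string &A) const {
    for (const std::string &X : Attributes)
      if (X == A)
        return true;
    return false;
  }
};

// Hypothetical lowering step: takes ownership of the IR function, moves its
// name and attributes into the machine-level object, and lets the IR object
// be destroyed when the parameter goes out of scope.
ToyLoweredFunction lowerAndTakeOwnership(std::unique_ptr<ToyIRFunction> F) {
  assert(F && "expected a non-null IR function");
  ToyLoweredFunction MF;
  MF.Name = std::move(F->Name);
  MF.Attributes = std::move(F->Attributes);
  return MF; // the IR object is freed on return; MF stays valid on its own
}
```

After the call, attribute queries are answered from the machine-level object alone, which is what would allow the IR to be deallocated.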

*Vivek Pandya*

I found a great blog post on this topic:
http://eli.thegreenplace.net/2012/11/24/life-of-an-instruction-in-llvm
Quentin, do you think the plan mentioned above is reasonable and would
work?

I think this is a separate issue from having a MachineModulePass. My goal in having a MachineModulePass is to be able to do inter-procedural analysis and transformation on a program after its code has been generated. Examples of such applications include inter-procedural register allocation, inter-procedural instruction selection, inter-procedural code layout optimization (I have a colleague working on this using reference affinity theory), and inter-procedural analysis of machine code for measuring the efficacy of compiler transformations for security.

Whether one keeps the LLVM IR around or not is orthogonal. What is needed is a way of being able to examine the whole program without having to worry about another transformation running in parallel (if you use a MachineFunctionPass, as I understand it, the pass should not be analyzing or transforming other MachineFunctions).

One of the other projects I proposed is to encode the alias analysis information within the MachineInstr IR. This would be very useful for one of my projects, and it might solve this problem as well (once the information is encoded in the MachineInstr IR, you don’t need the LLVM IR around anymore).

Regards,
John Criswell
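
As an illustration of the kind of whole-program analysis mentioned above, the following standalone C++ sketch computes, for each function, the set of registers clobbered by the function itself or by anything it calls. This is the sort of fact an inter-procedural register allocator might consume. ToyMF, ToyProgram, and computeTransitiveClobbers are hypothetical names for this example only, and callees that are not in the program are simply ignored, whereas a real analysis would have to treat them conservatively.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy machine function: only the facts this analysis needs.
struct ToyMF {
  std::string Name;
  std::set<std::string> ClobberedRegs; // registers written in its own body
  std::vector<std::string> Callees;    // names of directly called functions
};

using ToyProgram = std::vector<ToyMF>;

// Hypothetical body of a module-level pass: propagate clobber sets along
// call edges until nothing changes (a simple fixed-point iteration).
std::map<std::string, std::set<std::string>>
computeTransitiveClobbers(const ToyProgram &P) {
  std::map<std::string, std::set<std::string>> Result;
  for (const ToyMF &MF : P)
    Result[MF.Name] = MF.ClobberedRegs;

  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (const ToyMF &MF : P) {
      std::set<std::string> &Mine = Result[MF.Name];
      for (const std::string &Callee : MF.Callees) {
        if (Callee == MF.Name)
          continue; // direct recursion adds nothing new
        auto It = Result.find(Callee);
        if (It == Result.end())
          continue; // external callee: ignored in this toy version
        std::size_t Before = Mine.size();
        Mine.insert(It->second.begin(), It->second.end());
        if (Mine.size() != Before)
          Changed = true;
      }
    }
  }
  return Result;
}
```

The point of the sketch is that the loop reads every function in the program, which a pass restricted to one MachineFunction at a time is not supposed to do.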

I agree. I was describing my use case to help with designing the whole thing. And yes, it does not have to be addressed for the MachineModule to be usable for other things.

Encoding that information in the MachineInstr IR sounds like a possible solution. Do you have a link to the thread where this has been discussed?

Thanks,
-Quentin
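
One way to picture alias information that no longer depends on the LLVM IR is a memory operand described purely by plain data. The standalone C++ sketch below is only an assumption about what such an encoding could look like; ToyMemOperand and mayAlias are hypothetical names, and the query assumes that distinct object ids always denote disjoint objects.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical self-contained memory operand: the accessed location is
// described by plain data instead of a pointer back to an IR value.
struct ToyMemOperand {
  std::uint32_t ObjectId; // underlying object, e.g. a global or stack slot
  std::int64_t Offset;    // byte offset within that object
  std::uint64_t Size;     // number of bytes accessed
};

// Toy alias query: two accesses may alias only if they touch the same
// object and their byte ranges [Offset, Offset + Size) overlap.
bool mayAlias(const ToyMemOperand &A, const ToyMemOperand &B) {
  if (A.ObjectId != B.ObjectId)
    return false;
  std::int64_t AEnd = A.Offset + static_cast<std::int64_t>(A.Size);
  std::int64_t BEnd = B.Offset + static_cast<std::int64_t>(B.Size);
  return A.Offset < BEnd && B.Offset < AEnd;
}
```

Because the query reads nothing but the operands themselves, it would keep working after the IR has been freed.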

Hi Vivek,

From a high-level point of view, that looks reasonable.

Cheers,
-Quentin

It’s a project idea that I added to the Open Projects page. I don’t think I’ve seen anyone volunteering to work on it on the mailing lists.

Regards,
John Criswell