How to keep FunctionPass analysis result alive in Module Pass?

Hello,
  I am trying to write a new ModulePass that uses the LoopInfo analysis result, but it seems I misunderstand some concept of the PassManager. Basically, I want to keep the LoopInfo analysis result alive. Here is an example showing the problem I encountered, assuming I have already called addRequired<llvm::LoopInfo>() in getAnalysisUsage:
  
  void foo(llvm::Function *F1, llvm::Function *F2) {
    llvm::LoopInfo *LI1, *LI2;
    LI1 = &getAnalysis<llvm::LoopInfo>(*F1);
    llvm::Loop* L1 = LI1->getLoopFor(F1->begin());
    LI2 = &getAnalysis<llvm::LoopInfo>(*F2);
    llvm::Loop* L2 = LI2->getLoopFor(F2->begin());
    L1->dump(); // crash
    L2->dump();
  }

  I checked why this program crashes. It is because getAnalysis returns the same LoopInfo instance every time. On each call it clears the previous results and runs on the new function, so the pointer L1 is invalidated by the call to getAnalysis<llvm::LoopInfo>(*F2).
  
  My question is whether there is a way around this, so that the FunctionPass analysis results for all functions stay alive during my ModulePass. I am using the LLVM 3.1 svn version. I would really appreciate your help!

Best,
Fan

Hello,
	I am trying to write a new ModulePass that uses the LoopInfo analysis result, but it seems I misunderstand some concept of the PassManager. Basically, I want to keep the LoopInfo analysis result alive. Here is an example showing the problem I encountered, assuming I have already called addRequired<llvm::LoopInfo>() in getAnalysisUsage:
	
	void foo(llvm::Function *F1, llvm::Function *F2) {
		llvm::LoopInfo *LI1, *LI2;
		LI1 = &getAnalysis<llvm::LoopInfo>(*F1);
		llvm::Loop* L1 = LI1->getLoopFor(F1->begin());
		LI2 = &getAnalysis<llvm::LoopInfo>(*F2);
		llvm::Loop* L2 = LI2->getLoopFor(F2->begin());
		L1->dump();  // crash
		L2->dump();
	}

	I checked why this program crashes. It is because getAnalysis returns the same LoopInfo instance every time. On each call it clears the previous results and runs on the new function, so the pointer L1 is invalidated by the call to getAnalysis<llvm::LoopInfo>(*F2).

To the best of my knowledge, the LLVM pass manager never preserves a FunctionPass analysis that is requested by a ModulePass; every time you call getAnalysis for a function, the FunctionPass is re-run.

	
	My question is whether there is a way around this, so that the FunctionPass analysis results for all functions stay alive during my ModulePass. I am using the LLVM 3.1 svn version. I would really appreciate your help!

The trick I’ve used is to structure the code so that getAnalysis<>() is only called once per function. For example, your ModulePass can have a std::map that maps between Function * and LoopInfo *. You then provide a method getLoopInfo(Function * F) that checks to see if F is in the map. If it is, it returns what is in the map. If it isn’t, it calls getAnalysis on F, stores the result in the map, and returns the LoopInfo pointer.

This is important not only for functionality (in your case) but also for performance; you don’t want to calculate an analysis twice for the same function.

– John T.

Thank you for your quick reply.

Actually, I am already using a std::map to map Function* to LoopInfo*, but that does not help in this case. Each time I call getAnalysis<llvm::LoopInfo>(*F), it returns the same instance of llvm::LoopInfo, so the std::map just maps every function to the same instance. It seems only the analysis result for the last function is valid, because the results for all previous functions are erased.

The only workaround I have now is to copy all analysis results out of the LoopInfo data structure before I make the next call to getAnalysis(). Because llvm::LoopInfo does not provide a copy method, doing so will be very dirty.

Best,
Fan

Thank you for your quick reply.

Actually, I am already using a std::map to map Function* to LoopInfo*, but that does not help in this case. Each time I call getAnalysis<llvm::LoopInfo>(*F), it returns the same instance of llvm::LoopInfo, so the std::map just maps every function to the same instance. It seems only the analysis result for the last function is valid, because the results for all previous functions are erased.

Just to make sure I understand: you are saying that every time you call getAnalysis(), you get the same LoopInfo * regardless of whether you call it on the same function or on a different function. Is that correct?

Getting the same LoopInfo * when you call getAnalysis<> on the same function twice would not surprise me. Getting the same LoopInfo * when you call getAnalysis on F1 and F2 where F1 and F2 are different functions would surprise me greatly.

The only workaround I have now is to copy all analysis results out of the LoopInfo data structure before I make the next call to getAnalysis(). Because llvm::LoopInfo does not provide a copy method, doing so will be very dirty.

Yes, that may be what you have to do.

– John T.

This surprises me too. Here is the real code from my module pass:

bool SymbolicDataflow::runOnModule(llvm::Module &M) {
  // Init per module goes here
  AA = &getAnalysis<llvm::AliasAnalysis>();
  LIs.clear();
  DTs.clear();
  for (llvm::Module::iterator it = M.begin(); it != M.end(); ++it) {
    llvm::Function *F = &*it;
    if (!F->isDeclaration()) {
      llvm::LoopInfo *LI = &getAnalysis<llvm::LoopInfo>(*F);
      llvm::DominatorTree *DT = &getAnalysis<llvm::DominatorTree>(*F);
      LIs[F] = LI;
      DTs[F] = DT;
      DEBUG(llvm::errs() << "PASS INIT " << LI << " " << DT << " " << F->getName() << "\n");
    }
  }
……

It prints out the pointer value of each instance, and it is the same for every function... at least on my machine.

Best,
Fan

Hi John & Fan,

I hit the exact same problem today. I can confirm that Fan’s observation of getting the same LoopInfo* from subsequent calls to getAnalysis(function) for distinct functions is indeed true.

I was very surprised by this at first as well, but I think I’ve found an explanation - please anyone correct me if this is wrong:

What you’re getting from getAnalysis<>(function) is a reference to the function pass after it has been run on the specified function. While you can run a function pass on many different functions, there still exists only one instance of the pass itself. The only thing that changes between different calls to getAnalysis(F) is the analysis information held by the LoopInfo pass in its LoopInfoBase member. It gets released and overwritten on every call to LoopInfo::runOnFunction() - see the call to releaseMemory() right at the beginning.
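To make the explanation above concrete, here is a minimal toy model in plain C++. It does not use LLVM at all: ToyFunction, ToyLoopAnalysis and ToyModulePass are hypothetical stand-ins invented for illustration, and the only behaviour borrowed from the thread is "one pass instance, results released at the start of every run". Under those assumptions, a Function-to-analysis map ends up holding the same pointer for every function:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a function: a name plus the labels of its loop headers.
struct ToyFunction {
  std::string Name;
  std::vector<std::string> LoopHeaders;
};

// Toy stand-in for an analysis pass. One object is reused for every function.
class ToyLoopAnalysis {
  std::vector<std::string> Loops; // results for the most recent run only

public:
  void releaseMemory() { Loops.clear(); }

  void runOnFunction(const ToyFunction &F) {
    releaseMemory(); // previous results are dropped before recomputing
    Loops = F.LoopHeaders;
  }

  const std::vector<std::string> &loops() const { return Loops; }
};

// Toy stand-in for a module pass that can reach exactly one analysis instance.
class ToyModulePass {
  ToyLoopAnalysis TheAnalysis;

public:
  ToyLoopAnalysis &getAnalysis(const ToyFunction &F) {
    TheAnalysis.runOnFunction(F);
    return TheAnalysis;
  }
};

// Caches one pointer per function and reports whether the pointers differ.
bool cachedPointersAreDistinct() {
  ToyFunction F1{"f1", {"f1.loop"}};
  ToyFunction F2{"f2", {"f2.loop.a", "f2.loop.b"}};
  ToyModulePass MP;
  std::map<const ToyFunction *, ToyLoopAnalysis *> Cache;
  Cache[&F1] = &MP.getAnalysis(F1);
  Cache[&F2] = &MP.getAnalysis(F2);
  return Cache[&F1] != Cache[&F2];
}

// Loop count read through the pointer obtained for F1 after F2 was analysed.
std::size_t loopsSeenThroughFirstPointer() {
  ToyFunction F1{"f1", {"f1.loop"}};
  ToyFunction F2{"f2", {"f2.loop.a", "f2.loop.b"}};
  ToyModulePass MP;
  ToyLoopAnalysis *LI1 = &MP.getAnalysis(F1);
  assert(LI1->loops().size() == 1); // valid only until the next run
  MP.getAnalysis(F2);               // overwrites what LI1 points at
  return LI1->loops().size();
}
```

In this model the pointer obtained for f1 silently describes f2 after the second call. The real case is worse, since the Loop objects themselves are freed, which is consistent with the crash Fan reported.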

The idea of creating some sort of Map of Function* ----> LoopInfo* therefore won’t work. It also doesn’t make sense to keep Loop* pointers around after getAnalysis() has been called again because all that memory gets released (which is how I hit this problem)…

Now, Fan, the practical consequence of this is that if you want to use LoopInfo in a ModulePass, you either have to do all your work that uses LoopInfo in between getAnalysis calls (if that’s possible you’re probably better off writing a FunctionPass in the first place) OR keep re-running getAnalysis which is very inefficient. I’d imagine the same goes for DominatorTree.
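The first option, doing all the work for one function before asking for the next, can be sketched with the same kind of toy model. Again this is plain C++ and not the LLVM API: ToyFunction, ToyLoopAnalysis, exampleModule and countLoopsInModule are hypothetical names, and the "work" is just counting loops.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a function: a name plus the labels of its loop headers.
struct ToyFunction {
  std::string Name;
  std::vector<std::string> LoopHeaders;
};

// Toy stand-in for an analysis pass whose results are replaced on every run.
class ToyLoopAnalysis {
  std::vector<std::string> Loops;

public:
  void runOnFunction(const ToyFunction &F) {
    Loops.clear(); // mirrors the release-then-recompute behaviour
    Loops = F.LoopHeaders;
  }

  const std::vector<std::string> &loops() const { return Loops; }
};

// Builds a small hypothetical module with three functions.
std::vector<ToyFunction> exampleModule() {
  std::vector<ToyFunction> M;
  M.push_back(ToyFunction{"f1", {"f1.loop"}});
  M.push_back(ToyFunction{"f2", {"f2.loop.a", "f2.loop.b"}});
  M.push_back(ToyFunction{"f3", {}});
  return M;
}

// Consumes each function's result before the next run overwrites it.
std::size_t countLoopsInModule(const std::vector<ToyFunction> &Module) {
  ToyLoopAnalysis Analysis;
  std::size_t Total = 0;
  for (const ToyFunction &F : Module) {
    Analysis.runOnFunction(F);
    // Everything that needs this function's loops happens here, in between runs.
    assert(Analysis.loops().size() == F.LoopHeaders.size());
    Total += Analysis.loops().size();
  }
  return Total;
}
```

Nothing outlives a single loop iteration except the accumulated total, so no stale pointer can be dereferenced. The cost is that any cross-function logic has to be expressed in terms of such accumulated values.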

In general, it would be nice if there was some logical separation between a Function Pass and the Analysis Information it produces. For LoopInfo, it’s kind of there since all the data is in this LoopInfoBase object but there is no way of taking ownership of that…

– Tobias

Hi John & Fan,

I hit the exact same problem today. I can confirm that Fan’s observation of getting the same LoopInfo* from subsequent calls to getAnalysis(function) for distinct functions is indeed true.

I was very surprised by this at first as well, but I think I’ve found an explanation - please anyone correct me if this is wrong:

What you’re getting from getAnalysis<>(function) is a reference to the function pass after it has been run on the specified function. While you can run a function pass on many different functions, there still exists only one instance of the pass itself. The only thing that changes between different calls to getAnalysis(F) is the analysis information held by the LoopInfo pass in its LoopInfoBase member. It gets released and overwritten on every call to LoopInfo::runOnFunction() - see the call to releaseMemory() right at the beginning.

That seems like a reasonable explanation.

The idea of creating some sort of Map of Function* ----> LoopInfo* therefore won’t work. It also doesn’t make sense to keep Loop* pointers around after getAnalysis() has been called again because all that memory gets released (which is how I hit this problem)…

Now, Fan, the practical consequence of this is that if you want to use LoopInfo in a ModulePass, you either have to do all your work that uses LoopInfo in between getAnalysis calls (if that’s possible you’re probably better off writing a FunctionPass in the first place) OR keep re-running getAnalysis which is very inefficient. I’d imagine the same goes for DominatorTree.

In general, it would be nice if there was some logical separation between a Function Pass and the Analysis Information it produces. For LoopInfo, it’s kind of there since all the data is in this LoopInfoBase object but there is no way of taking ownership of that…

Can’t you just copy the analysis results out of LoopInfo as Fan suggested? I would think that if you can query it, you can copy it.

– John T.
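One way to read "if you can query it, you can copy it" is to copy only the facts you query into a structure you own, rather than copying the analysis object itself. The sketch below shows that idea on a toy model in plain C++. It is not LLVM code: ToyLoop, ToyLoopAnalysis, LoopSummary and snapshotModule are hypothetical names, and which facts are worth keeping (here a header label and a depth) is an assumption that depends on the pass being written.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a loop as seen through the analysis.
struct ToyLoop {
  std::string Header;
  unsigned Depth;
};

// Toy stand-in for a function: a name plus the loops it contains.
struct ToyFunction {
  std::string Name;
  std::vector<ToyLoop> Loops;
};

// Toy stand-in for an analysis pass that cannot be copied as a whole.
class ToyLoopAnalysis {
  std::vector<ToyLoop> Loops;

public:
  ToyLoopAnalysis() = default;
  ToyLoopAnalysis(const ToyLoopAnalysis &) = delete;
  ToyLoopAnalysis &operator=(const ToyLoopAnalysis &) = delete;

  void runOnFunction(const ToyFunction &F) {
    Loops.clear(); // previous results are dropped first
    Loops = F.Loops;
  }

  const std::vector<ToyLoop> &loops() const { return Loops; }
};

// Plain data owned by the caller; it survives later runs of the analysis.
struct LoopSummary {
  std::string Header;
  unsigned Depth;
};

typedef std::map<std::string, std::vector<LoopSummary> > SummaryMap;

// Builds a small hypothetical module with two functions.
std::vector<ToyFunction> exampleModule() {
  std::vector<ToyFunction> M;
  M.push_back(ToyFunction{"f1", {ToyLoop{"f1.outer", 1}, ToyLoop{"f1.inner", 2}}});
  M.push_back(ToyFunction{"f2", {ToyLoop{"f2.loop", 1}}});
  return M;
}

// Queries each function's result and copies the queried facts out immediately.
SummaryMap snapshotModule(const std::vector<ToyFunction> &Module) {
  ToyLoopAnalysis Analysis;
  SummaryMap Result;
  for (const ToyFunction &F : Module) {
    Analysis.runOnFunction(F);
    std::vector<LoopSummary> &Out = Result[F.Name];
    for (const ToyLoop &L : Analysis.loops())
      Out.push_back(LoopSummary{L.Header, L.Depth});
    assert(Out.size() == Analysis.loops().size());
  }
  return Result;
}
```

The snapshot is keyed by function name only to keep the example self-contained. How much of a real loop nest can be flattened this way without losing what the pass needs is exactly the open question in this thread.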

Hi John,

Glad the explanation made sense :)

[…]

Can’t you just copy the analysis results out of LoopInfo as Fan suggested? I would think that if you can query it, you can copy it.

Well, LoopInfoBase has a private copy constructor (for good reasons), so you can’t just copy the entire thing out. The same goes for the Loop class. At the end of the day, the work you were going to do later with the analysis results has to be done right at this point, before the next getAnalysis call. That’s possible in my case (although it means I have to restructure my module pass quite a bit), but it might not always be.

– Tobias