From: "Pete Cooper" <peter_cooper@apple.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Chandler Carruth" <chandlerc@gmail.com>, "Duncan P. N. Exon
Smith" <dexonsmith@apple.com>, "Philip Reames"
<listmail@philipreames.com>, "Mehdi Amini" <mehdi.amini@apple.com>,
"Rafael Espíndola" <rafael.espindola@gmail.com>, "llvm-dev"
<llvm-dev@lists.llvm.org>, "Quentin Colombet" <qcolombet@apple.com>,
"Eric Christopher" <echristo@gmail.com>
Sent: Thursday, May 12, 2016 6:39:25 PM
Subject: Re: [llvm-dev] Deleting function IR after codegen
> > From: "Chandler Carruth" <chandlerc@gmail.com>
> > To: "Quentin Colombet" <qcolombet@apple.com>, "Eric Christopher"
> > <echristo@gmail.com>
> > Cc: "Pete Cooper" <peter_cooper@apple.com>, "Duncan P. N. Exon
> > Smith" <dexonsmith@apple.com>, "Hal Finkel" <hfinkel@anl.gov>,
> > "Philip Reames" <listmail@philipreames.com>, "Mehdi Amini"
> > <mehdi.amini@apple.com>, "Rafael Espíndola"
> > <rafael.espindola@gmail.com>, "llvm-dev" <llvm-dev@lists.llvm.org>
> > Sent: Thursday, May 12, 2016 6:11:34 PM
> > Subject: Re: [llvm-dev] Deleting function IR after codegen
>
> > FWIW, +1 from me as well.
>
> > But I don't think you need to make this a module pass or anything
> > else. I think you should leave the husks of the functions around and
> > just nuke the IR out from under them. That way the module surface
> > remains essentially identical. You can also probably nuke all the
> > instructions from BBs with their addresses taken for jump tables, etc.
>
> I agree. We need to be careful about invalidating inter-procedural
> IR-level analysis results that we make use of during CodeGen (like
> AA). Consider, for example, the following situation:
> 1. We CodeGen a function foo(), and then remove its IR. During this
> process, we never used an AA query that reached CFL-AA
> 2. We CodeGen a function bar(), and bar() calls foo().
> 3. During (2), we make an AA query on instructions in bar(), which
> have MMOs with IR Values in bar()
> 4. CFL-AA is reached and does not have a cached graph for bar(), so
> it builds one
> 5. While building the graph for bar(), it reaches the call to foo(),
> and calls tryInterproceduralAnalysis
> 6. CFL-AA does not have a cached graph for foo(), so
> tryInterproceduralAnalysis triggers one to be constructed
> 7. But foo() is now empty, and so has trivial aliasing properties
> 8. We return an incorrect AA result when compiling bar()
Hmm. This is an interesting use case. Is this even legal in the
current pass manager? Codegen is a FunctionPass which I thought
could only look at global variables and its own function.
Or is the rule that a FunctionPass has read/write access to its own
function, but read-only access to other functions, which is effectively
what you’d need for the CFL-AA case above?
I believe that, technically speaking, this is not legal. CFL-AA should really be a module pass if it wants to do IPA. I think that, at the time it was originally developed, it was made a function pass so that it could work without modifying every other function pass in the pipeline to mark it as preserved. Of course, we later did this anyway for GlobalsAA (a module-level analysis), so we should probably go back and do the same for CFL-AA.
I'm not sure that's the important distinction, however. GlobalsAA pre-computes all necessary data for the module up front (when runOnModule is called), so the scenario I described has no analogue there. If GlobalsAA computed information on globals lazily (like CFL-AA builds function graphs lazily), it would have the same problem.
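To make the lazy-versus-eager distinction concrete, here is a minimal sketch in plain C++. It does not use any LLVM API: ToyFunction, ToyModule, deleteBody and summarize are hypothetical stand-ins invented for illustration, and the "analysis" is just a may-write-memory bit propagated through callees (assuming an acyclic call graph). It only shows why a summary built after a body has been deleted can be wrong, while one cached beforehand is not.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a function's IR: its callees and whether its
// own body writes memory. This is not an LLVM type.
struct ToyFunction {
  std::vector<std::string> Callees;
  bool WritesMemory = false;
};

using ToyModule = std::map<std::string, ToyFunction>;

// Mimics deleting a function's IR after codegen while keeping the husk.
void deleteBody(ToyModule &M, const std::string &Name) {
  M.at(Name) = ToyFunction{};
}

// Summarizes a function from its body and, recursively, its callees,
// caching each result. Assumes an acyclic call graph to stay short.
bool summarize(const ToyModule &M, const std::string &Name,
               std::map<std::string, bool> &Cache) {
  auto It = Cache.find(Name);
  if (It != Cache.end())
    return It->second;
  const ToyFunction &F = M.at(Name);
  bool Writes = F.WritesMemory;
  for (const std::string &Callee : F.Callees)
    Writes = summarize(M, Callee, Cache) || Writes;
  Cache[Name] = Writes;
  return Writes;
}

// foo() writes memory; bar() only calls foo().
ToyModule makeModule() {
  ToyModule M;
  M["foo"].WritesMemory = true;
  M["bar"].Callees.push_back("foo");
  return M;
}

// Lazy: nothing is cached before foo()'s body is deleted, so the query on
// bar() builds foo()'s summary from the empty husk.
bool lazyQueryAfterDeletion() {
  ToyModule M = makeModule();
  std::map<std::string, bool> Cache;
  deleteBody(M, "foo");
  return summarize(M, "bar", Cache);
}

// Eager: every summary is computed up front, so later deletion of foo()'s
// body does not affect the cached answer for bar().
bool eagerQueryAfterDeletion() {
  ToyModule M = makeModule();
  std::map<std::string, bool> Cache;
  for (const auto &Entry : M)
    summarize(M, Entry.first, Cache);
  deleteBody(M, "foo");
  return summarize(M, "bar", Cache);
}
```

In this toy model, lazyQueryAfterDeletion() reports that bar() does not write memory, which is wrong, while eagerQueryAfterDeletion() still reports that it does.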
I'm not claiming that this is a good thing; I just wanted to point out that it can be a problem given code we have in-tree now.
-Hal