New Pass Manager and CGSCCPassManager

Hi all,

I've run into a sticky situation with CGSCCPassManager. I have a module
pass that needs to run after all inlining has occurred but *before*
loops have been optimized significantly. I'm not sure this is possible
with the way CGSCCPassManager is formulated, at least not without
hackery.

My pass has to be a module pass or a CGSCC pass because it has to add
global variables and function declarations. It does not otherwise modify
global state or the call graph SCC.

Incidentally WRT my pass, this page about pass requirements seems
ambiguous:

http://llvm.org/docs/WritingAnLLVMPass.html

  To be explicit, FunctionPass subclasses are not allowed to:

  1. Inspect or modify a Function other than the one currently being processed.
  2. Add or remove Functions from the current Module.
  3. Add or remove global variables from the current Module.
  4. Maintain state across invocations of runOnFunction (including global data).

#2 is ambiguous to me. Does "add or remove Functions" mean definitions
only or both definitions and declarations? It might be helpful to
clarify that in this section of the document.

As I understand things, CGSCCPassManager is designed to run things in a
bottom-up manner:

  // #1
  for scc in scc_list {
    inline callees everywhere
    do other CG passes
    for function in bottom_up(scc) {
      run function passes
    }
  }

Have I got that right? I have some questions below about this general
structure that are only tangentially related to my main issue.
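
To make the visiting order in #1 concrete, here is a minimal sketch of
a bottom-up SCC walk over a toy call graph. It is not LLVM code: the
CallGraph alias and the bottomUpSCCs function are hypothetical names
made up for this illustration, and functions are plain integer indices.
It relies on the fact that Tarjan's algorithm emits an SCC only after
every SCC reachable from it, which gives a callee-before-caller order.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Toy call graph (hypothetical, not an LLVM type):
// function i calls every function listed in callees[i].
using CallGraph = std::vector<std::vector<int>>;

// Returns the SCCs in bottom-up (callee-before-caller) order using
// Tarjan's algorithm. Members of each SCC are sorted ascending.
std::vector<std::vector<int>> bottomUpSCCs(const CallGraph &callees) {
  const int n = static_cast<int>(callees.size());
  std::vector<int> index(n, -1), lowlink(n, 0);
  std::vector<bool> onStack(n, false);
  std::vector<int> stack;
  std::vector<std::vector<int>> sccs;
  int next = 0;

  std::function<void(int)> visit = [&](int v) {
    index[v] = lowlink[v] = next++;
    stack.push_back(v);
    onStack[v] = true;
    for (int w : callees[v]) {
      assert(w >= 0 && w < n && "callee index out of range");
      if (index[w] == -1) {
        visit(w); // unvisited callee: recurse first
        lowlink[v] = std::min(lowlink[v], lowlink[w]);
      } else if (onStack[w]) {
        lowlink[v] = std::min(lowlink[v], index[w]); // edge back into the open SCC
      }
    }
    if (lowlink[v] == index[v]) { // v is the root of an SCC: pop it off
      std::vector<int> scc;
      int w;
      do {
        w = stack.back();
        stack.pop_back();
        onStack[w] = false;
        scc.push_back(w);
      } while (w != v);
      std::sort(scc.begin(), scc.end());
      sccs.push_back(scc);
    }
  };

  for (int v = 0; v < n; ++v)
    if (index[v] == -1)
      visit(v);
  return sccs;
}
```

For a graph where function 0 calls 1 and 2, 1 calls 2, and 2 and 3 call
each other, the mutually recursive pair {2, 3} comes out first, then
{1}, then {0}. That is the order in which a bottom-up walk would visit
them.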

I need to be able to do something like this:

  // #2
  for scc in scc_list {
    inline callees everywhere
    run my pass
    do other CG passes
    for function in bottom_up(scc) {
      run function passes
    }
  }

It doesn't seem possible currently because there's no adaptor to run a
module pass inside CGSCCPassManager. You can run a CGSCCPassManager
inside a ModulePassManager but not the other way around. It kind of
makes sense because a call graph SCC doesn't necessarily contain all of
the Functions in a Module. On the other hand, my pass doesn't really
care that it might not see all Functions in the Module in a single
invocation. It would eventually see them all as CGSCCPassManager
processes additional SCCs.

I suppose I could make my pass a CGSCC pass but that seems like overkill
for my purposes. Indeed, I had no need to do this with the Old Pass
Manager as inlining ran in a ModulePassManager, not a CGSCCPassManager.

When I first looked into this I expected the inliner SCC algorithm to
work something like this:

  // #3
  for scc in scc_list {
    for function in bottom_up(scc) {
      inline callees into function
      run function passes
    }
    do other CG passes
  }

But it apparently doesn't work that way. If it did I would be in really
bad shape because there would be no way to run my pass after all
inlining has occurred but before loops have been significantly altered.
It is functionally incorrect for my pass to modify a Function B and have
B inlined into the Function A which my pass also modifies.

Another option would be to split the current CGSCC pass pipeline in two,
creating one pipeline for things to run before my pass and another for
things to run after my pass. But upstream is definitely not interested
in my pass so this would be a downstream change and rather burdensome to
maintain.

Now for the additional questions about CGSCCPassManager mentioned above.

From the pseudocode #1 above, it looks like all inlining happens before
any optimization. This seems sub-optimal to me because transformations
may make Functions good inline candidates when they were not previously.
Is this a known issue with the current setup? I'm kind of glad it works
like #1 (if indeed it does) because it at least makes my goal
theoretically attainable. But another part of me really wants it to
work like pseudocode #3 because it seems better for optimization.

Thanks for all insights and help!

              -David

Hi all,

I’ve run into a sticky situation with CGSCCPassManager. I have a module
pass that needs to run after all inlining has occurred but before
loops have been optimized significantly. I’m not sure this is possible
with the way CGSCCPassManager is formulated, at least not without
hackery.

Could you explain what your pass does and why it needs to be where it needs to be?

My pass has to be a module pass or a CGSCC pass because it has to add
global variables and function declarations. It does not otherwise modify
global state or the call graph SCC.

Incidentally WRT my pass, this page about pass requirements seems
ambiguous:

http://llvm.org/docs/WritingAnLLVMPass.html

To be explicit, FunctionPass subclasses are not allowed to:

  1. Inspect or modify a Function other than the one currently being processed.
  2. Add or remove Functions from the current Module.
  3. Add or remove global variables from the current Module.
  4. Maintain state across invocations of runOnFunction (including global data).

#2 is ambiguous to me. Does “add or remove Functions” mean definitions
only or both definitions and declarations? It might be helpful to
clarify that in this section of the document.

At least for the NPM, it was designed with potential future concurrency in mind. Modifying the list of functions in a module, even just declarations, could mess with that. http://llvm.org/docs/WritingAnLLVMPass.html is more of a legacy PM tutorial. I started on http://llvm.org/docs/WritingAnLLVMNewPMPass.html for the NPM; I can clarify that there.

As I understand things, CGSCCPassManager is designed to run things in a
bottom-up manner:

  // #1
  for scc in scc_list {
    inline callees everywhere
    do other CG passes
    for function in bottom_up(scc) {
      run function passes
    }
  }

Have I got that right? I have some questions below about this general
structure that are only tangentially related to my main issue.

The inliner inlines calls within the function; it doesn’t look at callers of the current function. A CGSCC pass shouldn’t look at anything above the current SCC. As you mentioned below, this is what makes callers see the most optimized version of these functions when deciding to inline or not.
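
A toy event trace may make this ordering easier to see. The sketch
below is a simplified model, not LLVM code: SCCList, tracePerSCCPipeline
and eventIndex are hypothetical names, the SCC list is assumed to be
given callees-first, and real pipelines have details (such as
re-visiting SCCs after the graph changes) that it leaves out.

```cpp
#include <string>
#include <vector>

// Hypothetical model: each inner vector is one SCC, listed callees-first.
using SCCList = std::vector<std::vector<std::string>>;

// Records, per SCC, an inlining step for each function in the SCC
// followed by a simplification step for each function in the SCC.
std::vector<std::string> tracePerSCCPipeline(const SCCList &sccsBottomUp) {
  std::vector<std::string> events;
  for (const auto &scc : sccsBottomUp) {
    for (const auto &fn : scc)
      events.push_back("inline callees into " + fn);
    for (const auto &fn : scc)
      events.push_back("simplify " + fn);
  }
  return events;
}

// Position of an event in the trace, or -1 if it never occurs.
int eventIndex(const std::vector<std::string> &events,
               const std::string &event) {
  for (int i = 0; i < static_cast<int>(events.size()); ++i)
    if (events[i] == event)
      return i;
  return -1;
}
```

With the SCCs {leaf}, {helper}, {main} in that order, "simplify leaf"
appears in the trace before "inline callees into helper". In this
model, inlining and simplification interleave per SCC instead of all
inlining happening up front.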

I need to be able to do something like this:

  // #2
  for scc in scc_list {
    inline callees everywhere
    run my pass
    do other CG passes
    for function in bottom_up(scc) {
      run function passes
    }
  }

It doesn’t seem possible currently because there’s no adaptor to run a
module pass inside CGSCCPassManager. You can run a CGSCCPassManager
inside a ModulePassManager but not the other way around. It kind of
makes sense because a call graph SCC doesn’t necessarily contain all of
the Functions in a Module. On the other hand, my pass doesn’t really
care that it might not see all Functions in the Module in a single
invocation. It would eventually see them all as CGSCCPassManager
processes additional SCCs.

I suppose I could make my pass a CGSCC pass but that seems like overkill
for my purposes. Indeed, I had no need to do this with the Old Pass
Manager as inlining ran in a ModulePassManager, not a CGSCCPassManager.

It doesn’t really make sense to run a module pass multiple times because of the number of SCCs/functions. A module pass should just do everything it needs to do once and be done.

The inliner inlines calls within the function; it doesn’t look at callers of the current function. A CGSCC pass shouldn’t look at anything above the current SCC. As you mentioned below, this is what makes callers see the most optimized version of these functions when deciding to inline or not.

(Maybe a nit.) Not quite: see “shouldBeDeferred”, where the cost of inlining the current caller into its own callers is evaluated.

Arthur Eubanks <aeubanks@google.com> writes:

I've run into a sticky situation with CGSCCPassManager. I have a module
pass that needs to run after all inlining has occurred but *before*
loops have been optimized significantly. I'm not sure this is possible
with the way CGSCCPassManager is formulated, at least not without
hackery.

Could you explain what your pass does and why it needs to be where it
needs to be?

Unfortunately I'm not sure I can due to IP issues. I can probably say
that in addition to the correctness restriction about not inlining a
processed function into another processed function, it wants to see
loops as close to the original source form as possible. I can't really
ease that restriction as other clients I don't control rely on it.

I can maybe work around the correctness issue by doing a post-SCC
cleanup to fix up problem functions. But that would be almost as
complicated as the pass itself and so would be best avoided.

At least for the NPM, it was designed with potential future concurrency in
mind. Modifying the list of functions in a module, even just declarations,
could mess with that.

I can clarify that there.

That would be great, thanks!

The inliner inlines calls within the function, it doesn't look at callers
of the current function. A CGSCC pass shouldn't look at anything above the
current SCC. As you mentioned below, this is what makes callers see the
most optimized version of these functions when deciding to inline or not.

Ah, good point. It would do all inlining within an SCC, which I'd guess
is usually pretty small. If inlining happens across SCCs that could be
trouble for me. Reading the code, it's not entirely clear whether that
is possible. I guess as inlining proceeds the SCC being processed may
become small enough that it can be subsumed into some other SCC. In
fact it's rather likely in many cases.

I suppose I could make my pass a CGSCC pass but that seems like overkill
for my purposes. Indeed, I had no need to do this with the Old Pass
Manager as inlining ran in a ModulePassManager, not a CGSCCPassManager.

It doesn't really make sense to run a module pass multiple times because of
the number of SCCs/functions. A module pass should just do everything it
needs to do once and be done.

Got it, that makes perfect sense.

But I'm left with an even worse problem than I had before. :(

I may have to end up disabling various optimizations which would be
unfortunate.

                     -David

David Greene <dag@hpe.com> writes:

It doesn't really make sense to run a module pass multiple times because of
the number of SCCs/functions. A module pass should just do everything it
needs to do once and be done.

Got it, that makes perfect sense.

But I'm left with an even worse problem than I had before. :(

I may have to end up disabling various optimizations which would be
unfortunate.

Just a quick update: I was able to fix the issue by splitting my pass,
one ModulePass that runs before the SCC passes and one after. It works
quite well!
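
For anyone who hits the same problem, the shape of the split looks
roughly like the toy model below. This is only a sketch of the
ordering: ToyModule, ToyModulePipeline and buildSplitPipeline are
hypothetical names, not LLVM classes, and it leaves out what each half
of the pass actually does. In a real new-PM pipeline the middle stage
would presumably be the CGSCC pipeline wrapped with
createModuleToPostOrderCGSCCPassAdaptor inside a ModulePassManager.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a module: it only records which stage touched it.
struct ToyModule {
  std::vector<std::string> log;
};

using ToyModulePass = std::function<void(ToyModule &)>;

// Minimal module-level pipeline: runs each pass exactly once, in order.
class ToyModulePipeline {
public:
  void addPass(ToyModulePass pass) { passes_.push_back(std::move(pass)); }
  void run(ToyModule &m) const {
    for (const auto &pass : passes_)
      pass(m);
  }

private:
  std::vector<ToyModulePass> passes_;
};

// Builds: first half of the split pass -> per-SCC stage -> second half.
// sccOrder names the SCCs in the order the middle stage visits them.
ToyModulePipeline buildSplitPipeline(const std::vector<std::string> &sccOrder) {
  ToyModulePipeline pipeline;
  pipeline.addPass([](ToyModule &m) { m.log.push_back("pre-SCC half"); });
  pipeline.addPass([sccOrder](ToyModule &m) {
    for (const auto &scc : sccOrder)
      m.log.push_back("SCC passes on " + scc);
  });
  pipeline.addPass([](ToyModule &m) { m.log.push_back("post-SCC half"); });
  return pipeline;
}
```

Each half runs exactly once per module, however many SCCs the middle
stage visits, which matches the earlier advice that a module pass
should do its work once and be done.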

Thank you for your help!

                  -David