[Attributor] Questions on LLVM's Attributor

Hello everyone,

I’m a PhD student currently evaluating whether the Attributor could be useful for the research in our group.
At the moment, there are two main requirements that I’m not sure Attributor covers, so any input would be much appreciated.

The first requirement is (one level) context sensitivity.
In the 2019 Attributor introduction talk at 31:40 (2019 LLVM Developers’ Meeting: J. Doerfert “The Attributor: A Versatile Inter-procedural Fixpoint..” - YouTube), Johannes Doerfert mentioned that the Attributor is not context sensitive (yet).
This was some time ago, though, so I’m wondering if this has changed in the meantime?

We would also like to use the Attributor in our own middle-end pass to deduce facts, and adapt the IR based on those facts.
We will need to do this iteratively, though, and we don’t want to run Attributor’s analyses from scratch every time, but instead just “patch” the analysis results when the IR changes.
The second requirement is hence incrementality.
In the 2019 LLVM IPO panel at 13:30 (2019 LLVM Developers’ Meeting: “Panel: Inter-procedural Optimization (IPO) ” - YouTube), Johannes Doerfert mentioned that the Attributor is not explicitly incremental yet, but that the necessary information (i.e. the dependency graph of abstract attributes) is available.

I did some initial testing with my own abstract attribute that implements a basic forward data flow analysis, i.e. the state of the abstract attribute at each IR position depends on its predecessor(s).
I reused the dependency graph that’s built automatically (through the Deps member of AbstractAttributes).
Whenever a new instruction is added to the IR, I remove its successor and that successor’s (transitive) dependencies from the AAMap, so that they are recomputed when the Attributor is run the next time.
That seems to work for my own attribute, but I’m wondering if anyone has any tips to add support for incrementality to the existing attributes, or other approaches to add incrementality to my own attribute?
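
For illustration, the invalidation step described above can be sketched with a toy model. This is plain standard C++ and does not use the real Attributor classes: `DepGraph`, `collectInvalidated`, and `demoInvalidation` are hypothetical names, and the map only stands in for the `Deps` edges ("update these when I change").

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for the dependency graph; NOT the LLVM API.
// Deps[A] lists the attributes that must be re-evaluated when A changes.
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Collect Root plus everything reachable from it through Deps edges.
std::set<std::string> collectInvalidated(const DepGraph &Deps,
                                         const std::string &Root) {
  std::set<std::string> Seen;
  std::vector<std::string> Worklist{Root};
  while (!Worklist.empty()) {
    std::string Cur = Worklist.back();
    Worklist.pop_back();
    if (!Seen.insert(Cur).second)
      continue; // already visited
    auto It = Deps.find(Cur);
    if (It == Deps.end())
      continue; // no recorded dependents
    for (const std::string &Next : It->second)
      Worklist.push_back(Next);
  }
  return Seen;
}

// Hypothetical example: a chain I1 -> I2 -> I3 plus an unrelated J1.
// A new instruction is inserted before I2, so I2 is the invalidation root.
std::set<std::string> demoInvalidation() {
  DepGraph Deps = {{"I1", {"I2"}}, {"I2", {"I3"}}, {"J1", {}}};
  assert(Deps.count("I2") == 1);
  return collectInvalidated(Deps, "I2");
}
```

In this sketch only `I2` and `I3` would be dropped and recomputed; `I1` and `J1` keep their state. The result is only as complete as the recorded edges.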

Thanks in advance!

FYI, Attributor removes unneeded dependencies from Deps in subsequent iterations so the information from the instructions that you changed might have been used by some AbstractAttributes in the past iterations.
So just deleting transitive deps is probably not enough. You might get some ‘interesting’ bugs in your code.

Thomas Faingnaert via LLVM Discussion Forums <llvm@discoursemail.com> wrote on Fri, 18 Mar 2022 at 13:29:

We have support for call site contexts now. @kuterd implemented that a while ago. I don’t think we use it a lot but it is there. See call base context in the IRPosition.

That is what it is meant for. OpenMP-Opt and AMD GPU backend use it in their passes as a utility too.

So, this is trickier. I would generally recommend against “patching” unless you really know what you are doing. Touching the IR can easily confuse and mess things up, from the Attributor internals to AA results. However, it depends heavily on the kind of modifications and the AAs you are using. I’m happy to have a more concrete conversation (also offline) if that helps. With the little information here it is hard to say whether it will work or whether you should consider “virtual transformations” instead. For example, AAValueSimplify is used to replace a value with something else even though we don’t modify the IR; users of the common routines will still get the “simplified” value. Similarly, if you iterate over all uses of a value, the Attributor callback will “look through memory”, if possible, and give you the uses of a load which might read the value you are looking at. Again, whether it is reasonable to keep the Attributor instance around and reuse it really depends on the modifications.

In either case, I’d be happy to help you use it, feel free to reach out :slight_smile:

~ Johannes

FYI, Attributor removes unneeded dependencies from Deps in subsequent iterations so the information from the instructions that you changed might have been used by some AbstractAttributes in the past iterations.

Thanks for pointing that out! I suppose the files I tested with were simple enough to only require 1 fixpoint iteration, so I didn’t notice this.
Could you clarify what you mean by “unneeded” dependencies and/or point me to the location in the source code where this removal happens?

Hi Johannes,

Thanks for your swift response!
I’ve discussed this with my group over the course of last week.

We will transform the IR in three ways.
First and second, our pass will add new local variables (i.e. allocas in the entry block) and global variables.
Third, our pass will insert call instructions, which I suppose will be the biggest problem, since it changes the callgraph and hence the interprocedural control flow.
The arguments of these calls might be these newly inserted local and global variables, but also existing values in the IR.

Since we’re writing a tool which transforms code provided by the user, we don’t really have any control over the functions we might call.
We hence can’t make any assumptions about these functions, though in general we can state that they will be non-pure, i.e. they will have side effects on e.g. existing or newly added global variables.
It would be possible though to restrict the user in some sense, e.g. requiring that these functions are norecurse, if that is useful.

As for which AAs are interesting, there are two categories.
The most important category consists of the attributes that give information on the possible values of IR values, and how they relate to the possible values of other IR values, i.e. AAReturnedValues, AANonNull, AANoAlias, AAValueSimplify, AAValueConstantRange, and AAPotentialValues.
Another category of interest to us are the attributes related to intra- and interprocedural control flow, i.e. AAReachability, AACallEdges, and AAFunctionReachability.
Also, AFAICT, the update methods for all AAs only visit live instructions, which means all AAs implicitly depend on liveness analysis, so I suppose we’ll need incrementality for AAIsDead as well.

I understand that incrementality is a big ask, and we’ve also been thinking about just dropping it as an absolute requirement, so it’s OK if the Attributor is not (yet) meant for such incremental analysis.
It’s very likely that we’ll still use the Attributor anyway, as it provides a nice framework for fixpoint dataflow analyses, and also contains AAs for the most important static analyses.

Out of interest, do you have an idea on the current state of incrementality in the LLVM middle-end outside of the Attributor?
Are all analyses just rerun every time the IR changes, or are the results patched in some way?

Kind regards,
Thomas

Yeah, these are almost all the things one can use :wink:

So, adding new calls to arbitrary functions, as opposed to some well-specified tracing or logging methods, will not interact well with the AAs. I’m not sure how it could, short of keeping track of all dependences (also from the past). Even then, we fix AAs once we know they can’t change anymore, and that won’t work with this scheme either.

I think you have 2 options:

  1. Introduce sufficient uncertainty such that your changes are always “a subset” of the uncertainty. This again depends on what you want to do exactly but a call to an unknown function at the function entry could allow you to model potential future call edges. However, this trick will not work for everything and even stop working for call edges as we learn that we know all callers of internal functions (w/o their address taken).
  2. Run the Attributor multiple times. I know that is not what you wanted, but without trying it you should not assume the result will be unacceptable. The Attributor is not “fast” or “tuned” right now, so even if it’s too slow, we could probably get much better with little tuning effort.
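
As a rough illustration of option 2, the loop below re-runs an analysis from scratch after every transformation round. It is a generic sketch in standard C++, not Attributor code: `analyzeAndTransform`, `demoRounds`, and the integer “IR” are hypothetical placeholders for a real module, a fresh Attributor run, and a real transformation.

```cpp
#include <cassert>

// Re-analyze from scratch before each transformation round.
// Returns the number of rounds executed (at most MaxRounds).
template <typename IRT, typename AnalyzeFn, typename TransformFn>
unsigned analyzeAndTransform(IRT &IR, AnalyzeFn Analyze, TransformFn Transform,
                             unsigned MaxRounds) {
  unsigned Rounds = 0;
  while (Rounds < MaxRounds) {
    ++Rounds;
    auto Facts = Analyze(IR);  // fresh analysis, nothing is patched
    if (!Transform(IR, Facts)) // transformation made no change: done
      break;
  }
  return Rounds;
}

// Hypothetical demo: the "IR" is a counter, the "analysis" reports whether
// it is below 3, and the "transformation" increments it while that holds.
unsigned demoRounds() {
  int IR = 0;
  return analyzeAndTransform(
      IR, [](const int &V) { return V < 3; },
      [](int &V, bool Below) {
        if (!Below)
          return false;
        ++V;
        return true;
      },
      10);
}
```

The cost is one full analysis per round, so whether this is acceptable has to be measured on real inputs.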

That said, you should also ensure that your modifications do not violate existing annotations. E.g., if you analyze the code, put attributes in certain places, and then modify the code, the attributes had better still hold (or be removed).

I fear the devil is in the detail of your transformations and it’s hard to answer here given the limited information. I’m still open to a call if that would help you.

Nothing I know of really does this, or at least not in the way I assume you are looking for. Most passes are relatively straightforward and narrow: 1) analyze, 2) act on it.
The Attributor is no different, except that its steps are broader.

FWIW, adding stuff like this is generally OK.

As mentioned before, it depends on the call target and arguments. Passing the new stuff and some values to a “logger” function with known effects that does not otherwise impact the program semantics: doable. Passing some values to some user function or similar: very hard.

When you call getOrCreateAAFor in the updateImpl method of an attribute, a dependency gets recorded in the Deps vector of the attribute you asked for.

recordDependence
https://llvm.org/doxygen/Attributor_8cpp_source.html#l02734

We reset the Deps vector of AAs each iteration in line 1671.
https://llvm.org/doxygen/Attributor_8cpp_source.html#l01671

So an AA only appears in the Deps vector of another attribute if a dependency got recorded in the last iteration.

  1. updateImpl function is ‘free’ to do what it wants, so it might not record a dependency again.
  2. We don’t record dependencies to attributes that reached a fixpoint state.

So what I mean is that an AA might have affected the states of other AAs that are not in its Deps vector.

I suppose you could modify the Attributor to also record past deps and reset those as well, but it’s very tricky to get this right, and there could be other problems.

To be clear, I mean the last iteration of the “other attribute”. Say X depends on Y in one iteration: Y changes → X is updated. If X does not record a dependency on Y during that update, then the next time Y changes, X is not updated.
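
The X/Y scenario can be reproduced in a toy model, again in plain C++ and not with the real Attributor classes. `ToyDeps`, `Result`, and `demoStaleDependency` are hypothetical names, and clearing all edges at once is a simplification of the per-AA reset. The `History` map is an assumption showing what “also recording past deps” could look like.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

using DepSet = std::map<std::string, std::set<std::string>>;

// Toy model, NOT the LLVM API: an edge From -> To means
// "To must be updated when From changes".
struct ToyDeps {
  DepSet Current; // edges recorded in the latest iteration only
  DepSet History; // hypothetical extension: every edge ever recorded

  void record(const std::string &From, const std::string &To) {
    Current[From].insert(To);
    History[From].insert(To);
  }
  // Simplification: drop all current edges at the start of an iteration.
  void startIteration() { Current.clear(); }
};

struct Result {
  bool SeenInCurrent; // is X still listed as a dependent of Y?
  bool SeenInHistory; // was X ever a dependent of Y?
};

Result demoStaleDependency() {
  ToyDeps D;
  // Iteration 1: X queries Y, so the edge Y -> X is recorded.
  D.startIteration();
  D.record("Y", "X");
  // Iteration 2: X is updated but does not query Y again.
  D.startIteration();
  return {D.Current["Y"].count("X") != 0, D.History["Y"].count("X") != 0};
}
```

After the second iteration, a walk over the current edges starting at Y no longer reaches X, even though X’s state was derived from Y earlier; only the history still shows the link.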