Tracing Value's Transitive Users across Multiple Function Calls in Module Pass

Hello everyone,

I am currently writing a function to trace the usage of a given Value across multiple function calls within an LLVM Module Pass. The function’s goal is to identify all functions that take the given Value as a parameter.

In the process, I have to consider users of the Value, including those who transitively use the original Value as a parameter. I’ve implemented this part successfully with a recursive function. However, I’ve hit a roadblock:

When I identify a function that accepts the Value as a parameter, I also need to examine what happens to that Value inside this new function. Specifically, I need to check if this parameter Value is subsequently passed on to other functions (potentially even external functions).

My first approach was to call the recursive function, using the parameter Value, but this isn’t working as expected. When I pass a Value to a function, the passed Value is different within the receiving function’s scope, correct? If so, how do I access this ‘new’ parameter Value inside the function to continue my recursion?

Here’s an equivalent C code example that my current implementation fails to handle correctly:

// External function
void external_function(int *ptr);

// Non-external function that calls an external function
void non_external_function(int *ptr) {
    external_function(ptr);
}

// Main function
void main() {
    int *ptr = get_pointer_from_not_elsewhere();
    non_external_function(ptr);
    return 0;
}

My current implementation only traces the Value usage within the main function. It identifies non_external_function as the only user of ptr. But it doesn’t trace into non_external_function to identify that ptr is indirectly passed to external_function.

Could anyone advise how I can extend my search to cover this case?

Thanks for your help!

Not sure what your end goal is, but the Attributor provides a lot of tracking support.
While Attributor::checkForAllUses will stop at call sites, that is “intended”. Most AA’s do the call site to Argument translation by themselves. That said, you could add a flag to look through call sites (=follow to the argument), as we look through return instructions (=follow to the call sites) and stores (=follow to the reloads), if we can.

1 Like

The goal is to identify whether a value is later used as parameter by a function that was only declared, but not defined, in the current module. To do so, we need to iterate through all uses, and if they’re functions that take the value as parameters we need to iterate through all uses of that parameter in that function.

Btw, this is roughly what my code for this looks like at the moment, and I think I need to add something in the deepest if statement where I left the comment and TODO:

// Tracks all visited values, and skips recursive call if we have already
// visited a certain value before (to avoid endless recursion).
void findAllFunctionsWhereValueIsPassedAsArgumentHelper(Value &V, SmallVector<Function*, 8> &FunctionCalls, std::set<Value*> &VisitedValues) {
  auto [_, ValueWasInserted] = VisitedValues.insert(&V);
  if (!ValueWasInserted) {
    // We found a value we have seen before, so were are in some sort of loop.
    // Therefore, we have already checked this value and all of its users.
    return;
  }

  for (User *U : V.users()) {
    if (CallInst *CI = dyn_cast<CallInst>(U)) {
      for (Value *Arg : CI->args()) {
        if (Arg == &V) {
          FunctionCalls.emplace_back(CI->getCalledFunction());

          // Therefore, once we see that a value is passed to a non-external function,
          // we still need to check if the value has other uses in in that function.
          // TODO
        }
      }
    }
    // Also add all other functions that use the function's return value
    findAllFunctionsWhereValueIsPassedAsArgumentHelper(*U, FunctionCalls, VisitedValues);
  }
}

// Find all function calls that use the specified value as an argument.
// Once we found a function, we also have to recursively find all of
// the functions that use that function('s return value).
void findAllFunctionsWhereValueIsPassedAsArgument(Value &V, SmallVector<Function*, 8> &FunctionCalls) {
  std::set<Value*> VisitedValues;
  return findAllFunctionsWhereValueIsPassedAsArgumentHelper(V, FunctionCalls, VisitedValues);
}

I’d recommend to write a simple AA for this:

struct AAEscapesIntoDeclared : public StateWrapper<BooleanState> {
...
};

stuct AAEscapesIntoDeclaredFunction : public AAEscapesIntoDeclared {
  ...
  ChangeStatus updateImpl(Attributor &A) {
auto HandleUse =  [&](Use &U, bool &Follow) {
    if (auto *CB = dyn_cast<CallBase>(U.getUser()) {
     // check use position, assume argument for now:
    auto *CSArgAA =  A.getAAFor(..., IRPosition::callsite_argument(*CB, ArgNo), ...);
    return CSArgAA && CSArgAA->isValidState();
   } 
   ...
  }
;

  if (! A.checkForAllUses(..., HandleUse,...))
    return indicatePessimisticFixpoint();
  return ChangeStatus::UNCHANGED;
  }
};
stuct AAEscapesIntoDeclaredCallSiteArgument : public AAEscapesIntoDeclared {
  ChangeStatus updateImpl(Attributor &A) {
   auto *Arg = getAssociatedArgument();
    if (!Arg)
     return indicatePessimisticFixpoint();
    auto *ArgAA =  A.getAAFor(..., IRPosition::callsite_argument(*Arg), ...);
    if (!ArgAA || !ArgAA->isValidState())
      return indicatePessimisticFixpoint();
    return ChangeStatus::UNCHANGED;
  }
};

stuct AAEscapesIntoDeclaredArgument : public AAEscapesIntoDeclaredFloat {
  void initialize(Attributor &A) {
    if (getAssociatedFunction()->isDeclaration())
      indicatePessimisticFixpoint();
  }
};

This will handle lot’s of corner cases one might easily miss. And if you allow the other AAs, it will also try to look through memory, and such.

1 Like

Thanks for the help! Though I must admit that I am not entirely sure how the code you sent fits in with mine. Do you mean I should replace my code with what you wrote? Also, what do you mean by AA? (I’m still fairly new to LLVM)

See here:

1 Like

My recommendation was to replace it. Though, you can also read the helpers (e.g., checkForAllUses) and adopt that code.

AAs are “AbstractAttributes”, which are “objects” deriving information through the Attributor (fix point iteration framework). There are YT talks on the topic, an intro and a tutorial. You can find most AAs, including “simple examples” in llvm/lib/Transforms/IPO/AttributorAttributes.cpp, maybe look for AAAddressSpace.

Depending on where you are heading with this, my proposal might be overkill. You can also just look at the call site to argument translation (e.g., IRPosition::getAssociatedArgument) and copy out the relevant parts:

    auto *Callee = dyn_cast_if_present<Function>(CB.getCalledOperand()); <- (Val) => Function * : Function * Val:
    if (Callee && Callee->arg_size() > unsigned(ArgNo))
      return Callee->getArg(ArgNo); 

You miss out on other features (e.g., callbacks and such) but it allows you to continue with your code.

1 Like

Do I understand the second, simpler approach with the getCalledOperand() correctly: When checking whether a value is a parameter of a function, we keep track of the index when iterating over args(). Then, we can call Callee->getArg(i), and that is technically a different Value (it’s the parameter to another function), but inherently is our value we found. So then we can just recursively call the findAllFunctions... function on that?

You ask the call base which argument operand number the Use of the value in question has. You also need to check it’s an argument use, etc. Once you have that number, you ask the call base for the callee (function), check if it has at least as many arguments, and pick the llvm::Argument from the function with the right number (see getAssociatedArgument, though, probably ignore the callback parts). This can fail if the call is varargs, if it is casted and the call site doesn’t really match, if the callee is not a function but a pointer, etc. (AACallEdges would deal with that for you). Then recurse on the Argument, which is a llvm::Value.

1 Like