Finding all values derived from a function argument

Hello!

I am trying to retrieve all instructions from a function's body that
are dependent on a specific argument. The strategy I am currently
using for that is to follow all uses of the argument and record them.
Also, whenever I encounter a store of a dependent value, I try to find
loads in the function that are dependent on that store and resume
use-tracking from there. For this purpose I am using the
MemoryDependenceAnalysis pass, but to be honest, I think I am not
using it correctly or not in the most efficient way.

Is such functionality - finding dependent instructions - present in
LLVM? If not, is there a more efficient way to determine dependent
instructions than like in the code below? In particular, I'm wondering
if it is possible to find all loads that depend on a particular store
more efficiently?

Thank you,
Stephan

    llvm::DenseSet<llvm::Instruction *> findDependents(llvm::Argument
*arg, llvm::Function *func)
    {
      std::vector<llvm::Instruction *> workList;
      for(llvm::Argument::use_iterator I = arg->use_begin(), E =
arg->use_end(); I != E; ++I)
      {
        if(llvm::Instruction *user = llvm::dyn_cast<llvm::Instruction>(*I))
          workList.push_back(user);
      }

      llvm::FunctionPassManager fpm(func->getParent());
      llvm::MemoryDependenceAnalysis *mda = new
llvm::MemoryDependenceAnalysis();
      fpm.add(mda);
      fpm.run(*func);

      // find all loads in the current function
      std::list<llvm::LoadInst *> loads;
      for(llvm::Function::iterator I = func->begin(), E = func->end();
I != E; ++I)
      {
        for(llvm::BasicBlock::iterator II = I->begin(), EE = I->end();
II != EE; ++II)
        {
          if(llvm::LoadInst *load = llvm::dyn_cast<llvm::LoadInst>(II))
            loads.push_back(load);
        }
      }

      llvm::DenseSet<llvm::Instruction *> deps;
      for(;:wink:
      {
        llvm::DenseSet<llvm::Instruction *> stores;
        while(!workList.empty())
        {
          llvm::Instruction *I = workList.back();
          workList.pop_back();

          if(deps.count(I) != 0)
            continue;

          deps.insert(I);

          if(llvm::StoreInst *store = llvm::dyn_cast<llvm::StoreInst>(I))
            stores.insert(I);

          for(llvm::Value::use_iterator II = I->use_begin(), EE =
I->use_end(); II != EE; ++II)
          {
            if(llvm::Instruction *user = llvm::dyn_cast<llvm::Instruction>(*II))
              workList.push_back(user);
          }
        }

        if(stores.size() > 0)
        {
          // find loads that depend on our stores, TODO: this seems fishy!
          for(std::list<llvm::LoadInst *>::iterator I = loads.begin(),
E = loads.end(); I != E; )
          {
            bool pick = false;
#if 1
            llvm::SmallVector<llvm::NonLocalDepResult, 16> results;
            mda->getNonLocalPointerDependency((*I)->getPointerOperand(),
true, (*I)->getParent(), results);
            for(llvm::SmallVector<llvm::NonLocalDepResult,
16>::iterator II = results.begin(), EE = results.end(); !pick && II !=
EE; ++II)
            {
              const llvm::MemDepResult &result = II->getResult();
              if(llvm::Instruction *dep = result.getInst())
              {
                if(llvm::StoreInst *store =
llvm::dyn_cast<llvm::StoreInst>(dep))
                  pick |= (stores.count(store) != 0);
              }
            }
#else
            llvm::MemDepResult result = mda->getDependency(*I);
            if(llvm::Instruction *dep = result.getInst())
            {
              if(llvm::StoreInst *store = llvm::dyn_cast<llvm::StoreInst>(dep))
                pick = (stores.count(store) != 0);
            }
#endif
            if(pick)
            {
              workList.push_back(*I);
              I = loads.erase(I);
            }
            else
              ++I;
          }

          stores.clear();
        }
        else
          break;
      }
      return deps;
    }

Hi Stephan,

I am trying to retrieve all instructions from a function's body that
are dependent on a specific argument. The strategy I am currently
using for that is to follow all uses of the argument and record them.
Also, whenever I encounter a store of a dependent value, I try to find
loads in the function that are dependent on that store and resume
use-tracking from there. For this purpose I am using the
MemoryDependenceAnalysis pass, but to be honest, I think I am not
using it correctly or not in the most efficient way.

Is such functionality - finding dependent instructions - present in
LLVM? If not, is there a more efficient way to determine dependent
instructions than like in the code below? In particular, I'm wondering
if it is possible to find all loads that depend on a particular store
more efficiently?

I don't think memory dependence analysis is appropriate for this. The function
llvm::PointerMayBeCaptured in CaptureTracking.cpp does something like what you
are trying to do. It used to track stores and loads from them, but this was
removed because it was not helpful in practice. The way it used to track
stores and loads was simple. First, it only tried to analysis stores to local
variables (AllocaInst). Stores to globals are harder since there are more ways
that other functions can get at the stored value. If it detected a store to an
AllocaInst, then it walked all uses of the AllocaInst and analysed those. A
load from the AllocaInst was considered to return the stored value (even though
more detailed analysis might have shown that it was loading some other value).
A store of the address of the local variable would also be tracked etc. This
was a flow insensitive analysis. However as I mentioned, it was all removed
because in practice either alloca's did not exist (because other passes had
promoted everything to registers) or their address was passed to some other
function where there was no way of knowing what the function did to it. It is
easy to construct toy examples where tracking AllocaInst's helped, but in real
code it did not.

Ciao,

Duncan.

I'm working on something (gcalloc removal) that would actually be
easier if I could easily answer the question "what loads from memory
*might* return a given pointer value or a pointer derived from it?" I
don't care if it's guaranteed to return the value, but it would be
helpful to be able to prove (for instance) that no pointer to object A
could possibly outlive some event or some other object. Even better
if I could do it interprocedurally.

Thanks Duncan,

I have changed my approach slightly: I am assuming all instructions to
be dependent on function arguments and then try to prove that some of
them are not. This gives me reliable, conservative results, which is
enough for my use case.

Still, I thought that something like this wasn't as unsual and
supported by built-in functionality in LLVM.

Best regards,
Stephan