Getting information from MemRegion

Hi,

I was using this code to get the variable name from a MemRegion, but it fails when complicated preprocessor macros are involved:

    const VarRegion *lockVR = dyn_cast_or_null<VarRegion>(lockR);
    const VarDecl *lockVD;
    if (lockVR)
      lockVD = lockVR->getDecl();
    else
      lockVD = NULL;

Then I would use lockVD->getName(), but for complicated macros, lockVR is null.

What I want to accomplish here is to just print the variable name as it is in the code (not how the macro expands the variable), so we could have:

#define lock(_mtx) pthread_mutex_lock((_mtx))

lock(foo);

At this point, I want to print 'foo'. It seems I'm missing something, because this doesn't work in all cases.

Any ideas?

Thanks,

Unfortunately, the analyzer runs well after the preprocessor does, so while you could probably reconstruct some of this information, it's probably more trouble than it's worth. You also won't necessarily be able to handle cases like

lock(*foo);

which the static analyzer can track (at least in theory). On the other hand, the simple example you gave *should* work anyway; unless lockR is already NULL, it should be a VarRegion whether or not it passes through a macro.

As a debugging aid, you can use MemRegion::dump() to print a description of the region. If you're printing for diagnostics, there are a few examples of "SummarizeRegion" methods among the checkers (which should probably be unified in the general CheckerHelpers.h file at some point). And to actually track data, you probably want to look at the existing PthreadLockChecker, which just associates 'locked' or 'unlocked' states with regions. (Think about 'locks[3]' or 'criticalData->lock', which would have other types of MemRegions.)

But I guess I don't know your real goals. Still, I hope that was helpful.

Jordy
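
To illustrate the remark about 'locks[3]' and 'criticalData->lock', here is a minimal, self-contained sketch of why a VarRegion-only lookup comes back empty for those cases. ToyRegion, RegionKind, varNameOrEmpty and checkExamples are hypothetical stand-ins invented for this example; they are not the analyzer's real classes, and the real region hierarchy is richer than this.

```cpp
#include <cassert>
#include <string>

// Toy stand-ins for the analyzer's region classes; NOT the real Clang API.
enum class RegionKind { Var, Field, Element, Symbolic };

struct ToyRegion {
  RegionKind kind;
  std::string name;        // variable or field name; empty otherwise
  const ToyRegion *super;  // enclosing region, or nullptr at the root
};

// Mirrors the shape of dyn_cast_or_null<VarRegion>(R) followed by getName():
// a name comes back only when the region itself is a variable.
inline std::string varNameOrEmpty(const ToyRegion *r) {
  if (!r || r->kind != RegionKind::Var)
    return "";
  return r->name;
}

// Small walkthrough of the three cases discussed in the thread.
inline void checkExamples() {
  // lock(&foo): the region is the variable itself.
  ToyRegion foo{RegionKind::Var, "foo", nullptr};
  assert(varNameOrEmpty(&foo) == "foo");

  // locks[3]: an element region layered on the variable 'locks'.
  ToyRegion locks{RegionKind::Var, "locks", nullptr};
  ToyRegion third{RegionKind::Element, "", &locks};
  assert(varNameOrEmpty(&third).empty());

  // criticalData->lock: a field region layered on a symbolic pointee.
  ToyRegion pointee{RegionKind::Symbolic, "", nullptr};
  ToyRegion lockField{RegionKind::Field, "lock", &pointee};
  assert(varNameOrEmpty(&lockField).empty());
}
```

Calling checkExamples() from a main() exercises all three cases. In this toy model only the first lookup yields a name, which would be consistent with lockVR being null whenever the lock is anything other than a plain variable.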

> Unfortunately, the analyzer runs well after the preprocessor does, so while you could probably reconstruct some of this information, it's probably more trouble than it's worth. You also won't necessarily be able to handle cases like
>
> lock(*foo);
>
> which the static analyzer can track (at least in theory). On the other hand, the simple example you gave *should* work anyway; unless lockR is already NULL, it should be a VarRegion whether or not it passes through a macro.

Yes, I guess my example was bad.

> As a debugging aid, you can use MemRegion::dump() to print a description of the region. If you're printing for diagnostics, there are a few examples of "SummarizeRegion" methods among the checkers (which should probably be unified in the general CheckerHelpers.h file at some point). And to actually track data, you probably want to look at the existing PthreadLockChecker, which just associates 'locked' or 'unlocked' states with regions. (Think about 'locks[3]' or 'criticalData->lock', which would have other types of MemRegions.)

I'm actually working on the PthreadLockChecker. I need to write code to print useful warning messages. I'll look at SummarizeRegion.

Thanks,

Ok, I've written a test case. With my checker changes, I get:

% cat test.c
#include <pthread.h>

struct foo {
  pthread_mutex_t bar;
};

#define LOCK(foo) pthread_mutex_lock(&foo->bar)

int
main()
{
  struct foo *bar;

  bar = malloc(sizeof(struct foo));
  LOCK(bar);
  LOCK(bar);
}

% /usr/local/bin/clang --analyze -Xclang -analyzer-checker=unix.experimental.PthreadLock test.c
test.c:16:2: warning: Double locking 'element{SymRegion{conj_$1{void *}},0 S32b,struct foo}->bar'
        LOCK(bar);
        ^~~~~~~~~
test.c:7:19: note: instantiated from:
#define LOCK(foo) pthread_mutex_lock(&foo->bar)
                  ^ ~~~~~~~~~
1 warning generated.

My SummarizeRegion is like this:

bool PthreadLockChecker::SummarizeRegion(llvm::raw_ostream& os,
                                     const MemRegion *MR) const {
  const TypedRegion *TR = dyn_cast<TypedRegion>(MR);
  if (!TR)
    return false;

  switch (TR->getKind()) {
  case MemRegion::FunctionTextRegionKind: {
    const FunctionDecl *FD = cast<FunctionTextRegion>(TR)->getDecl();
    if (FD)
      // Print the function's name; streaming the bare pointer prints an address.
      os << "'" << FD->getNameAsString() << "'";
    return true;
  }
  case MemRegion::CXXThisRegionKind:
  case MemRegion::CXXTempObjectRegionKind:
  case MemRegion::FieldRegionKind:
  case MemRegion::VarRegionKind:
  case MemRegion::ObjCIvarRegionKind:
    os << "'" << TR->getString() << "'";
    return true;
  default:
    TR->dumpToStream(os);
    return false;
  }
}

Any ideas on how to improve the output here?

Regards,
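
One possible direction, sketched here on a toy model rather than on the real analyzer classes: walk the super-region chain, build a source-like spelling when every link has a name, and fall back to a generic message when it does not. Region, Kind, describe and doubleLockMessage are hypothetical names invented for this sketch, the fallback wording is only a placeholder, and treating "element 0 over a symbolic base" as a skippable wrapper is an assumption about what the dump above represents.

```cpp
#include <cassert>
#include <string>

// Toy model of a region chain; NOT the real Clang classes.
enum class Kind { Var, Field, Element, Symbolic };

struct Region {
  Kind kind;
  std::string name;     // Var/Field: declared name; otherwise unused
  long index;           // Element: array index; otherwise unused
  const Region *super;  // enclosing region, nullptr at the root
};

// Appends a source-like description of 'r' to 'out' by walking the
// super-region chain. Returns false if some link has no usable name;
// 'out' may then hold a partial result and should be discarded.
inline bool describe(const Region *r, std::string &out) {
  if (!r)
    return false;
  switch (r->kind) {
  case Kind::Var:
    out += r->name;
    return true;
  case Kind::Symbolic:
    return false;  // a bare symbol has no spelling in the source
  case Kind::Element:
    // Assumption: index 0 over a symbolic base is just a typed view of
    // the pointee, so skip the wrapper instead of printing "[0]".
    if (r->index == 0 && r->super && r->super->kind == Kind::Symbolic)
      return describe(r->super, out);
    if (!describe(r->super, out))
      return false;
    out += "[" + std::to_string(r->index) + "]";
    return true;
  case Kind::Field:
    if (!describe(r->super, out))
      return false;
    out += "." + r->name;
    return true;
  }
  return false;
}

// Chooses between a named and a generic warning text.
inline std::string doubleLockMessage(const Region *r) {
  std::string name;
  if (describe(r, name))
    return "Double locking '" + name + "'";
  return "This lock has already been acquired";  // placeholder wording
}
```

In this model, a region chain like the one in the test case above (a field over element 0 of a symbolic region) never reaches a variable, so the name 'bar' cannot be recovered from the region alone and the generic text is used. Printing the argument as written in the source would presumably have to start from the call expression instead of the region.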