checkBind: distinguish between MemRegionVal/ElementRegion

Hello,

In a checker, I want to distinguish between these kinds of statements:

  1. p = "/tmp/file"; // p is declared as char *p;

  2. *(p+3) = 'S'; // I’m aware this is undefined behavior

If I’m not mistaken, the best place to detect this is in a check::Bind callback:

void checkBind(SVal Loc, SVal Val, const Stmt *S, CheckerContext &C) const;

In the first case above, the location Loc is a MemRegionVal, and in the 2nd, it is an ElementRegion.

  1. To test if Loc is a MemRegionVal I use the following, but there’s something wrong I can’t figure out (it doesn’t compile), and I’m stuck (as far as I know, MemRegionVal is a subclass of SVal):

if (clang::isa<loc::MemRegionVal>(Loc)) …

  2. ElementRegion doesn’t belong to the SVal class hierarchy. How can I know if Loc is an ElementRegion?

Thanks a lot.

Hello,

After a bit of research, I’ve concluded the following:

Bearing in mind this signature:

void checkBind(SVal VLoc, SVal Val, const Stmt *S, CheckerContext &C) const;

Questions #1 and #2 from my previous message can be solved as follows:

Optional<Loc> L = VLoc.getAs<Loc>();
if (L) {
  // VLoc is of type Loc
  if (Optional<loc::MemRegionVal> MR = L->getAs<loc::MemRegionVal>()) {
    // VLoc is a MemRegionVal
    const MemRegion *R = MR->getRegion()->StripCasts();
    // Are we dealing with an ElementRegion?
    if (const ElementRegion *ER = dyn_cast<ElementRegion>(R)) {
      // The region is an ElementRegion
    } else {
      // The region is NOT an ElementRegion
    }
  } else {
    // VLoc is NOT a MemRegionVal
  }
}

I’m a bit confused about SVals and SymbolRefs. I’ve read this thread (http://lists.cs.uiuc.edu/pipermail/cfe-dev/2012-December/026641.html) which is quite clarifying. I wonder what the difference is between the following 3 statements:

SymbolRef sym = L->getAsLocSymbol();

SymbolRef sym = VLoc.getAsLocSymbol();

SymbolRef sym = VLoc.getAsSymbol();

My goal is to be able to detect something like the following. For that, I’d like the checker to store (as the checker’s ProgramState info) a variable name (symbol?). In the example below, when statement #1 is processed, the checker should store “p” as a means of tracking “p” for future references. Thus, the checker would be able to signal a warning when statement #2 is processed:

  1. p = "/tmp/file"; // p is declared as char *p;
  2. *(p+3) = 'S'; // I’m aware this is undefined behavior

How can I get the symbolic value of variable “p”? I think a SymbolRef would be best (depending on variable scope, I might come across another “p”), but I’m unsure.

Any hint or suggestion would be highly appreciated.

Many thanks.

Hello, Aitor. I’m afraid you’re still getting SVals, symbols, and MemRegions somewhat mixed up. They are not interchangeable. Have you watched our presentation on writing a checker yet? (Linked here: http://clang-analyzer.llvm.org/checker_dev_manual.html) I’m sorry it’s not really incorporated into the rest of the Checker Development Manual, but the video is probably still the clearest introduction to analyzer core concepts that we have.

The compile error is a bit mundane: you can only use isa<> on pointers and references, but SVals are passed around by value. As you discovered, you can use getAs<> instead.
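As a rough illustration of that getAs<> pattern on a by-value type, here is a small standalone sketch. It is a toy model only: ToyVal, ToyLoc, ToyNonLoc, ToyKind and describe() are hypothetical names invented for this example, and none of this is clang's actual SVal implementation. It uses std::optional (C++17) where the analyzer code above uses Optional.

```cpp
#include <optional>
#include <string>

// Toy stand-in for a by-value "SVal"-like type; NOT clang's real class.
enum class ToyKind { Unknown, Loc, NonLoc };

struct ToyVal {
  ToyKind kind = ToyKind::Unknown;
  std::string payload;

  // Returns a copy viewed as T when the kind tag matches, otherwise nullopt.
  template <typename T>
  std::optional<T> getAs() const {
    if (kind != T::Kind)
      return std::nullopt;
    return T{payload};
  }
};

// Hypothetical "location" view of a ToyVal.
struct ToyLoc {
  static constexpr ToyKind Kind = ToyKind::Loc;
  std::string region;
};

// Hypothetical "non-location" view of a ToyVal.
struct ToyNonLoc {
  static constexpr ToyKind Kind = ToyKind::NonLoc;
  std::string value;
};

// Branch on the kind of value, the way a checker callback might.
inline std::string describe(const ToyVal &v) {
  if (std::optional<ToyLoc> l = v.getAs<ToyLoc>())
    return "loc:" + l->region;
  if (std::optional<ToyNonLoc> n = v.getAs<ToyNonLoc>())
    return "nonloc:" + n->value;
  return "unknown";
}
```

Because the value is copied around rather than referenced through a base-class pointer, the conversion hands back an optional copy instead of a cast pointer; an empty optional plays the role that a null result plays for dyn_cast.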

SymbolRef sym = L->getAsLocSymbol();

SymbolRef sym = VLoc.getAsLocSymbol();

SymbolRef sym = VLoc.getAsSymbol();

The second one will handle everything the first one handles, as well as locations cast to integer values (like “(intptr_t)&x”). The last one will also give you back symbols for non-location values. But not all memory regions are based on symbols (a local variable does not need a symbol), and of course not all symbolic values are memory regions (the result of random() is an integer).

That’s not really a good question. What you really want to know is if a given location is within a constant string region. That’s a much simpler question.

// Does this value represent the address of a region?
const MemRegion *MR = V.getAsRegion();
if (!MR)
  return;

bool isString = isa<StringRegion>(MR->getBaseRegion());

This isn’t going to cover all use cases, but it does cover this one much more nicely than trying to pattern-match on ElementRegion.
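To see why asking about the base region is simpler than pattern-matching on ElementRegion, the idea can be sketched in isolation. The following is a toy model under simplifying assumptions: the Toy* classes and isWithinStringLiteral() are hypothetical names for this example, a region with no enclosing region is treated as a base region, and dynamic_cast stands in for isa<>. It is not the analyzer's MemRegion hierarchy.

```cpp
#include <string>
#include <utility>

// Toy region hierarchy; the names mirror the analyzer's concepts, but this is
// NOT clang's MemRegion implementation.
struct ToyRegion {
  const ToyRegion *super;  // enclosing region, or nullptr for a base region
  explicit ToyRegion(const ToyRegion *s = nullptr) : super(s) {}
  virtual ~ToyRegion() = default;

  // Walk outward through sub-regions until reaching the outermost region.
  const ToyRegion *getBaseRegion() const {
    const ToyRegion *r = this;
    while (r->super)
      r = r->super;
    return r;
  }
};

// Storage of a string literal such as "/tmp/file".
struct ToyStringRegion : ToyRegion {
  std::string literal;
  explicit ToyStringRegion(std::string s) : literal(std::move(s)) {}
};

// Storage of a named variable.
struct ToyVarRegion : ToyRegion {
  std::string name;
  explicit ToyVarRegion(std::string n) : name(std::move(n)) {}
};

// One element inside another region, e.g. the target of *(p+3).
struct ToyElementRegion : ToyRegion {
  int index;
  ToyElementRegion(const ToyRegion *base, int i) : ToyRegion(base), index(i) {}
};

// True when the location lies anywhere inside a string literal's storage.
inline bool isWithinStringLiteral(const ToyRegion *r) {
  return r != nullptr &&
         dynamic_cast<const ToyStringRegion *>(r->getBaseRegion()) != nullptr;
}
```

In this model, an element at index 3 of the "/tmp/file" literal and the literal itself both answer true, while an element of an ordinary variable answers false, so the check does not care how many layers of sub-region sit between the location and the string.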

(Finally, of course, -fconst-strings is a much safer way to handle this kind of issue, but that doesn’t help if you have an existing codebase.)

Jordan

Watching the presentation you mention was one of the first things I did some time ago. I think I’ll watch it again as a refresher.

I’ll also reread the docs with your comments in mind.

Thanks for the clarifications, Jordan.