Inlined & multi-threaded CFG generation?

Hi all,

1. I am trying to use clang-cc's CFG generation. I found that it
generates a CFG for each function separately, but does not combine
them into a complete CFG for the whole program.

Does clang-cc have such a function?

2. clang-cc does not recognize a program written in C plus a
multi-threading library (e.g. pthread). That is, the generated CFG does
not reflect the parallel character of the program, so it does not seem
very useful for analyzing a multi-threaded program.

Is it a problem?

ZhuNan
2009.9.11

> 1. I am trying to use clang-cc's CFG generation. I found that it
> generates a CFG for each function separately, but does not combine
> them into a complete CFG for the whole program.
>
> Does clang-cc have such a function?

Hi Zhunan,

There are no plans to create a whole-program CFG, or at least to convert the current CFG implementation into one, although something of that nature could certainly be built on top of it. The infrastructure for whole-program analysis will be implemented by layering other supporting data structures on top of CFGs. Please see my other email for more explanation behind this decision.
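
As a rough illustration of what "layering on top of CFGs" could look like, here is a minimal sketch in C. All types and function names (FunctionCFG, CallGraphNode, add_call_edge, total_blocks) are hypothetical and are not Clang's actual data structures. The only point is that the per-function CFGs stay untouched while a separate structure links them.

```c
#include <stddef.h>

/* Hypothetical, simplified types: NOT Clang's actual data structures. */
enum { MAX_CALLEES = 8 };

/* One CFG per function; the basic blocks themselves are elided here. */
struct FunctionCFG {
    const char *name;
    int num_blocks;
};

/* A call-graph node refers to a CFG but never modifies it. */
struct CallGraphNode {
    const struct FunctionCFG *cfg;
    const struct CallGraphNode *callees[MAX_CALLEES];
    size_t num_callees;
};

/* Record that `caller` may call `callee`.
   Returns 0 on success, -1 if the caller's edge list is full. */
int add_call_edge(struct CallGraphNode *caller,
                  const struct CallGraphNode *callee)
{
    if (caller->num_callees >= MAX_CALLEES)
        return -1;
    caller->callees[caller->num_callees++] = callee;
    return 0;
}

/* Sum the basic blocks of `root` and everything it calls.
   Assumes an acyclic call graph (no recursion) to keep the sketch
   short; a callee reached along two paths is counted twice. */
int total_blocks(const struct CallGraphNode *root)
{
    int total = root->cfg->num_blocks;
    for (size_t i = 0; i < root->num_callees; i++)
        total += total_blocks(root->callees[i]);
    return total;
}
```

A whole-program view is then obtained by walking the call graph and consulting each function's CFG as needed, rather than by merging all CFGs into one.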

> 2. clang-cc does not recognize a program written in C plus a
> multi-threading library (e.g. pthread). That is, the generated CFG does
> not reflect the parallel character of the program, so it does not seem
> very useful for analyzing a multi-threaded program.

I'm not entirely certain what you mean. Could you clarify a little more?

Cheers,
Ted

Hi Ted,

About the second question, let me explain it with an example.

Suppose I have the following source:
...
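for (i = 0; i < 2; i++)
  {
    /* Give each thread its own attributes and scheduling priority. */
    pthread_attr_init(&tattr[i]);
    pthread_attr_getschedparam(&tattr[i], &param[i]);
    param[i].sched_priority = i + 20;
    pthread_attr_setschedparam(&tattr[i], &param[i]);

    /* Start thread_func in a new thread. Note that every thread
       receives the address of the same loop variable i. */
    pthread_create(&hThread[i], &tattr[i], thread_func, (void *)&i);
  }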

...

The current CFG generation strategy treats this as ordinary function
calls in a loop. In fact, when we deal with POSIX threads or another
multi-threaded programming environment (e.g. OpenMP), we would like to
see a CFG that describes two threads executing concurrently rather than
a normal function call, so that the CFG shows the character of a
"parallel program".

Thank you for considering it!

ZhuNan
2009.9.12

On Fri, 2009-09-11 at 09:11 -0700, Ted Kremenek wrote:

Hi Zhunan,

I think these kinds of parallel execution semantics are certainly worth reasoning about in the static analysis engine, but I don't think they belong in the CFG itself. The CFG is meant to capture the basic control flow as exhibited by statements and expressions. Other forms of control flow, such as threads, blocks (a form of closures for C), etc., should be layered on top of the CFG abstraction.

For example, GRExprEngine, the path-sensitive analysis engine, generates a graph that represents the possible paths through a function. This graph layers itself on top of the CFG abstraction, but has far more precise control-flow information than what the CFG itself provides. This layering is done by using the "ProgramPoint" class to represent a location within a function, which can reference an individual statement or a basic block in the CFG. We are working on adding support for having that graph trace paths across function call boundaries, which means we would reference locations in multiple CFGs, and we would possibly add call-and-return edges to capture context-sensitivity of function calls.
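
To make the idea of a location that refers into a CFG more concrete, here is a minimal sketch in C. It is only an illustration under simplifying assumptions: the struct names (Stmt, BasicBlock, CFG, ProgramPoint) and the helper functions are hypothetical and do not mirror the real C++ classes in Clang. A point records which CFG it belongs to, so points from several functions' CFGs can coexist in one path graph.

```c
#include <stddef.h>

/* Hypothetical, simplified types: NOT Clang's actual classes. */
struct Stmt { const char *text; };

struct BasicBlock {
    int id;
    const struct Stmt *stmts;
    size_t num_stmts;
};

struct CFG {
    const char *function_name;
    const struct BasicBlock *blocks;
    size_t num_blocks;
};

enum PointKind { POINT_BLOCK_ENTRANCE, POINT_STATEMENT };

/* A location used by a path graph. It refers into a CFG but does not
   own or change it. */
struct ProgramPoint {
    enum PointKind kind;
    const struct CFG *cfg;           /* which function's CFG */
    const struct BasicBlock *block;  /* which block in that CFG */
    size_t stmt_index;               /* used only for POINT_STATEMENT */
};

/* Build a point at the entrance of block `block_index`. */
struct ProgramPoint point_at_block(const struct CFG *cfg, size_t block_index)
{
    struct ProgramPoint p = { POINT_BLOCK_ENTRANCE, cfg,
                              &cfg->blocks[block_index], 0 };
    return p;
}

/* Build a point at statement `stmt_index` of block `block_index`. */
struct ProgramPoint point_at_stmt(const struct CFG *cfg, size_t block_index,
                                  size_t stmt_index)
{
    struct ProgramPoint p = { POINT_STATEMENT, cfg,
                              &cfg->blocks[block_index], stmt_index };
    return p;
}

/* Return the statement a point refers to, or NULL for a block entrance
   or an out-of-range statement index. */
const struct Stmt *point_stmt(struct ProgramPoint p)
{
    if (p.kind != POINT_STATEMENT || p.stmt_index >= p.block->num_stmts)
        return NULL;
    return &p.block->stmts[p.stmt_index];
}
```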

My thought is that modeling parallel programming semantics, e.g., where control-flow is asynchronous, really needs to be modeled with a similar layered abstraction. For example, since a context-switch between threads can happen at essentially any time, how would one model such "control-flow" between threads? Similarly, since threads can really be executing concurrently, how would the CFG capture parallel execution at multiple locations in the code? For me the answer is that the CFG is not designed for this purpose, and it serves as a foundational building block for these other kinds of reasoning.
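
A back-of-the-envelope sketch of why per-statement context switches do not fit into CFG edges: if one thread has m atomic steps left and another has n, and a switch may occur between any two steps, the number of possible execution orders is the binomial coefficient C(m+n, m). The function below (count_interleavings is a hypothetical name, and "atomic step" is a simplifying assumption) computes that count.

```c
#include <stdint.h>

/* Number of distinct orders in which two threads with m and n atomic
   steps can be interleaved, assuming a context switch is possible
   between any two steps. This equals C(m+n, m).
   The result overflows uint64_t for large inputs; this sketch does
   not guard against that. */
uint64_t count_interleavings(unsigned m, unsigned n)
{
    uint64_t result = 1;
    for (unsigned k = 1; k <= n; k++) {
        /* After this step, result == C(m+k, k); the division is exact. */
        result = result * (m + k) / k;
    }
    return result;
}
```

Two threads of ten steps each already admit 184756 orders, which suggests why this kind of reasoning is better handled by an analysis layered over the CFGs than by extra edges inside them.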

So in summary, I think concurrency and interprocedural analysis are things that can be modeled in the analyzer, but that modeling shouldn't be done in the base CFG data structure itself. The CFG serves as a primitive upon which such analyses can be layered.

Ted