Hi again,
Here it is my current implementation of DefUse chain analysis. I adjust the code to be compliant with some of the clang rules (I had some problem in switching from std::map to llvm::DenseMap and I will try to better investigate this issue in the future).
So as this is not a patch, but a complete new feature I don't know where it should be placed in the source tree (clang/Analysis seems fine), so I will let u decide. The analysis is composed by 2 files DefUse.h and DefUse.cpp which represent respectively the interface and the implementation of the DefUse analysis. The usage of DefUse is very similar to CFG. You build the all DefUse chains for a specific piece of code (usually a function body) and then u start querying the object.
#include "DefUse.h"
using namespace defuse;
clang::FunctionDecl* fd = ...
DefUse* DU = DefUse::BuildDefUseChains(fd->getBody(), astCtx);
now if we have a reference to a variable inside the function body (usually a DeclRefExpr) we can query the DefUse object to know if the VarRef is a "definition" or an "use":
DeclRefExpr* DR = ...
DU->isDef(DR);
DU->isUse(DR);
Now if our variable is a "definition" for example we can iterate among all the uses referred to the definition in the following way:
for(DefUse::uses_iterator I = DU->uses_begin(DR), E = DU->uses_end(); I != E, I++){
DeclRefExpr* U = *I;
...
}
If the variable is a use, we can iterate among all the definitions:
for(DefUse::defs_iterator I = DU->defs_begin(DR), E = DU->defs_end(); I != E, I++){
...
}
here the problem is that a definition can be either a DeclRefExpr in situations like:
-> a = s; // this is a definition for var. a
or a VarDecl in situations like:
-> int a; // this is a definition for variable a
For these reasons I have introduced a wrapper class called DefUseNode (the name can be changed) that can contains VarDecls or DeclRefExprs.
And that's the basic usage of DefUse.
A more advanced use is also possible, I don't want to go into the details of the algorithm used to compute the defuse chains, but you can imagine it divided into 2 steps. In the first steps all the variable references are collected and marked as definitions or uses depending on the semantic of the instruction and then, in the second step, this informations are elaborated in order to build the graph.
We can discuss about it but It comes out that marking the use of a variable can depends from the context, and the semantics can be different. For example if we are calling a function and the variable is passed as a reference or pointer we have to assume that this is a definition for the variable, but if we know the semantics of the function and we are sure the variable is not modified in the function body we can "risk" a little and mark the variable as an use. For this reason different semantics can be defined by the user in order to cover specific cases.
This is possible by extending the DefUseBasicHelper. The basic helper already implements the default semantics (a very conservative one) but it exposes some virtual methods that can be extended by the user who wants to implement a different behavior. The helper should be passed to the BuildDefUseChain method in this way:
DefUseCustomHelper helper;
DefUse* DU = DefUse::BuildDefUseChains(fd->getBody(), astCtx, &helper);
...
At last, I added a class to test the algorithm DefUseChainTest. I produces a lot of output (badly formatted) but you can easily see how ( and if
) the whole analysis works just by:
defuse::DefUseChainTest Consumer(llvm::outs());
...
ParseAST(PP, Consumer, *ContextOwner, false);
The test class is also meant to be an example of how to use the DefUse.
N.B.: As I already stated in my previous emails it is important (when and if the patch will be committed in the CLANG source tree) the institution where I am working, the "University of Innsbruck", and if possible my name (Simone Pellegrini) appear in the CREDITS.TXT file as contributors for this type of analysis.
have fun with DefUses, cheers Simone
P.S.: Unfortunately next week I will be on a business trip so not reachable. I will reply and address your comments/feedbacks as soon as I am back in the office.
Hi Simone,
Sorry for the delay in replying! The overall interface for this class looks pretty good. I have a bunch of comments with regards to the implementation and the coding style that we should iterate on before committing back to the Clang tree. My comments are inline with the code below. Feel free to reply whenever you have time.
As for where this code should live, I suggest:
include/clang/Analysis/Analyses/DefUse.h
lib/Analysis/DefUse.cpp
My comments reflect a first pass over this patch. There is a lot going on here, so I have focused only on a subset of the details.
//===-- clang/Analysis/DefUse.h - DefUse analysis -------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file contains the declaration of DefUse analysis headers
//
//===----------------------------------------------------------------------===//
#ifndef DEF_USE_HPP_
#define DEF_USE_HPP_
Please use "LLVM_CLANG_DEFUSE_H" to avoid any conflict with other defuse implementations.
#include "llvm/ADT/DenseMap.h"
#include "clang/AST/ASTConsumer.h"
#include "clang/AST/ParentMap.h"
#include "clang/AST/StmtVisitor.h"
Looks fine.
#include <exception>
#include <stdexcept>
We don't use C++ exceptions at all in LLVM, largely for performance reasons.
namespace clang {
// forward definitions
class CFG;
class CFGBlock;
namespace defuse {
//===------------------------- Exceptions -------------------------===//
struct NotDefExcpetion {
Stmt const* stmt;
NotDefExcpetion(Stmt const* stmt_): stmt(stmt_){}
};
struct NotUseExcpetion {
DeclRefExpr const* expr;
NotUseExcpetion(DeclRefExpr const* expr_): expr(expr_){}
};
These will need to be replaced with an alternate mechanism.
//===------------------------- DefUseNode -------------------------===//
// Keeps the information for a particular 'definition' or 'use' of a variable
// The structure is needed because a defuse chain can contains variable declarations
// as well as variable reference. As a DeclStmt can contains several declarations
// to address a particular variable we need to store its VarDecl or DeclRefExp
struct DefUseNode{
enum NodeKind{ VarDecl, VarRef };
enum UseKind{ Use, Def, UseDef };
DefUseNode(clang::VarDecl const* decl): var_decl(decl), kind(VarDecl), usage(Def){}
DefUseNode(DeclRefExpr const* ref, UseKind u = Use): var_ref(ref), kind(VarRef), usage(u){}
clang::VarDecl const* getDecl() const;
NodeKind const& getKind()const { return kind; }
UseKind const& getUse() const { return usage; }
clang::VarDecl const* getVarDecl() const {
assert(kind==VarDecl); return var_decl;
}
DeclRefExpr const* getVarRef() const {
assert(kind==VarRef); return var_ref;
}
bool operator==(DefUseNode const& n) const;
private:
// a def-use node can be either a VarDecl or a DeclRefExpr
union{
clang::VarDecl const* var_decl;
DeclRefExpr const* var_ref;
};
NodeKind kind;
UseKind usage;
};
Since this 'struct' has private data, please make it a class instead:
http://llvm.org/docs/CodingStandards.html#ci_class_struct
//===------------------------- Typedefs -------------------------===//
class DefUseBlock;
class VarDeclMap;
typedef std::vector<DefUseBlock> DefUseData;
typedef llvm::DenseMap<Stmt const*, unsigned> VarRefBlockMap;
typedef std::vector<DefUseNode> DefUseVect;
typedef std::vector<DefUseNode const*> VarRefsVect;
class DefUseBasicHelper;
//===------------------------- DefUse -------------------------===//
class DefUse{
ASTContext const& ctx;
DefUseData const* analysis_data;
VarDeclMap const* decl_map;
VarRefBlockMap const* block_map;
unsigned const num_cfg_blocks;
DefUse(ASTContext const& ctx_, DefUseData const* analysis_data_, VarDeclMap const* decl_map_,
VarRefBlockMap const* block_map_, unsigned num_blocks):
ctx(ctx_), analysis_data(analysis_data_), decl_map(decl_map_), block_map(block_map_),
num_cfg_blocks(num_blocks){ }
bool isDef(DefUseNode const& n) const;
struct iterator_impl{
To make MSVC happy, this should be a 'class'.
DefUse const* du;
DefUseNode const* node;
struct iter{
int block_id;
DefUseVect::const_iterator block_it;
iter(int blk_id): block_id(blk_id){}
};
iterator_impl(): du(NULL){ }
iterator_impl(DefUse const* du_): du(du_){ }
iterator_impl& operator++() { return inc(false); }
iterator_impl& operator++(int) { return inc(false); }
virtual iterator_impl& inc(bool) = 0;
};
public:
class uses_iterator: public std::iterator<std::input_iterator_tag, DeclRefExpr, std::ptrdiff_t, DeclRefExpr const*>,
public iterator_impl{
iter iter_ptr;
bool inDefBlock;
uses_iterator(): iterator_impl(), iter_ptr(-1), inDefBlock(true){}
uses_iterator(DefUse const* du, DefUseNode const& n);
uses_iterator& inc(bool first);
friend class DefUse;
public:
virtual bool operator!=(uses_iterator const& iter);
Does this need to be virtual?
virtual DeclRefExpr const* operator*();
Does this need to be virtual?
};
class defs_iterator: public std::iterator<std::input_iterator_tag, DefUseNode, std::ptrdiff_t, DefUseNode const*>,
public iterator_impl{
struct iter_: public iter{
VarRefsVect::const_iterator reaches_it;
iter_(int blk_id): iter(blk_id){}
} iter_ptr;
bool blockDef;
defs_iterator(): iterator_impl(), iter_ptr(-1), blockDef(false) {}
defs_iterator(DefUse const* du, DeclRefExpr const& n);
defs_iterator& inc(bool first);
friend class DefUse;
public:
bool operator!=(defs_iterator const& iter);
DefUseNode const* operator*();
};
// USES //
defs_iterator defs_begin(DeclRefExpr const* var) const;
defs_iterator defs_end() const;
Please add doxygen comments for these methods.
// DEFS //
uses_iterator uses_begin(DeclRefExpr const* var) const;
uses_iterator uses_begin(VarDecl const* var) const;
uses_iterator uses_end() const;
Please add doxygen comments for these methods.
bool isUse(DeclRefExpr const* var) const;
bool isDef(DeclRefExpr const* var) const { return isDef(DefUseNode(var, DefUseNode::Def)); }
bool isDef(VarDecl const* var) const { return isDef(DefUseNode(var)); }
~DefUse();
static DefUse* BuildDefUseChains(Stmt* body, ASTContext *ctx, DefUseBasicHelper* helper = 0,
CFG* cfg = 0, ParentMap* pm = 0, bool verbose = false,
llvm::raw_ostream& out = llvm::outs());
};
I think having a "defuse" namespace is fine (for all the special typedefs, etc.), but it should only contain the implementation details for DefUse. The DefUse class should be in the clang namespace.
//===------------------------- DefUseBasicHelper -------------------------===//
It seems like DefUseBasicHelper allows a bunch of hooks for subclasses via the 'HandleXXX' methods, but this level of hooks is not going to fair well from a performance standpoint. I believe that one base class can implement most of the HandleXXX logic without making those methods virtual (and thus dispense with the HandleXXX methods entirely and just have VisitXXX methods). I think most subclasses will need very few hooks, if any at all. I think it's worth identifying the one or two methods that need to be virtual and then leave the rest of them as non-virtual methods.
Also, if this truly is the base class, how about calling it 'DefUseHelper'? That seems more succinct and clear from an interface perspective.
class DefUseBasicHelper: public StmtVisitor<DefUseBasicHelper>{
class DefUseBasicHelperImpl;
DefUseBasicHelperImpl* pimpl;
void InitializeValues(DefUseData* data, VarRefBlockMap* bm, VarDeclMap* dm);
friend class DefUse;
public:
DefUseNode::UseKind current_use;
DefUseBasicHelper();
virtual void HandleDeclRefExpr(DeclRefExpr *DR); // remember to call the super class implementation of the method
void HandleDeclStmt(DeclStmt *DS);
virtual void HandleBinaryOperator(BinaryOperator* B);
virtual void HandleConditionalOperator(ConditionalOperator* C);
virtual void HandleCallExpr(CallExpr* C);
virtual void HandleUnaryOperator(UnaryOperator* U);
virtual void HandleArraySubscriptExpr(ArraySubscriptExpr* AS);
// ...
Is there a need for the '// ..' line?
void VisitDeclRefExpr(DeclRefExpr *DR){ return HandleDeclRefExpr(DR); }
void VisitDeclStmt(DeclStmt *DS){ return HandleDeclStmt(DS); }
void VisitBinaryOperator(BinaryOperator* B){ return HandleBinaryOperator(B); }
void VisitConditionalOperator(ConditionalOperator* C){ return HandleConditionalOperator(C); }
void VisitCallExpr(CallExpr* C){ return HandleCallExpr(C); }
void VisitUnaryOperator(UnaryOperator* U){ return HandleUnaryOperator(U); }
void VisitArraySubscriptExpr(ArraySubscriptExpr* AS){ return HandleArraySubscriptExpr(AS); }
// ...
Is there a need for the '// ..' line?
void VisitCFGBlock(clang::CFGBlock const& blk, int root);
The 'root' argument is not necessary and should be removed (see comment below).
void VisitStmt(Stmt* S);
virtual ~DefUseBasicHelper();
};
//===------------------------- DefUseChainTest -------------------------===//
class DefUseChainTest: public StmtVisitor<DefUseChainTest>, public ASTConsumer {
llvm::raw_ostream& out;
ASTContext* ctx;
DefUse const* du;
public:
DefUseChainTest(llvm::raw_ostream& out_): out(out_), ctx(NULL), du(NULL) { }
void Initialize(ASTContext& context) { ctx = &context; }
void HandleTopLevelDecl(DeclGroupRef declGroupRef);
// void HandleTranslationUnit(clang::ASTContext& context);
void VisitDeclRefExpr(DeclRefExpr *DR);
void VisitDeclStmt(DeclStmt *DS);
void VisitStmt(Stmt* S);
};
void PrintVarDefs(DefUse const* DU, DeclRefExpr const* DR, ASTContext& ctx, llvm::raw_ostream& out);
void PrintVarUses(DefUse const* DU, DeclRefExpr const* DR, ASTContext& ctx, llvm::raw_ostream& out);
void PrintVarUses(DefUse const* DU, VarDecl const* VD, ASTContext& ctx, llvm::raw_ostream& out);
} // end namespace defuse
The DefUseChainTest can be made entirely private in DefUse.cpp, and have a static function create it (and return an ASTConsumer). There is no need to have it in the public API interface.
} // end namespace clang
#endif /* DEF_USE_HPP_ */
Some compilers complain about the extra tokens after #endif. Please remove this comment.
//===-- clang/Analysis/DefUse.cpp - DefUse analysis -------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file contains the implementation of DefUse analysis
//
//===----------------------------------------------------------------------===//
#define __STDC_LIMIT_MACROS
#define __STDC_CONSTANT_MACROS
What are these two defines used for?
#include "DefUse.h"
#include "clang/Basic/SourceManager.h"
#include "clang/Analysis/CFG.h"
#include <llvm/Support/raw_ostream.h>
#include <set>
#include <map>
Eventually the code should probably use DenseSet and DenseMap for performance reasons, as std::map and std::set are very slow and make heavy use of malloc(). I understand that a significant amount of code may need to be tweaked, but the performance gain can easily be 2-3x.
using namespace clang;
using namespace defuse;
// ASSERT MACRO
#define ASSERT(COND, MSG) do{ \
std::string str; \
llvm::raw_string_ostream ss(str); \
ss << MSG; \
assert((COND) && ss.str().c_str()); }while(false)
I don't think this will do what you intend. If I did:
ASSERT(false, 10)
I'd see:
"false && ss.str().c_str()"
as the failed assertion condition in the terminal output.
One option to get better error reporting is to use the facilities in 'clang/Basic/PrettyStackTrace.h', which are used in various places in Clang. These can be used to print the location within the analyzed source that the assertion fails. In general we don't like specialized error reporting machinery for just one file (as in the case of 'ASSERT'), but like adding to the general error reporting machinery that all of LLVM/Clang can use.
I also don't think this ASSERTION macro adds much value and causes some of the client code below to be structured in some slightly suboptimal ways.
// utility functions
unsigned Line(SourceLocation const& l, clang::SourceManager const& sm) {
PresumedLoc pl = sm.getPresumedLoc(l);
return pl.getLine();
}
unsigned Column(SourceLocation const& l, SourceManager const& sm) {
PresumedLoc pl = sm.getPresumedLoc(l);
return pl.getColumn();
}
Instead of adding these functions, we can add 'getPresumedLoc' to FullSourceLoc and then use the 'getColumn()' and 'getLine()' methods from that class.
std::string PrintClangStmt(Stmt const* stmt, ASTContext const& ctx){
std::string str;
llvm::raw_string_ostream pp(str);
stmt->printPretty(pp, 0, PrintingPolicy(ctx.getLangOptions()), 0);
return pp.str();
}
I don't think this function is needed. If you need it, please declare it static to keep it private to this file.
//===------------------------- DefUseNode -------------------------===//
inline bool defuse::DefUseNode::operator==(DefUseNode const& n) const{
if(n.usage != usage || n.kind != kind)
return false;
Please add a space between 'if' and the '('. This is the coding style used throughout LLVM.
Also, please use 2 space indentation like the rest of LLVM and Clang.
if(kind == VarDecl)
return var_decl == n.var_decl;
else
return var_ref == n.var_ref;
return false;
}
The 'return false' is dead code. Please remove it.
This can be more succinctly written as:
return kind == VarDecl ? var_decl == n.var_decl : var_Ref == n.var_ref;
Alternatively:
if (kind == VarDecl)
return var_decl == n.var_decl;
return var_ref == n.var_ref;
inline clang::VarDecl const* DefUseNode::getDecl() const{
if(kind == VarRef)
return cast<clang::VarDecl>(var_ref->getDecl());
return cast<clang::VarDecl>(var_decl);
}
Same as the previous comment.
if (kind == VarRef)
return cast<VarDecl>(var_ref->getDecl());
return cast<VarDecl>(var_decl);
or:
return kind == VarRef ? cast<VarDecl>(var_ref->getDecl()) : cast<VarDecl>(var_decl);
typedef std::set<DefUseNode const*> VarRefsSet;
typedef std::set<VarDecl const*> VarDeclsSet;
typedef std::map<VarDecl const*, DefUseVect> DefUseChain;
Please use llvm::DenseSet and llvm::DenseMap. You'll need to use pointers to DefUseVects instead (and then destroy them), but the speed improvement will be significant.
//===------------------------- DefUseBlock -------------------------===//
struct defuse::DefUseBlock: public DefUseChain{
This should be 'class' instead of 'struct'. You declared it earlier as a 'class'. While gcc will eat this, MSVC will not. Also, since this isn't POD, it should be a class anyway.
VarDeclsSet uses;
VarRefsVect defsout;
VarDeclsSet killed;
VarRefsVect reaches;
class not_killed_set_iterator: public std::iterator<std::input_iterator_tag, DefUseNode, std::ptrdiff_t, DefUseNode const*>{
DefUseBlock& block;
VarRefsVect::const_iterator ptr_it;
not_killed_set_iterator& inc(bool first){
if(ptr_it != block.reaches.end() && !first)
ptr_it++;
Please use '++ptr_it' instead of 'ptr_it++'. While it makes no difference in this case, general speaking it performs better and there is no reason to make any assumptions about the underlying data structure and how the iterators are implemented.
while(ptr_it != block.reaches.end() &&
(block.killed.find((*ptr_it)->getDecl()) != block.killed.end() &&
std::find(block.defsout.begin(), block.defsout.end(), *ptr_it) == block.defsout.end()))
ptr_it++;
return *this;
}
public:
not_killed_set_iterator(DefUseBlock& block_, VarRefsVect::const_iterator ptr_it_): block(block_), ptr_it(ptr_it_){ inc(true); }
not_killed_set_iterator& operator++() { return inc(false); }
not_killed_set_iterator& operator++(int) { return inc(false); }
Do we even need an operator++(int)? By disallowing it people won't accidentally use it. This also isn't the standard semantics of this operator, so by implementing it this way clients will silently be doing the wrong thing instead of getting a compilation error.
bool operator!=(not_killed_set_iterator const& iter) { return !((&iter.block) == &(block) && iter.ptr_it == ptr_it); }
This method can (and should) be declared const.
const DefUseNode* const& operator*() const { return *ptr_it; }
};
not_killed_set_iterator begin_not_killed_set(){ return DefUseBlock::not_killed_set_iterator(*this, reaches.begin()); }
not_killed_set_iterator end_not_killed_set(){ return DefUseBlock::not_killed_set_iterator(*this, reaches.end()); }
Please add a space between () and {
};
//===------------------------- VarDeclMap -------------------------===//
struct defuse::VarDeclMap: public StmtVisitor<VarDeclMap>, public llvm::DenseMap<VarDecl const*, DeclStmt const*> {
LLVM has an 80 column coding convention:
http://llvm.org/docs/CodingStandards.html#scf_codewidth
Please wrap lines that exceed this length.
VarDeclMap(Stmt const* body){ VisitStmt(const_cast<Stmt*>(body)); }
void VisitDeclStmt(DeclStmt *DS){
for(DeclStmt::decl_iterator it = DS->decl_begin(); it != DS->decl_end(); it++){
if((*it)->getKind() != Decl::Var)
continue;
Use dyn_cast instead of getKind.
insert(std::make_pair(cast<VarDecl>(*it), DS));
}
}
void VisitStmt(Stmt* S){
for(Stmt::child_iterator I = S->child_begin(), E = S->child_end(); I != E; ++I)
if(*I) Visit(*I);
}
};
Please make this is a class, as the earlier declaration was also a class.
Please also add a space between 'for' and '('. This is the LLVM coding convention.
//===------------------------- DefUseBasicHelper -------------------------===//
struct defuse::DefUseBasicHelper::DefUseBasicHelperImpl{
DefUseData* data;
VarRefBlockMap* bm;
VarDeclMap* dm;
unsigned blockID;
DefUseBasicHelperImpl(): data(NULL), bm(NULL), dm(NULL){}
};
defuse::DefUseBasicHelper::DefUseBasicHelper(): pimpl(new DefUseBasicHelperImpl), current_use(DefUseNode::Use){}
defuse::DefUseBasicHelper::~DefUseBasicHelper(){ delete pimpl; }
void defuse::DefUseBasicHelper::InitializeValues(DefUseData* data, VarRefBlockMap* bm, VarDeclMap* dm){
pimpl->data = data; pimpl->bm = bm; pimpl->dm = dm;
}
void defuse::DefUseBasicHelper::HandleDeclRefExpr(DeclRefExpr *DR){
VarRefBlockMap& bm = *pimpl->bm;
unsigned blockID = pimpl->blockID;
DefUseBlock& block_data = (*pimpl->data)[blockID];
if(DR->getDecl()->getKind() == Decl::Var || DR->getDecl()->getKind() == Decl::ParmVar){
The condition is overly verbose. We also discourage the use of 'getKind' unless you are using it in a switch statement.
Suggestion:
Decl *D = DR->getDecl();
if (isa<VarDecl>(D)) {
...
}
isa<> uses the value returned from VarDecl::classof() to determine if 'D' is a VarDecl.
VarDecl* VD = cast<VarDecl>(DR->getDecl());
Actually, instead of using isa, here is a better style:
if (VarDecl *VD = dyn_cast<VarDecl>(DR->getDecl()) {
...
}
The dyn_cast combines 'isa' with 'cast', and returns a NULL pointer if isa would return false. This is the encouraged style.
// map this variable to the current block
bm[DR] = blockID;
if(block_data.find(VD) == block_data.end()){
block_data.insert(std::make_pair(VD, DefUseVect()));
block_data.uses.insert(VD); // this variable is used in the current block and it has
// no prior definition within the block
If you end up using a DenseMap, this code can be made faster by avoiding the double lookup (the first for block_data.find() and the second for the insertion):
DefUseBlock *&B = block_data[VD];
if (!B) {
B = new DefUseBlock();
B->users.insert(VD);
}
Using a DenseMap (which involves changing the data value to a pointer), the first line involves a quick hashtable lookup and creates a NULL pointer entry if the key could not be found. The code that follows conjures up the new DefUseBlock without having to do a second search of the map.
}
if(current_use == DefUseNode::UseDef){ // in case of compound operators
block_data[VD].push_back(DefUseNode(DR, DefUseNode::Use));
block_data[VD].push_back(DefUseNode(DR, DefUseNode::Def));
These lines cause two identical (and redundant) lookups to block_data. Consider:
DefUseBlock &B = block_data[VD];
B.push_back(...);
}else
block_data[VD].push_back(DefUseNode(DR, current_use));
}
}
Please add spaces between 'if' and '(' and 'else' and '{'.
void defuse::DefUseBasicHelper::HandleDeclStmt(DeclStmt *DS){
for(DeclStmt::decl_iterator I = DS->decl_begin(), E = DS->decl_end(); I != E; I++){
if((*I)->getKind() != Decl::Var)
continue;
VarRefBlockMap& bm = *pimpl->bm;
VarDeclMap& dm = *pimpl->dm;
unsigned blockID = pimpl->blockID;
DefUseBlock& block_data = (*pimpl->data)[blockID];
VarDecl* VD = cast<VarDecl>(*I);
if(block_data.find(VD) != block_data.end())
ASSERT(true, "Variable '" << VD->getNameAsString() << "' redefined in the same block");
Why not do:
assert(block_data.find(VD) != block_data.end() && "Variable redefined in the same block");
It is far more succinct, doesn't add more code in a Release build (as ASSERT does), and if we really care what VD is we can run this code in the debugger (which we would likely need to do anyway). It also reduces the number of lookups in the map in a Release build from two to one.
void defuse::DefUseBasicHelper::HandleBinaryOperator(BinaryOperator* B){
DefUseNode::UseKind backup = current_use; // backup the usage
if(B->isAssignmentOp()){
current_use = DefUseNode::Use;
Visit(B->getRHS());
current_use = DefUseNode::Def;
if(B->isCompoundAssignmentOp())
current_use = DefUseNode::UseDef;
Visit(B->getLHS());
}else{
Visit(B->getLHS());
current_use = DefUseNode::Use;
Visit(B->getRHS());
}
current_use = backup; // write back the usage to the current usage
}
void defuse::DefUseBasicHelper::HandleConditionalOperator(ConditionalOperator* C){ }
void defuse::DefUseBasicHelper::HandleCallExpr(CallExpr* C) {
DefUseNode::UseKind backup = current_use; // backup the usage
for(CallExpr::arg_iterator I = C->arg_begin(), E = C->arg_end(); I != E; ++I){
if((*I)->getType()->isPointerType() || (*I)->getType()->isReferenceType())
current_use = DefUseNode::Def;
Visit(*I);
current_use = backup;
}
}
void defuse::DefUseBasicHelper::HandleUnaryOperator(UnaryOperator* U){
DefUseNode::UseKind backup = current_use; // backup the usage
switch(U->getOpcode()){
case UnaryOperator::PostInc:
case UnaryOperator::PostDec:
case UnaryOperator::PreInc:
case UnaryOperator::PreDec:
current_use = DefUseNode::UseDef;
break;
case UnaryOperator::Plus:
case UnaryOperator::Minus:
case UnaryOperator::Not:
case UnaryOperator::LNot:
current_use = DefUseNode::Use;
break;
case UnaryOperator::AddrOf:
case UnaryOperator::Deref:
// use the current_use
break;
default:
// DEBUG("Operator " << UnaryOperator::getOpcodeStr(U->getOpcode()) << " not supported in def-use analysis");
break;
}
Visit(U->getSubExpr());
current_use = backup; // write back the usage to the current usage
}
void defuse::DefUseBasicHelper::HandleArraySubscriptExpr(ArraySubscriptExpr* AS){
DefUseNode::UseKind backup = current_use; // backup the usage
Visit(AS->getBase());
current_use = DefUseNode::Use;
Visit(AS->getIdx());
current_use = backup; // write back the usage to the current usage
}
void defuse::DefUseBasicHelper::VisitCFGBlock(CFGBlock const& blk, int root){
pimpl->blockID = blk.getBlockID();
for(CFGBlock::const_iterator I = blk.begin(), E = blk.end(); I != E; I++)
Visit(*I);
unsigned blockID = pimpl->blockID;
DefUseBlock& block_data = (*pimpl->data)[blockID];
// Post processing
// build up the uses(b), defsout(b) and killed(b)
// for each variable
for(DefUseBlock::iterator I = block_data.begin(), E = block_data.end(); I != E; ++I){
// in case of references to ParmVar what we do is insert the variable definition in the first block of the CFG
DefUseBlock& root_data = (*pimpl->data)[root];
if(I->first->getKind() == Decl::ParmVar &&
std::find(root_data[I->first].begin(), root_data[I->first].end(), DefUseNode(I->first)) == root_data[I->first].end()){
root_data[I->first].push_back(DefUseNode(I->first));
root_data.defsout.push_back(&root_data[I->first].back());
}
// if this is a local variable, should't be in the defsout,killed sets
// defsout is calculated by iterating backwards to find the last use of the variable
for(DefUseVect::const_reverse_iterator UI = I->second.rbegin(), UE = I->second.rend(); UI != UE; UI++)
if(UI->getUse() == DefUseNode::Def && (block_data.defsout.empty() >> block_data.defsout.back()->getDecl() != I->first)){
block_data.defsout.push_back(&(*UI));
block_data.killed.insert(UI->getDecl());
break;
}
}
}
void defuse::DefUseBasicHelper::VisitStmt(Stmt* S){
for(Stmt::child_iterator I = S->child_begin(), E = S->child_end(); I != E; ++I)
if(*I) Visit(*I);
}
bool ContainsStmt(Stmt const* block, Stmt const* stmt){
if(block == stmt)
return true;
for(Stmt::const_child_iterator I = block->child_begin(), E = block->child_end(); I != E; ++I)
if(*I){
if(ContainsStmt(*I, stmt)) return true;
}
How about:
if (*I && ContainsStmt(*I, stmt))
return true;
return false;
}
Stmt const* GetEnclosingBlock(ParentMap const& pm, Stmt const* stmt){
Stmt const* E = stmt;
while(E){
Stmt const* cond = NULL;
switch(E->getStmtClass()){
case Stmt::IfStmtClass:
cond = cast<IfStmt>(E)->getCond();
break;
case Stmt::WhileStmtClass:
cond = cast<WhileStmt>(E)->getCond();
break;
case Stmt::DoStmtClass:
cond = cast<DoStmt>(E)->getCond();
break;
case Stmt::ForStmtClass:
cond = cast<ForStmt>(E)->getCond();
break;
case Stmt::SwitchStmtClass:
cond = cast<SwitchStmt>(E)->getCond();
break;
case Stmt::CompoundStmtClass:
return E;
default:
break;
}
if(cond && !ContainsStmt(cond, stmt))
return E;
E = pm.getParent(E);
}
return NULL;
}
void RemoveLocalDefs(CFG const& cfg, ParentMap const& pm, VarDeclMap& decl_vars, DefUseData& data){
for(CFG::const_reverse_iterator it = cfg.rbegin(), cend = cfg.rend(); it != cend; it++) {
// every time we find a node with more than 1 predecessor it means
// we are at a junction point and the local variable defined in the previuos
// blocks must be removed from the defsout set
if(it->pred_size() <= 1)
continue;
// for each predecessor
for(CFGBlock::const_pred_iterator pit = it->pred_begin(), end = it->pred_end(); pit != end && (*pit)->size(); pit++){
DefUseBlock& block_data = data[(*pit)->getBlockID()];
for(VarRefsVect::iterator dit = block_data.defsout.begin(); dit != block_data.defsout.end(); dit++){
// we have to be sure the element we are using to see if the Decl is in the same block of block statements
// is also present in the AST, so if the first statement of the CFG block is a VarDecl, we use the VarDeclMap
// to retrieve the corresponding AST node
Stmt const* testStmt = (*pit)->front();
if(testStmt->getStmtClass() == Stmt::DeclStmtClass)
testStmt = decl_vars[cast<VarDecl>(cast<DeclStmt>(testStmt)->getSingleDecl())];
Use 'dyn_cast' or 'isa' instead of 'getStmtClass()'
// if the variable is local to one of the prec blocks and we are at a junction point
// it means the variable must be removed from the defsout set
if(GetEnclosingBlock(pm, decl_vars[(*dit)->getDecl()]) == GetEnclosingBlock(pm, testStmt)){
// it can be the case the following block is in the same scope of the previous block (for example if stmt without else)
Stmt const* testStmt = it->front();
if(testStmt->getStmtClass() == Stmt::DeclStmtClass)
testStmt = decl_vars[cast<VarDecl>(cast<DeclStmt>(testStmt)->getSingleDecl())];
Use 'dyn_cast' or 'isa' instead of 'getStmtClass()'
if(GetEnclosingBlock(pm, decl_vars[(*dit)->getDecl()]) != GetEnclosingBlock(pm, testStmt)){
block_data.defsout.erase(dit--);
}
}
}
}
}
}
// implements reaching definitions according to Kennedy's book iterative algorithm
void ComputeReaches(CFG const& cfg, DefUseData& data){
bool changed = true;
while(changed){
changed = false;
VarRefsSet newreaches;
for(CFG::const_reverse_iterator it = cfg.rbegin(); it != cfg.rend(); it++) {
DefUseBlock& BlockData = data[it->getBlockID()];
newreaches = VarRefsSet();
for(CFGBlock::const_pred_iterator pit = it->pred_begin(), end = it->pred_end(); pit != end; pit++){
unsigned p = (*pit)->getBlockID();
// s(p) = reaches(p) - killed(p)
VarRefsSet s, temp_set, temp_set_1;
copy(data[p].begin_not_killed_set(), data[p].end_not_killed_set(), inserter(s, s.begin()));
// tmp_set(p) = defsout(p) U s(p)
set_union(s.begin(), s.end(), data[p].defsout.begin(), data[p].defsout.end(), inserter(temp_set, temp_set.begin()));
// tmp_set_1 = newreaches U tmp_set(p)
set_union(newreaches.begin(), newreaches.end(), temp_set.begin(), temp_set.end(), inserter(temp_set_1, temp_set_1.begin()));
newreaches.swap(temp_set_1);
}
if(BlockData.reaches.size() != newreaches.size()){
BlockData.reaches = VarRefsVect(newreaches.begin(), newreaches.end());
changed = true;
}
}
}
}
//===------------------------- Printing utilities -------------------------===//
struct printer: public std::iterator<std::input_iterator_tag, printer, std::ptrdiff_t, printer>{
printer(llvm::raw_ostream& x, ASTContext const& ctx_, const char* s = "") : o(x), ctx(ctx_), delim(s) { }
template<typename T>
printer& operator=(const T& x);
printer& operator*() { return *this; }
printer& operator++() { return *this; }
printer& operator++(int) { return *this; }
mutable llvm::raw_ostream& o;
ASTContext const& ctx;
const char* delim;
};
template<>
printer& printer::operator=<VarDecl const*>(const VarDecl* const& x){
o << x->getNameAsString() << delim;
return *this;
}
template<>
printer& printer::operator=<DefUseNode const*>(const DefUseNode* const& x){
SourceLocation loc;
if(x->getKind() == DefUseNode::VarRef)
loc = x->getVarRef()->getLocStart();
else
loc = x->getVarDecl()->getTypeSpecStartLoc();
o << x->getDecl()->getNameAsString() << " " << "(" << Line(loc, ctx.getSourceManager()) << ","
<< Column(loc, ctx.getSourceManager()) << " "
<< ((x->getUse() == DefUseNode::Use)?"USE":"DEF") << ")" << delim;
return *this;
}
template<>
printer& printer::operator=<DefUseNode>(DefUseNode const& x){ return this->operator=(&x); }
void printDefUseData(llvm::raw_ostream& out, ASTContext const& ctx, DefUseData const& data){
out << "--------------- PRINTING ANALYSIS DATA ----------------------\n";
unsigned blockID = data.size() - 1;
for(DefUseData::const_reverse_iterator it = data.rbegin(); it != data.rend(); it++, blockID--){
out << "* BlockID '" << blockID << "' uses vars:\n";
for(DefUseBlock::const_iterator pit = it->begin(); pit != it->end(); pit++){
out << "'" << pit->first->getNameAsString() << "':\n\t";
std::copy( pit->second.begin(), pit->second.end(), printer( out, ctx, ", " ) );
out << "\n";
}
}
blockID = data.size() - 1;
for(DefUseData::const_reverse_iterator it = data.rbegin(); it != data.rend(); it++, blockID--) {
DefUseBlock const& BlockData = *it;
out << "---------------- Block: " << blockID << " ----------------\n";
out << "# uses(" << blockID << "): {";
copy( BlockData.uses.begin(), BlockData.uses.end(), printer( out, ctx, ", " ) );
out << "}\n";
out << "# defsout(" << blockID << "): {";
copy( BlockData.defsout.begin(), BlockData.defsout.end(), printer( out, ctx, ", " ) );
out << "}\n";
out << "# killed(" << blockID << "): {";
copy( BlockData.killed.begin(), BlockData.killed.end(), printer( out, ctx, ", " ) );
out << "}\n";
out << "# reaches(" << blockID << "): {";
copy( BlockData.reaches.begin(), BlockData.reaches.end(), printer( out, ctx, ", " ) );
out << "}\n";
}
}
//===------------------------- DefUse -------------------------===//
DefUse::~DefUse(){ delete analysis_data; delete decl_map; delete block_map; }
bool isDef(ASTContext const& ctx, DefUseNode const& n, int* blockID, DefUseVect::const_iterator* iter,
DefUseData const* analysis_data, VarDeclMap const* decl_map, VarRefBlockMap const* block_map, Stmt const** stmt = NULL){
// retrieve the blockID of the CFG where the statement N (which can be wither a VarDecl or a DeclRefExpr) is placed
Stmt const* _stmt_;
if(n.getKind() == DefUseNode::VarDecl){
VarDecl const* vd = n.getVarDecl();
VarDeclMap::const_iterator it = decl_map->find(vd);
// we use the decl_map to translate the VarDecl to the corresponding DeclStmt
// this is useful when DeclStmts from the CFG (which usually are different from the original ones) are used
ASSERT(it != decl_map->end(), "Variable declaration for '" << vd->getNameAsString() << "' not mapped to any statement in the syntax tree");
_stmt_ = it->second;
}else
_stmt_ = n.getVarRef();
assert(_stmt_ != NULL);
VarRefBlockMap::const_iterator it = block_map->find(_stmt_);
ASSERT(it != block_map->end(), "Statement '" << PrintClangStmt(_stmt_, ctx) << "' not mapped to any block of the CFG");
// We have to find the definition inside the block
DefUseBlock::const_iterator data_it = (*analysis_data)[it->second].find(n.getDecl());
// if we don't find the variable among the variable used or defined in the block it means something went wrong
ASSERT(data_it != (*analysis_data)[it->second].end(), "Block ID: " << it->second << " doesn't contains usage or definitions of variable " << n.getDecl()->getNameAsString());
// we find a Variable definition, now we have to iterate through the DefUseNodes in order to find the
// the node corresponding to the n variable passed by the user
DefUseVect::const_iterator _iter_ = std::find(data_it->second.begin(), data_it->second.end(), n);
if(blockID) *blockID = it->second;
if(iter) *iter = _iter_;
if(stmt) *stmt = _stmt_;
return _iter_ != data_it->second.end();
}
bool isUse(ASTContext const& ctx, clang::DeclRefExpr const* var, int* blockID, DefUseVect::const_iterator* iter,
DefUseData const* analysis_data, VarRefBlockMap const* block_map){
Stmt const* stmt = var;
VarRefBlockMap::const_iterator it = block_map->find(stmt);
ASSERT(it != block_map->end(), "Statement " << PrintClangStmt(stmt, ctx) << " not mapped to any block of the CFG!");
DefUseBlock::const_iterator data_it = (*analysis_data)[it->second].find(cast<VarDecl>(var->getDecl()));
ASSERT(data_it != (*analysis_data)[it->second].end(),
"Block ID: " << it->second << " doesn't contains usage or definitions of variable " << var->getDecl()->getNameAsString());
DefUseVect::const_iterator _iter_ = std::find(data_it->second.begin(), data_it->second.end(), DefUseNode(var, DefUseNode::Use));
if(blockID) *blockID = it->second;
if(iter) *iter = _iter_;
return _iter_ != data_it->second.end();
}
//===------------------------- uses_iterator -------------------------===//
DefUse::uses_iterator::uses_iterator(DefUse const* du, DefUseNode const& n):
DefUse::iterator_impl(du), iter_ptr(-1), inDefBlock(true)
{
Stmt const* stmt = NULL;
if(!::isDef(du->ctx, n, &iter_ptr.block_id, &iter_ptr.block_it, du->analysis_data, du->decl_map, du->block_map, &stmt))
throw NotDefExcpetion(stmt); // there is no definition for the DeclRefExpr passed by the user
// this means he trying to list the uses of a declrefexpr which is a use!
// an excpetion is thrown.
node = &(*(iter_ptr.block_it));
inc(true);
}
bool DefUse::uses_iterator::operator!=(uses_iterator const& iter){
return !(iter_ptr.block_id == iter.iter_ptr.block_id && iter.iter_ptr.block_it == iter.iter_ptr.block_it);
}
DeclRefExpr const* DefUse::uses_iterator::operator*() {
assert(iter_ptr.block_it->getKind() == DefUseNode::VarRef);
return iter_ptr.block_it->getVarRef();
}
DefUse::uses_iterator& DefUse::uses_iterator::inc(bool first){
if(iter_ptr.block_id == -1) return *this;
DefUseBlock const& m = (*du->analysis_data)[iter_ptr.block_id];
DefUseBlock::const_iterator it = m.find(node->getDecl());
assert(it != m.end());
if(iter_ptr.block_it != it->second.end())
iter_ptr.block_it++;
if(iter_ptr.block_it != it->second.end() && iter_ptr.block_it->getDecl() == node->getDecl()){
if(iter_ptr.block_it->getUse() == DefUseNode::Use)
return *this; // this is a use of the variable
else if(inDefBlock){
// found a new definition, stop iterating
iter_ptr.block_it = DefUseVect::const_iterator();
iter_ptr.block_id = -1;
return *this;
}
}
// if variable in defsout we have to check successor blocks
if(inDefBlock && std::find(m.defsout.begin(), m.defsout.end(), node) == m.defsout.end()){
iter_ptr.block_it = DefUseVect::const_iterator();
iter_ptr.block_id = -1; // it means there are no other uses of this variable
return *this;
}
// We have to change block
// the following code has the precondition the blocks are numbered in a decrescent order
// we lock into the block_id and see if the decl is in the reaches() set or not
int new_block = iter_ptr.block_id;
while(--new_block >= 0){
DefUseBlock const& m = (*du->analysis_data)[new_block];
if(std::find(m.reaches.begin(), m.reaches.end(), node) != m.reaches.end()){
inDefBlock = false;
DefUseBlock::const_iterator data_it = (*du->analysis_data)[new_block].find(node->getDecl());
if(data_it != (*du->analysis_data)[new_block].end())
if(data_it->second.begin()->getUse() == DefUseNode::Use){
iter_ptr.block_id = new_block;
iter_ptr.block_it = data_it->second.begin();
return *this;
}
}
}
iter_ptr.block_it = DefUseVect::const_iterator();
iter_ptr.block_id = -1;
return *this;
}
//===------------------------- defs_iterator -------------------------===//
DefUse::defs_iterator::defs_iterator(DefUse const* du, DeclRefExpr const& n):
iterator_impl(du), iter_ptr(-1), blockDef(false)
{
if(!::isUse(du->ctx, &n, &iter_ptr.block_id, &iter_ptr.block_it, du->analysis_data, du->block_map))
throw NotUseExcpetion(&n); // the DeclRefExpr passed by the user is never used in this block but only defined,
// the user is trying to iterate through the definitions of a definition... which doesn't make sense
// an exception must be thrown
iter_ptr.reaches_it = (*du->analysis_data)[iter_ptr.block_id].reaches.begin();
node = &(*(iter_ptr.block_it));
inc(true);
}
bool DefUse::defs_iterator::operator!=(defs_iterator const& iter){
return !(iter_ptr.block_id == iter.iter_ptr.block_id &&
iter.iter_ptr.block_it == iter.iter_ptr.block_it &&
iter.iter_ptr.reaches_it == iter_ptr.reaches_it);
}
DefUseNode const* DefUse::defs_iterator::operator*() {
DefUseBlock::const_iterator it = (*du->analysis_data)[iter_ptr.block_id].find(node->getDecl());
assert(it != (*du->analysis_data)[iter_ptr.block_id].end());
if(blockDef)
return &(*iter_ptr.block_it);
return *iter_ptr.reaches_it;
}
DefUse::defs_iterator::defs_iterator& DefUse::defs_iterator::inc(bool first){
if(iter_ptr.block_id == -1) return *this;
// look inside reaches vector
DefUseBlock::const_iterator it = (*du->analysis_data)[iter_ptr.block_id].find(node->getDecl());
assert(it != (*du->analysis_data)[iter_ptr.block_id].end());
if(first){
// we have to find a def in the block, if exist
for(; iter_ptr.block_it >= it->second.begin(); iter_ptr.block_it--)
if(iter_ptr.block_it->getUse() == DefUseNode::Def){
blockDef = true;
break;
}
// the def is in the block, so it means we don't have to look any further
// next time the iterator is incremented we return an empty one
if(blockDef)
return *this;
}else if(blockDef){
iter_ptr.block_it = DefUseVect::const_iterator();
iter_ptr.reaches_it = VarRefsVect::const_iterator();
iter_ptr.block_id = -1;
return *this;
}
// we didn't find a def in the block, so we have to look at the reaches set
VarRefsVect const& reaches_v = (*du->analysis_data)[iter_ptr.block_id].reaches;
if(!first && iter_ptr.reaches_it != reaches_v.end())
iter_ptr.reaches_it++;
for(; iter_ptr.reaches_it != reaches_v.end(); iter_ptr.reaches_it++)
if((*iter_ptr.reaches_it)->getDecl() == node->getDecl())
return *this;
iter_ptr.block_it = DefUseVect::const_iterator();
iter_ptr.reaches_it = VarRefsVect::const_iterator();
iter_ptr.block_id = -1;
return *this;
}
DefUse::uses_iterator defuse::DefUse::uses_begin(clang::DeclRefExpr const* var) const{
return DefUse::uses_iterator(this, DefUseNode(var, DefUseNode::Def));
}
DefUse::uses_iterator defuse::DefUse::uses_begin(clang::VarDecl const* var) const{
return DefUse::uses_iterator(this, DefUseNode(var));
}
DefUse::uses_iterator defuse::DefUse::uses_end() const{ return DefUse::uses_iterator(); }
DefUse::defs_iterator defuse::DefUse::defs_begin(clang::DeclRefExpr const* var) const{
return DefUse::defs_iterator(this, *var);
}
DefUse::defs_iterator defuse::DefUse::defs_end() const{ return DefUse::defs_iterator(); }
bool defuse::DefUse::isUse(clang::DeclRefExpr const* var) const { return ::isUse(ctx, var, NULL, NULL, analysis_data, block_map); }
bool defuse::DefUse::isDef(DefUseNode const& n) const { return ::isDef(ctx, n, NULL, NULL, analysis_data, decl_map, block_map); }
//===------------------------- BuildDefUseChains -------------------------===//
DefUse* defuse::DefUse::BuildDefUseChains(Stmt* body, ASTContext *ctx, DefUseBasicHelper* helper,
CFG* cfg, ParentMap* pm, bool verbose, llvm::raw_ostream& out)
{
bool cleanCFG = false, cleanParentMap = false, cleanHelper = false;
if(!cfg){
cfg = CFG::buildCFG(body, ctx);
cleanCFG = true;
}
if(!pm){
pm = new ParentMap(body);
cleanParentMap = true;
}
DefUseData* data = new DefUseData(cfg->getNumBlockIDs());
VarDeclMap* dm = new VarDeclMap(body); // map VarDecl to AST nodes - this is needed as when the CFG is created by CLANG
// DeclStmt with several declarations are rewritten into separate DeclStmts
VarRefBlockMap* bm = new VarRefBlockMap; // map each VarRef to the CFG block where it has been used
if(verbose)
cfg->dump(ctx->getLangOptions());
if(!helper){
helper = new DefUseBasicHelper;
cleanHelper = true;
}
helper->InitializeValues(data, bm, dm);
for(CFG::const_iterator it = cfg->begin(); it != cfg->end(); it++)
helper->VisitCFGBlock(*it, cfg->getNumBlockIDs()-1);
You should not rely on the root being number 'getNumBlockIDs() - 1'. This actually is not an invariant you should depend on. If you want to identify the entry block, use 'CFG::getEntry()'.
// remove from defsout local variables
RemoveLocalDefs(*cfg, *pm, *dm, *data);
// Calculates the reaches set for each block of the CFG
ComputeReaches(*cfg, *data);
if(verbose)
printDefUseData(out, *ctx, *data);
unsigned block_id = cfg->getNumBlockIDs();
if(cfg && cleanCFG)
delete cfg;
if(pm && cleanParentMap)
delete pm;
if(helper && cleanHelper)
delete helper;
return new DefUse(*ctx, data, dm, bm, block_id);
}
Instead of BuildDefUseChains building its own CFG and ParentMap, have these be passed in from the client. This allows reuse of existing data that has likely been constructed for other uses. You can also make an alternate version of BuildDefUseChains that is passed an AnalysisManager and uses those to create the CFGs and ParentMap.
Also, consider using llvm::OwningPtr to auto-reclaim 'helper' and friends. For example:
llvm::OwningPtr<DefUseBasicHelper> helperReclaim;
if (!helper) {
helper = new DefUseBasicHelper();
helperReclaim.reset(helper);
}
You can then forget about the later cleanup because it will automatically happen when the function returns. This idiom is really powerful when you have multiple return sites, as you can easily forget to do the proper cleanup. It also cleans up the code, as all the cleanup is just boilerplate that is easily handled by the OwningPtr.
//===------------------------- DefUseChainTest -------------------------===//
void DefUseChainTest::HandleTopLevelDecl(DeclGroupRef declRef) {
for(DeclGroupRef::iterator it = declRef.begin(); it < declRef.end(); it++){
if((*it)->getKind() != Decl::Function)
continue;
Use 'dyn_cast' or 'isa' instead of 'getKind()'
// Top-level Function declaration
FunctionDecl* func_decl = cast<FunctionDecl> (*it);
Stmt* func_body = func_decl->getBodyIfAvailable();
if(!func_body)
continue;
out << "--- Building Def-Use Chains for function '" << func_decl->getNameAsString() << ":" << func_decl->getNumParams() << "' ---\n";
du = DefUse::BuildDefUseChains(func_body, ctx, 0, 0, 0, true);
out << "------ Testing Def-Use iterators ------\n";
VisitStmt(func_body);
out << "**************************************************\n";
delete du;
}
}
void defuse::PrintVarDefs(DefUse const* DU, DeclRefExpr const* DR, ASTContext& ctx, llvm::raw_ostream& out){
if(DR->getDecl()->getKind() != Decl::Var && DR->getDecl()->getKind() != Decl::ParmVar)
return;
Use 'dyn_cast' or 'isa' instead of 'getKind()'
VarDecl const* VD = cast<VarDecl>(DR->getDecl());
SourceLocation loc = DR->getLocStart();
out << "Variable ref '" << VD->getNameAsCString() << "' in LOC: "
<< "(" << Line(loc, ctx.getSourceManager()) << "," << Column(loc, ctx.getSourceManager()) << ") -> defs_iterator:\n";
try{
for(DefUse::defs_iterator it = DU->defs_begin(DR); it != DU->defs_end(); it++){
SourceLocation loc;
if((*it)->getKind() == DefUseNode::VarRef)
loc = (*it)->getVarRef()->getLocStart();
else
loc = (*it)->getVarDecl()->getTypeSpecStartLoc();
out << "\t* Found DEF in loc: " << "(" << Line(loc, ctx.getSourceManager()) << "," << Column(loc, ctx.getSourceManager()) << ")\n";
}
}catch(NotUseExcpetion e){
SourceLocation loc = e.expr->getLocStart();
out << "ERROR: Variable " << PrintClangStmt(e.expr, ctx) << " in location : "
<< "(" << Line(loc, ctx.getSourceManager()) << "," << Column(loc, ctx.getSourceManager()) << ") is not a use for variable: " << VD->getNameAsCString() << "\n";
Again, LLVM doesn't use C++ exceptions. You'll need to come up with an alternate mechanism.
}
}
void PrintVarUses(DefUse const* DU, DefUseNode const& n, ASTContext& ctx, llvm::raw_ostream& out){
VarDecl const* VD = n.getDecl();
SourceLocation loc = VD ->getTypeSpecStartLoc();
if(n.getKind() == DefUseNode::VarRef)
loc = n.getVarRef()->getLocStart();
out << "Variable ref '" << VD->getNameAsCString() << "' in LOC: "
<< "(" << Line(loc, ctx.getSourceManager()) << "," << Column(loc, ctx.getSourceManager()) << ") -> uses_iterator:\n";
try{
DefUse::uses_iterator it = DU->uses_end();
if(n.getKind() == DefUseNode::VarRef)
it = DU->uses_begin(n.getVarRef());
else
it = DU->uses_begin(VD);
for(; it != DU->uses_end(); it++){
SourceLocation loc = (*it)->getLocStart();
out << "\t* Found USE in loc: " << "(" << Line(loc, ctx.getSourceManager()) << "," << Column(loc, ctx.getSourceManager()) << ")\n";
}
}catch(NotDefExcpetion e){
SourceLocation loc = e.stmt->getLocStart();
out << "ERROR: Variable " << PrintClangStmt(e.stmt, ctx) << " in location : "
<< "(" << Line(loc, ctx.getSourceManager()) << "," << Column(loc, ctx.getSourceManager()) << ") is not a definition for variable: " << VD->getNameAsCString() << "\n";
}
Again, LLVM doesn't use C++ exceptions. You'll need to come up with an alternate mechanism.
}
inline void defuse::PrintVarUses(DefUse const* DU, DeclRefExpr const* DR, ASTContext& ctx, llvm::raw_ostream& out){ ::PrintVarUses(DU, DefUseNode(DR), ctx, out); }
inline void defuse::PrintVarUses(DefUse const* DU, VarDecl const* VD, ASTContext& ctx, llvm::raw_ostream& out){ ::PrintVarUses(DU, DefUseNode(VD), ctx, out); }
void defuse::DefUseChainTest::VisitDeclRefExpr(DeclRefExpr *DR){
if(DR->getDecl()->getKind() == Decl::Var || DR->getDecl()->getKind() == Decl::ParmVar){
Use 'dyn_cast' or 'isa' instead of 'getKind()'
PrintVarUses(du, DR, *ctx, out);
PrintVarDefs(du, DR, *ctx, out);
}
}
void defuse::DefUseChainTest::VisitDeclStmt(DeclStmt *DS){
for(DeclStmt::decl_iterator it = DS->decl_begin(); it != DS->decl_end(); it++){
if((*it)->getKind() != Decl::Var)
continue;
VarDecl* VD = cast<VarDecl>(*it);
PrintVarUses(du, VD, *ctx, out);
if(VD->getInit())
Visit(VD->getInit());
How about:
if (Stmt *S = VD->getInit())
Visit(S);