Way to compare two Function bodies.

Hi,

I need to compare two versions of the same function body (each a clang::FunctionDecl for a C function). I would like to compare them statement by statement and report whether the two versions are the same. Is there a neat way to do this in the Clang framework?

Thanks,
Pavan

Not sure about Clang, but did you think about GNU indent and diff? :)

This question is too vague to answer: what do you mean by “the same”? Same source text, same tokens, same tokens other than variable naming, something else?

By “same”, I mean that their ASTs are the same, to start with. I am working on a tool that measures syntactic/semantic similarity between two versions of a code base. So far, I have been able to get the FunctionDecls of both versions of a function into memory, but I am not sure how to compare them.

This is still a little imprecise. Should functions compare equal if variables have been renamed between them? What if an expression has been rewritten into a trivially equivalent form (for instance, parentheses were added or removed, or p->x was changed to (*p).x)?

Stmt::Profile does nearly what you want (it’s the mechanism we use to determine if two dependent expressions are equivalent for the purposes of template redeclaration matching), but it will treat

void f() { int n; }
void g() { int n; }

as having different bodies, because they do not declare the same variable. That’s probably not very hard to fix.

Yes, what you have mentioned is exactly what I want. So does this mean that Stmt::Profile generates a unique ID (llvm::FoldingSetNodeID) regardless of whether the statement is a compound statement? If so, could I compare the two functions using the IDs generated by calling

Profile() on the Stmt returned by FunctionDecl::getBody()?

Thanks,

Pavan

Yes, you can use Stmt::Profile to compare two function bodies. But as noted above, it won’t quite do what you want, because it doesn’t consider variables declared within the statement to be “the same”. You could fix this by teaching StmtProfiler to map declarations to some kind of declaration index if they are declared within the statement being profiled. You’d need similar treatment for LabelDecls. I don’t see a good reason why we wouldn’t accept such a patch for upstream Clang.
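
To illustrate the declaration-index idea, here is a minimal sketch on a toy AST. Everything in it (Decl, Stmt, Profiler, profileBody, makeBody) is a hypothetical stand-in and not Clang's API; the real StmtProfiler is considerably more involved. The assumption is simply that a declaration made inside the profiled statement is recorded by its position rather than by its address, while a declaration made outside is recorded by some external identity (here, its name).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-ins for AST nodes. These are NOT Clang classes.
struct Decl {
  std::string type;  // e.g. "int"
  std::string name;  // e.g. "n"
};

struct Stmt {
  enum Kind { DeclStmt, DeclRef, Compound };
  Kind kind;
  const Decl *decl;            // set for DeclStmt and DeclRef, else nullptr
  std::vector<Stmt> children;  // set for Compound, else empty
};

using Profile = std::vector<std::string>;

class Profiler {
  std::map<const Decl *, std::size_t> localIndex;  // decl -> declaration order
  Profile out;

public:
  void visit(const Stmt &s) {
    switch (s.kind) {
    case Stmt::DeclStmt: {
      assert(s.decl && "DeclStmt needs a declaration");
      // First sighting of a local declaration: number it by position.
      std::size_t idx = localIndex.size();
      localIndex[s.decl] = idx;
      out.push_back("decl#" + std::to_string(idx) + ":" + s.decl->type);
      break;
    }
    case Stmt::DeclRef: {
      assert(s.decl && "DeclRef needs a declaration");
      auto it = localIndex.find(s.decl);
      if (it != localIndex.end())
        out.push_back("ref#" + std::to_string(it->second));  // local: index
      else
        out.push_back("ref:" + s.decl->name);  // declared outside: name
      break;
    }
    case Stmt::Compound:
      out.push_back("compound:" + std::to_string(s.children.size()));
      for (const Stmt &child : s.children)
        visit(child);
      break;
    }
  }

  const Profile &result() const { return out; }
};

// Profiles one statement tree with a fresh index table.
Profile profileBody(const Stmt &body) {
  Profiler p;
  p.visit(body);
  return p.result();
}

// Builds the toy body `{ <type> <name>; <name>; }` around a declaration.
Stmt makeBody(const Decl &d) {
  Stmt declStmt{Stmt::DeclStmt, &d, {}};
  Stmt ref{Stmt::DeclRef, &d, {}};
  return Stmt{Stmt::Compound, nullptr, {declStmt, ref}};
}
```

With this scheme, the toy bodies built from `Decl{"int", "n"}` and `Decl{"int", "m"}` produce equal profiles even though the two Decl objects live at different addresses, while changing the declared type to `long` produces a different profile. Labels would need the same index treatment, as mentioned above.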

That's pretty cool :) I will check what can be done. Thanks a lot!

~Pavan

Hi,

Could you please tell me what the idea behind the canonical form is, in general?

Basically, when I profile the statement with the Canonical flag set, the profiler internally calls

if (Canonical)
  T = Context.getCanonicalType(T);

and then, while profiling, does

ID.AddPointer(T.getAsOpaquePtr());

which gives different results for the two versions of the same function.

Also, in VisitDecl(), the profiler adds:

ID.AddPointer(D->getCanonicalDecl());

which is again different for each version of the same function.
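
A likely reason for the mismatch, assuming the two versions were parsed into separate ASTContexts, is that canonical types and declarations are uniqued per context, so the pointers added to the ID are only comparable within one context. The sketch below shows this with a toy interning context; Type, Context, pointerKey and structuralKey are hypothetical names, not Clang classes. It contrasts an identity-based key with a structural one (here just the type's spelling; in Clang something like the string from QualType::getAsString() might play that role, though that is a suggestion and not how StmtProfiler works).

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Toy stand-in for a type node. NOT a Clang class.
struct Type {
  std::string spelling;
};

// Toy stand-in for a context that uniques ("interns") its types:
// asking twice for the same spelling returns the same pointer.
class Context {
  std::unordered_map<std::string, std::unique_ptr<Type>> types;

public:
  const Type *getType(const std::string &spelling) {
    std::unique_ptr<Type> &slot = types[spelling];
    if (!slot)
      slot.reset(new Type{spelling});
    return slot.get();
  }
};

// Identity-based key: only meaningful within a single Context.
const void *pointerKey(const Type *t) { return t; }

// Structural key: comparable across contexts.
std::string structuralKey(const Type *t) { return t->spelling; }
```

Inside one Context, `getType("int")` always returns the same pointer, so pointerKey works as an equality test there. Across two Context objects the pointers differ for the same type while structuralKey still matches, which mirrors the behaviour described above for the two versions of the function.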

Thanks,
Pavan