uniquely identifying names

Hi,

The Sun Java compiler allows you to walk the AST (from Java) and
investigate it. Each token is stored in an object. Each object has a
hashCode() method which uniquely identifies it.

Now I was wondering: can I do the same with the LLVM tooling? To
identify e.g. a function I could of course just pick the line and
column number, and maybe include the function name itself as well, but
that would change every time lines are added and/or removed.

Any suggestions?

regards,

Folkert van Heusden

There’s no structural identity of code in Clang that I know of - I know someone’s building a tool for doing structural similarity for things like plagiarism detection (I think there are some patches on the clang mailing list).

But if you only need identity within a single process, the pointer value of the pointer to any AST construct is a unique identity you can use.
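As a rough illustration of the pointer-as-identity idea, the sketch below maps node addresses to small integer IDs. `Node`, `NodeIds` and `pointer_identity_example` are hypothetical names invented for this sketch; `Node` only stands in for a real AST class such as a Clang declaration. The IDs are meaningful only while the nodes are alive and only inside one process.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an AST node; not a Clang type.
struct Node {
    std::string name;
};

// Hands out small integer IDs keyed by node address.
// Assumption: nodes outlive the table and are never moved.
class NodeIds {
public:
    std::size_t idFor(const Node* n) {
        auto it = ids_.find(n);
        if (it != ids_.end()) return it->second;  // seen before: same ID
        std::size_t id = ids_.size();             // next unused ID
        ids_.emplace(n, id);
        return id;
    }

private:
    std::unordered_map<const Node*, std::size_t> ids_;
};

// Two nodes with the same name still get different identities,
// while asking twice about the same node gives the same answer.
void pointer_identity_example() {
    Node a{"f"};
    Node b{"f"};
    NodeIds ids;
    assert(ids.idFor(&a) != ids.idFor(&b));
    assert(ids.idFor(&a) == ids.idFor(&a));
}
```

Because the key is an address, none of this survives a second run of the tool, which is why it only helps for identity within a single process.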

(Line/file/column isn’t sufficiently unique. A file could be included under different macro settings and define a different function each time, yet all those functions would appear to be defined on the same line of that included file. A macro could also define multiple functions. Both cases can be resolved by looking at the more complete location information, including macro locations, etc.)
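The macro case can be reproduced in a few lines of plain C++. In this sketch `DEFINE_PAIR`, `first_fn`, `second_fn` and `line_is_ambiguous` are made-up names for illustration. One macro invocation defines two distinct functions, and both report the same source line.

```cpp
#include <cassert>

// Hypothetical macro: a single expansion defines two functions.
// __LINE__ is evaluated where the macro is invoked, so both bodies
// see the line of the DEFINE_PAIR(...) call below.
#define DEFINE_PAIR(a, b)          \
    int a() { return __LINE__; }   \
    int b() { return __LINE__; }

DEFINE_PAIR(first_fn, second_fn)

// True when two different functions share one source line.
bool line_is_ambiguous() {
    return first_fn() == second_fn();
}
```

A key built from file and line alone would therefore collide for `first_fn` and `second_fn`; telling them apart needs the name or the fuller location information mentioned above.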

Adding Raphael, the GSoC student currently working on a similar problem (structural similarity of code). Raphael is upstreaming his patches into the Clang static analyzer.

Maybe I could expand a name into its full name and use that.
e.g.:

namespace bla { class myclass { void mymethod() { } }; }

then the full name of mymethod, bla::myclass::mymethod, would be
unique enough for me (together with the filename).
Can I somehow get this out of Clang?

Do you want to identify the same entity across a valid program’s various source files? Across changes to that program? (what changes?)

If you want to do the former, then producing the mangled name of the entity is probably what you want. (Some part of the ABI code in Clang could give you that, I would assume, but I’m not sure exactly where.)

I don’t know which API you’re using, but clang::NamedDecl::getQualifiedNameAsString seems to do what you want.

Oh right, indeed.
Parser error (my brain) on my side: I looked for "fully q...".
Mea culpa.