Requesting Help Understanding ASTs and LibTooling

Hi everyone,

I am just starting out with Clang and working on a project where I am trying to generate and compare ASTs to measure code similarity.

I figured out a way to get the AST using Clang, but it turns out I am expected to use LibTooling. Does anyone have an example of parsing C++ code to generate an AST using LibTooling? I would also love design suggestions for comparing tree similarity. Any examples with explanations would be really helpful.

Thanking you.

Sincerely,
Saurav

> Hi everyone,
>
> I am just starting out with Clang and working on a project where I am
> trying to generate and compare ASTs to measure code similarity.

Welcome, good luck with your project!

> I figured out a way to get the AST using Clang, but it turns out I am
> expected to use LibTooling.

There are actually different interfaces available to interact with
Clang's AST. Which one is best depends on your use case. I'd recommend
looking at [1] to make an informed decision.

> Does anyone have an example of parsing C++
> code to generate an AST using LibTooling?

Yes; Clang's documentation [2] contains an excellent tutorial. There
are also many blog posts available that will get you started with
concrete examples, including one I wrote myself when I started looking
at the Clang AST [3].

> I would also love design
> suggestions for comparing tree similarity. Any examples with
> explanations would be really helpful.

_______________________________________________
cfe-dev mailing list
cfe-dev@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

[1] http://clang.llvm.org/docs/Tooling.html
[2] http://clang.llvm.org/docs/LibTooling.html
[3] Understanding the Clang AST

-- Jonas

Hi,

If you’re interested in LibTooling, you can take a look at ESBMC [0], and more specifically at our clang_c_converter [1].

We use LibTooling to generate the AST and implement our own AST traversal code. Currently, it only works for C programs; we’ll release the C++ code in the future.

Thanks,

[0] https://github.com/esbmc/esbmc
[1] https://github.com/esbmc/esbmc/blob/master/clang-c-frontend/clang_c_convert.cpp

Hi,

I have decided to use libclang because I feel it suits my purpose better. I had run-time concerns with the Python bindings, but the C bindings are working fine.

I still have a few more questions and concerns. Please see the README at https://github.com/Saurav-K-Aryal/libclang-help for details.

The questions are also listed at the top of the file as comments.

Any help would be appreciated.

Thanking you.

Sincerely,
Saurav

Hi,

I can answer some of your questions:

- Does libclang get and parse the entire source file? Yes, it will
  consider the whole translation unit, that is, the source file plus
  everything it #includes.
- Get file location data and skip traversing includes? Use
  clang_Location_isFromMainFile(clang_getCursorLocation(cursor)).
- Get the cursor kind as a string? Use
  clang_getCursorKindSpelling(cursor_kind).

Regards,

Jonas Devlieghere

Thank you, Mr. Devlieghere!

Now, I want to know what other information I can extract from the AST. I have the cursor name (or spelling) and the cursor kind spelling. Are there any other functions that give me information, or that I could use to help compare AST nodes? Also, is there any way to serialize the AST after generating it, or to convert it to a format that can be reused afterwards?

Again, the code can be seen at https://github.com/Saurav-K-Aryal/libclang-help.

Sincerely,
Saurav