LibTooling vs. LibClang+Python for simple refactoring

Hi everyone,

For our research group, I need to get started with some relatively simple refactoring tasks in a modestly complex C++ project. The goal is to have a tool that (in the beginning) is good for at least two things:

  • renaming member variables/methods
  • replacing member variable access by getters/setters

After the extremely positive experiences I’ve had with clang (and since the IDE tools I tried failed for our project), I’d like to utilize the clang infrastructure for this task. However, in the very beginning I already face the problem of where I should start:

On the one hand, LibTooling seems to be the interface of choice when it comes to refactoring code (it’s one of the canonical examples on http://clang.llvm.org/docs/Tooling.html). On the other hand, libclang seems to be easier to use, has Python bindings, and at first glance looks like it could already be sufficient for the task at hand.

Do you have any experience/recommendations for which library is the one to use in this case?

Regards,

Michael

Tooling, specifically the clang-fixit tool, is one place where efforts are being made to apply these kinds of refactorings (specifically, clang-fixit would apply predefined naming convention rules rather than user-specified renaming).

One of the things that tooling might help you with would be the necessary reformatting (clang-format) and build/source discovery.

As David says, LibTooling was made for this task - I myself don’t have a deep understanding of how easy those things would be to do with libclang - you might need to expose some interfaces that are not exposed yet (that might be a great contribution to the clang community if you happen to try that route :wink:)

Cheers,
/Manuel

Hi David,

Thanks for the fast reply.

From your description I infer that clang-fixit is a tool for enforcing a certain naming convention, rather than changing specific names, or did I misunderstand you?

However, maybe it could be a place for me to start looking for ideas… The only reference I found to clang-fixit is http://llvm-reviews.chandlerc.com/D51, but nothing in the llvm or clang/extra repository. Could you tell me where I can find more information on it?

Regards,

Michael

Hi Manuel,

Thanks for your fast answer.

From what you’re saying it sounds like LibTooling is indeed the place for me to start. I don’t want to reinvent the wheel by doing stuff with LibClang for which it wasn’t made, so maybe I have to swallow the red pill and dig into LibTooling. I will, however, take a final look at LibClang - if the only parts that are missing are the rewriting functions, doing it in Python might still deliver first results faster…
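To make the "rewriting functions" part concrete, here is a minimal sketch in plain Python of what such a step could look like. It assumes the locations to change have already been collected as (offset, length) pairs in the original text, for example from libclang cursor extents; that collection step is not shown. The names `Replacement` and `apply_replacements` are made up for this illustration and are not part of libclang or LibTooling. The demo finds offsets by plain string search only to keep the example self-contained; a real rename would need AST-derived locations.

```python
from typing import List, NamedTuple


class Replacement(NamedTuple):
    """Hypothetical edit record: replace `length` chars at `offset` with `text`."""
    offset: int  # start position in the ORIGINAL text
    length: int  # number of characters to replace
    text: str    # replacement text (may be longer or shorter)


def apply_replacements(source: str, replacements: List[Replacement]) -> str:
    """Apply non-overlapping replacements expressed in original-text offsets."""
    ordered = sorted(replacements, key=lambda r: r.offset)
    # Overlapping edits are ambiguous, so refuse them.
    for prev, cur in zip(ordered, ordered[1:]):
        if prev.offset + prev.length > cur.offset:
            raise ValueError("overlapping replacements")
    # Work back to front: edits later in the file cannot shift the
    # offsets of edits earlier in the file, even when the name is resized.
    result = source
    for r in reversed(ordered):
        result = result[:r.offset] + r.text + result[r.offset + r.length:]
    return result


# Demo only: offsets come from a string search, NOT from an AST.
# Note: Python str indices are code points; for non-ASCII sources with
# byte offsets, the same logic would have to operate on bytes instead.
source = "struct P { int x_; };\nint get(const P& p) { return p.x_; }\n"
old, new = "x_", "width_"
reps = []
start = source.find(old)
while start != -1:
    reps.append(Replacement(start, len(old), new))
    start = source.find(old, start + len(old))

renamed = apply_replacements(source, reps)
print(renamed)
```

Applying the edits from the end of the file towards the beginning is the design choice that matters here: it keeps every remaining offset valid without any bookkeeping when the new name has a different length than the old one.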

Having said that, do you know of any efforts to have LibTooling bindings for Python as well? From the tools documentation website I understood that since LibTooling has not (yet?) reached a stable interface, no one bothered to start with one, but you never know before you ask :wink:

Regards,

Michael

In article <9A22DC3C-F2D5-4081-A593-C6E2F022DD08@fz-juelich.de>,
    "Schlottke, Michael" <m.schlottke@fz-juelich.de> writes:

Do you have any experience/recommendations for which library is the one to
use in this case?

I spiked a refactoring tool that replaces (void) argument lists in
function signatures with an empty argument list ().

I looked at libclang, but it's a C interface and I would prefer to
deal with C++ and not C.

It was very easy to get this going with libtooling. I was inspired by
watching a Chandler Carruth talk about libtooling. My solution ended
up being about 300 lines of code and is pretty straightforward. I
used remove-cstr-calls as my starting point and went from there. Not
only was it not much code, but I was able to make progress quickly by
starting with simple cases and working my way up to more complicated
ones.

I've used this example as the basis of a proposed talk for C++ Now!
2014. Hopefully that talk will be accepted; otherwise it will turn
into a blog post and/or video tutorial.

I'm not sure if this list accepts attachments, so I'll provide links
instead.

Here is my test input to my remove-void-args tool:
<http://user.xmission.com/~legalize/tmp/clang/test-orig.cpp>

And here is the result after applying the tool:
<http://user.xmission.com/~legalize/tmp/clang/test-after.cpp>

Since you're interested in doing a Rename refactoring, here is a test
suite I wrote up for evaluating C++ refactoring tools. There are many
test cases in there for Rename of symbols. Why reinvent the wheel?
Please use it and give me any feedback on ways it could be improved:
<http://d3dgraphicspipeline.codeplex.com/releases/view/39824>

As far as I know nobody is working on python bindings for LibTooling - as
we're an open source project, we need to find somebody for whom it's high
enough on the prio list to contribute :wink:

Cheers,
/Manuel

Hi David,

Thanks for the fast reply.

From your description I infer that clang-fixit is a tool for enforcing a certain naming convention, rather than changing specific names, or did I misunderstand you?

That’s correct - but it has the same challenges in terms of resizing names, etc.

However, maybe it could be a place for me to start looking for ideas… The only reference I found to clang-fixit is http://llvm-reviews.chandlerc.com/D51, but nothing in the llvm or clang/extra repository. Could you tell me where I can find more information on it?

Hmm, perhaps it hasn’t really been started yet. Sorry for the red herring.

I would guess clang-migrate would be the next best tool to look at. It does various code changes and deals with reformatting the resulting code, etc.

- David

Do you maybe mean clang-tidy? It's in clang-tools-extra last I checked.

-- Sean Silva

Indeed I did. Thanks for the correction!

Just my $0.02 here - using LibClang with Python bindings is a bit slow.

On a moderately sized source file, it can take several hundred milliseconds to parse.

If I understand correctly, this is because parsing involves creating many Python objects to represent just about everything in your AST.

The size of the original source is actually not the issue; rather, the preprocessed source quickly becomes very large once many headers are included.

When working on one file at a time, this is not a problem at all. However, I wrote something that uses the Python bindings to inject custom instrumentation code during the build of a large project, and this slowed the build considerably.

Again, just for the sake of everyone thinking about this - I should have written that in C++ from the beginning.

Amit Margalit
IBM XIV - Storage Reinvented
XIV-NAS Development Team