Tooling: how to deal with headers?

Hi Tooling folks, I have a quick question about making changes to
headers with a clang tool.

Suppose the following basic setup:

- I have a compile_commands.json for a project.
- I have a tool that renames class Foo to Bar.

So if Foo appears in headers, how do occurrences of Foo in headers get renamed?

To my understanding, the compile_commands.json only tracks the ".cpp"
files that are compiled, rather than each source file in the project.
This makes me wonder:

- How will a tool ensure that Foo only gets renamed once if it is
in a header that is included multiple times?

- Once Foo gets renamed in a header, how will the other files that
have yet to be processed still see the old name? (The new name would
cause a compilation failure, since the references in the other ".cpp"
files will not yet have been renamed.)

Is there a "standard" way to deal with this?

-- Sean Silva

I believe the approach for some current tools is to compute all the edits over the entire codebase, considering all the contents in each TU. Then take all those edits and deduplicate them (as well as detecting conflicts) before applying them.
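
To make that concrete, here is a minimal sketch of the "collect per TU, then deduplicate and check for conflicts" idea. The Edit struct and the mergeEdits/hasConflict helpers are hypothetical names made up for illustration; this is not Clang's Replacement class and not how any particular tool implements it.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for an edit record (NOT clang::tooling::Replacement).
struct Edit {
  std::string file;    // file the edit applies to
  std::size_t offset;  // byte offset in the original file
  std::size_t length;  // number of bytes replaced
  std::string text;    // replacement text

  bool operator<(const Edit &o) const {
    return std::tie(file, offset, length, text) <
           std::tie(o.file, o.offset, o.length, o.text);
  }
};

// Merge the edit lists produced for each TU. A header included by several
// TUs yields the same Edit several times; std::set keeps only one copy.
std::set<Edit> mergeEdits(const std::vector<std::vector<Edit>> &perTU) {
  std::set<Edit> merged;
  for (const auto &edits : perTU)
    merged.insert(edits.begin(), edits.end());
  return merged;
}

// Report a conflict when two distinct edits in the same file start at the
// same offset or touch overlapping byte ranges. The set is ordered by
// (file, offset, ...), so comparing neighbours is sufficient.
bool hasConflict(const std::set<Edit> &edits) {
  const Edit *prev = nullptr;
  for (const Edit &e : edits) {
    if (prev && prev->file == e.file &&
        (e.offset == prev->offset || e.offset < prev->offset + prev->length))
      return true;
    prev = &e;
  }
  return false;
}
```

With two TUs that both include foo.h, the rename inside foo.h shows up twice in the input but only once in the merged set.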

- How will a tool ensure that Foo only gets renamed once if it is
in a header that is included multiple times?

If you use RefactoringTool, the Replacements you add are deduplicated
automatically.

- Once Foo gets renamed in a header, how will the other files that
have yet to be processed still see the old name? (The new name would
cause a compilation failure, since the references in the other ".cpp"
files will not yet have been renamed.)

Use RefactoringTool and make all your replacements in a single tool run. It
will apply all changes at the very end.

Cheers,
/Manuel
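
A rough sketch of why applying everything at the end avoids the "stale name" problem: every TU is parsed against the unmodified files, edits are recorded as offsets into the original text, and the files are rewritten only once all TUs are done. The Edit struct and applyAll function are hypothetical names for illustration, not RefactoringTool's actual implementation, and the edits are assumed to be already deduplicated and non-overlapping.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical edit against one file's original text
// (NOT clang::tooling::Replacement).
struct Edit {
  std::size_t offset;  // byte offset in the ORIGINAL text
  std::size_t length;  // number of bytes to replace
  std::string text;    // replacement text
};

// Apply all edits to one file in a single pass.
// Assumes the edits are deduplicated and do not overlap.
std::string applyAll(std::string original, std::vector<Edit> edits) {
  // Work from the highest offset down, so that changing the length of a
  // later region never shifts the offsets of edits still to be applied.
  std::sort(edits.begin(), edits.end(),
            [](const Edit &a, const Edit &b) { return a.offset > b.offset; });
  for (const Edit &e : edits) {
    assert(e.offset + e.length <= original.size());
    original.replace(e.offset, e.length, e.text);
  }
  return original;
}
```

For the header text "class Foo {}; Foo *make();", the two edits {6, 3, "Bar"} and {14, 3, "Bar"} can be collected in any order and are applied together after the last TU has been processed.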

If you use RefactoringTool, the Replacements you add are deduplicated
automatically.

Ah cool.

Use RefactoringTool and make all your replacements in a single tool run. It
will apply all changes at the very end.

For some reason, I was imagining a separate process being run for each
TU, and I wasn't aware of any kind of centralized serialization/DB
logic for the edits that would then be able to deduplicate them. But I
guess they are all just in the same process and are stored in memory.
Is peak resident memory O(#replacements) then? What kind of memory
usage for replacements have you seen in practice?

-- Sean Silva

For LLVM-sized projects I think that's not going to be a problem (unless
you have very old hardware :P).
For Google-sized projects (> 100 MLOC) you'll want to serialize the
replacements in some form (perhaps from a MapReduce or similar :),
deduplicate them, and finally load and apply them. We're currently using
protocol buffers internally for those formats, so until a dependency on
protocol buffers is acceptable, we won't put that into the main repo.

Cheers,
/Manuel
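
As a rough illustration of the serialize / deduplicate / load pipeline, here is a sketch that uses a tab-separated text line per edit as a stand-in for the protocol buffer format mentioned above. The record layout and the serialize, deserialize and loadUnique helpers are assumptions made up for this example, and the format assumes that no field contains a tab or a newline.

```cpp
#include <cstddef>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical edit record (NOT clang::tooling::Replacement).
struct Edit {
  std::string file;
  std::size_t offset;
  std::size_t length;
  std::string text;
};

// One edit per line: file, offset, length, text, separated by tabs.
std::string serialize(const Edit &e) {
  std::ostringstream out;
  out << e.file << '\t' << e.offset << '\t' << e.length << '\t' << e.text;
  return out.str();
}

// Inverse of serialize(); assumes a well-formed record line.
Edit deserialize(const std::string &line) {
  std::istringstream in(line);
  Edit e;
  std::string offset, length;
  std::getline(in, e.file, '\t');
  std::getline(in, offset, '\t');
  std::getline(in, length, '\t');
  std::getline(in, e.text);
  e.offset = std::stoul(offset);
  e.length = std::stoul(length);
  return e;
}

// "Reduce" step: merge the record lines written by every worker, drop
// exact duplicates, and load what is left for the final apply step.
std::vector<Edit> loadUnique(
    const std::vector<std::vector<std::string>> &shards) {
  std::set<std::string> unique;
  for (const auto &shard : shards)
    unique.insert(shard.begin(), shard.end());
  std::vector<Edit> edits;
  for (const auto &line : unique)
    edits.push_back(deserialize(line));
  return edits;
}
```

Each worker would write its serialized edits to its own shard, so no single process has to hold more than the deduplicated set at load time.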