RFC: Where to put new tool, related to C++11 Migrator, that has external deps

Hi all,

I've been putting together a prototype that merges the changes the C++11 Migrator makes to headers as seen from different translation units. Each translation unit sees a clean, un-transformed copy of every header it uses, and changes made to source that happens to live in a header are saved to disk separately so that they don't disturb the original file. The problem is then merging all these changes once the migrator is done. Since merging is crucial to the migrator being able to change headers at all, I thought I'd speed things up and try using an SCM to do the merging for me. So the prototype I'm building uses libgit2 (http://libgit2.github.com/).

So far, I've kept this tool on GitHub (https://github.com/revane/libgit), but I'm curious whether it has a place in clang-tools-extra. I know the third-party dependency is likely to be a sticking point, so I thought I'd get the opinions of the community.

Manuel, didn’t you guys at Google deal with this issue with protobuf + mapreduce? Do you have a simple way of merging the changes?

– Sean Silva

+djasper, who has been doing some work in that area

We definitely do not use any merge tools; imo they’re mainly useful if you have unknown other changes, but in our case everything is known. Simply applying everything on a single Rewriter should do exactly what you want. The main problems are:

  • find a good on-disk representation
  • have a small program to slurp it in and apply the changes, if possible with multi-threading

The reasons we haven’t contributed any of that upstream yet are:

  • need to learn how to do multi-threading with LLVM libs
  • need a good way to store the intermediate representation (probably should be YAML)
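
To make the "collect everything, then apply it in one pass" idea concrete, here is a minimal sketch. It does not use clang's Rewriter or any LLVM API; the `Edit` record and the `applyEdits` function are hypothetical names made up for illustration. It assumes each edit is a byte range in the original, un-transformed file plus replacement text, that identical edits coming from different translation units should collapse into one, and that the remaining edits do not overlap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical record for one edit (not a clang type).
struct Edit {
  std::size_t Offset; // byte offset into the original, un-transformed file
  std::size_t Length; // number of bytes to replace
  std::string Text;   // replacement text
};

inline bool operator<(const Edit &A, const Edit &B) {
  return std::tie(A.Offset, A.Length, A.Text) <
         std::tie(B.Offset, B.Length, B.Text);
}

inline bool operator==(const Edit &A, const Edit &B) {
  return std::tie(A.Offset, A.Length, A.Text) ==
         std::tie(B.Offset, B.Length, B.Text);
}

// Applies all edits to a copy of Original and returns the result.
// Assumption: after duplicates are removed, no two edits overlap.
std::string applyEdits(const std::string &Original, std::vector<Edit> Edits) {
  // Several translation units usually report the same header edit;
  // sorting puts the copies next to each other so unique() can drop them.
  std::sort(Edits.begin(), Edits.end());
  Edits.erase(std::unique(Edits.begin(), Edits.end()), Edits.end());

  std::string Result = Original;
  // Walk from the highest offset down so that earlier offsets, which
  // refer to the original text, remain valid while we edit.
  for (auto It = Edits.rbegin(); It != Edits.rend(); ++It) {
    assert(It->Offset + It->Length <= Original.size() && "edit out of range");
    Result.replace(It->Offset, It->Length, It->Text);
  }
  return Result;
}
```

An on-disk format would only need to serialize the three fields of `Edit` together with the file path; whether that ends up being YAML is left open here, as it is above.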


I basically don’t have much to add to what Manuel said. Just some additional thoughts on why merge tools won’t be a generic solution:

  1. You can’t apply overlapping edits. With merge tools, edits can’t even touch the same lines. If you know the refactoring more precisely, you can get away with only requiring that the byte ranges don’t overlap. I don’t consider this a must-have, but it is definitely a disadvantage of merge tools. However, the important fact stands: you cannot apply overlapping edits either way.
  2. A single refactoring step (e.g. turning a normal for-loop into a range-based for loop) touches code at multiple places.
  3. The conclusion of #1 and #2 together is that you need to guarantee that a refactoring step is done atomically (either completely or not at all) and I don’t see a way to do that with merge tools.
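
As a rough illustration of point 3, here is one way the "all or nothing" rule could look when the edits are known byte ranges. This is only a sketch under assumptions: `Edit`, `Step`, `overlaps` and `acceptSteps` are hypothetical names, not part of clang or the migrator. A step is accepted only if none of its edits overlaps an already accepted edit; an edit identical to an accepted one (the same header change seen from another translation unit) is not counted as a conflict.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record for one edit (not a clang type).
struct Edit {
  std::size_t Offset; // byte offset into the original file
  std::size_t Length; // number of bytes replaced
  std::string Text;   // replacement text
};

// All edits that belong to one refactoring step, e.g. one loop conversion.
using Step = std::vector<Edit>;

bool sameEdit(const Edit &A, const Edit &B) {
  return A.Offset == B.Offset && A.Length == B.Length && A.Text == B.Text;
}

// Half-open ranges [Offset, Offset + Length). Note that two zero-length
// insertions at the same offset are NOT reported as overlapping here;
// a real tool would need an explicit policy for that case.
bool overlaps(const Edit &A, const Edit &B) {
  return A.Offset < B.Offset + B.Length && B.Offset < A.Offset + A.Length;
}

// Returns the edits of every step that could be accepted as a whole.
// A step with even one conflicting edit is dropped entirely.
std::vector<Edit> acceptSteps(const std::vector<Step> &Steps) {
  std::vector<Edit> Accepted;
  for (const Step &S : Steps) {
    assert(!S.empty() && "a refactoring step should contain an edit");
    bool Conflict = false;
    for (const Edit &New : S)
      for (const Edit &Old : Accepted)
        if (!sameEdit(New, Old) && overlaps(New, Old))
          Conflict = true;
    if (Conflict)
      continue; // reject the whole step, never just part of it

    for (const Edit &New : S) {
      bool Duplicate = false;
      for (const Edit &Old : Accepted)
        if (sameEdit(New, Old))
          Duplicate = true;
      if (!Duplicate)
        Accepted.push_back(New);
    }
  }
  return Accepted;
}
```

The quadratic scan is just to keep the sketch short; sorting the accepted ranges would scale better. The point is that the unit of acceptance is the step, which a line-based merge tool has no way of knowing about.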