How to deal with accidental directory tree deletes, downstream?

Hello fellow downstream residents,

I see that r358546 accidentally deleted an entire subtree, which was
reverted in r358552. This of course caused a big merge conflict in
our local repo, and internally we've been debating tactics for dealing
with it, hopefully without losing our original history.

Has anyone else handled this in a way that they are happy with? We
found a StackOverflow post that is potentially helpful:

Doing this on our local copy of the upstream repo means we would not
have an exact copy anymore, which seems like a Bad Idea.

Is there a smooth way to resolve the merge conflict that does *not*
delete our local tree? I suppose we can somehow not accept the
accidental deletes, and then when we run forward to r358552 it will
decide we already have those files and it will Just Work?

Tips and hints welcome.
Thanks,
--paulr

Hey Paul,
Maybe use a pull-request-style model for merging upstream commits into your internal repository? I know it is not the cleanest method of merging upstream commits, but it would let you gate each upstream merge on a quick-and-dirty merge test and alert an engineer to a merge problem before it hits your mainline branch.

In the scenario you described, r358546 would have come in on a separate branch and attempted to merge with master (assuming your main dev branch is called master). When the merge failed, an engineer could have been notified and begun working through the issue to find the best course of action.

I’m really oversimplifying here, but the point is: when possible, test each commit coming from upstream in as much isolation as possible before merging. I’m a big fan of pre-commit testing and of pull requests. If there is a way that works for you to integrate automated pull requests into the upstream merging workflow, it may be of benefit. Hope this helps.

-Mike
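
The trial-merge gate Mike describes could be sketched roughly as follows. This is only a toy illustration, not anyone's actual workflow: the `try_merge` helper, the `merge-trial` branch name, and the tiny `t/basic.ll` repository are all hypothetical, and the notification step is reduced to an `echo`.

```shell
#!/bin/sh
# Toy sketch: test-merge each upstream commit on a throwaway branch
# before letting it anywhere near the mainline branch.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.invalid"
git symbolic-ref HEAD refs/heads/upstream

# try_merge <mainline> <upstream-commit> (hypothetical helper):
# returns 0 if the commit merges cleanly, 1 if it conflicts.
try_merge() {
    git checkout -q -B merge-trial "$1" || return 2
    if git merge --no-ff --no-commit "$2" >/dev/null 2>&1; then
        status=0
    else
        status=1
    fi
    # Throw the trial merge away either way.
    git merge --abort >/dev/null 2>&1 || true
    return "$status"
}

# Upstream baseline.
mkdir -p t
echo "upstream test" > t/basic.ll
echo "readme" > README
git add .
git commit -q -m "baseline"

# Downstream mainline with a local change to a test.
git checkout -q -b master
echo "local tweak" >> t/basic.ll
git commit -q -a -m "local change"

# Two upstream commits: a harmless one, then an accidental tree delete.
git checkout -q upstream
echo "more docs" >> README
git commit -q -a -m "harmless upstream change"
good=$(git rev-parse HEAD)
git rm -q -r t
git commit -q -m "accidental delete"
bad=$(git rev-parse HEAD)

if try_merge master "$good"; then good_result=clean; else good_result=conflict; fi
if try_merge master "$bad"; then bad_result=clean; else bad_result=conflict; fi

echo "harmless commit: $good_result"
echo "tree delete:     $bad_result (this is where an engineer would be alerted)"
```

Because the trial happens on a scratch branch and is always aborted, master itself is never touched until someone decides how to handle the conflicting commit.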

Hi Paul,

This caused merge conflicts for us because of local changes in llvm/test/Transforms, so the merge was blocked. As we merge commit by commit, I manually resolved the conflict when merging r358546 by keeping “ours”, then restored the rest of test/Transforms (“git reset HEAD llvm/test/Transforms”, “git checkout llvm/test/Transforms”), apart from test/Transforms/LoopFusion, whose deletion was the intended part of the commit (“git rm -r llvm/test/Transforms/LoopFusion”).

When r358552 came to be merged (also a conflict, as we still had local files that differed from the open-source files being re-added), I resolved in favour of “ours” again.

This effectively “edits out” the accidental delete between r358546 and r358552: the files continue to exist in our tree through that period even though they don’t in open source, so git blame etc. still work (for us locally). A slight discrepancy, but hopefully not a problematic one.

Regards
Russ
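
For anyone wanting to try Russ's sequence before doing it for real, here is a self-contained toy reproduction. The layout is an assumption made for illustration only: a small `t/` directory stands in for llvm/test/Transforms, and U1 and U2 stand in for r358546 and r358552. The git commands in the downstream part are the ones quoted above.

```shell
#!/bin/sh
# Toy reproduction of the "keep ours" resolution across an accidental delete.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.invalid"
git symbolic-ref HEAD refs/heads/upstream

# U0: upstream baseline.
mkdir -p t/LoopUnroll t/LoopFusion t/Other
echo "upstream test" > t/LoopUnroll/basic.ll
echo "fusion test"   > t/LoopFusion/simple.ll
echo "other test"    > t/Other/x.ll
echo "readme"        > README
git add .
git commit -q -m "U0: baseline"
u0=$(git rev-parse HEAD)

# Downstream branch carrying a local change inside the test tree.
git checkout -q -b downstream
echo "local tweak" >> t/LoopUnroll/basic.ll
git commit -q -a -m "D1: local change"

# U1: meant to remove LoopFusion only, but deletes the whole tree.
git checkout -q upstream
git rm -q -r t
git commit -q -m "U1: accidental tree delete"
u1=$(git rev-parse HEAD)

# U2: upstream puts everything back except LoopFusion.
git checkout -q "$u0" -- t/LoopUnroll t/Other
git commit -q -m "U2: restore tests"
u2=$(git rev-parse HEAD)

# Downstream, merge U1: keep our tree, but honour the intended removal.
git checkout -q downstream
git merge --no-ff --no-commit "$u1" >/dev/null 2>&1 || true
git reset -q HEAD -- t
git checkout -- t
git rm -q -r t/LoopFusion
git commit -q -m "Merge U1, keeping local test tree"

# Downstream, merge U2: add/add conflict on the modified file, keep ours again.
git merge --no-ff --no-commit "$u2" >/dev/null 2>&1 || true
git checkout --ours -- t
git add t
git commit -q -m "Merge U2, keeping ours"

# Blame on an untouched file still reaches the original upstream commit.
blamed=$(git blame --porcelain t/Other/x.ll | head -n 1 | cut -d' ' -f1)
echo "t/Other/x.ll blamed on: $blamed (U0 is $u0)"
```

In this toy, the downstream branch ends up with both upstream commits merged, the local tweak intact, LoopFusion gone, and blame on untouched files still pointing at the original commit rather than at the restore.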

I’m pretty sure you could use ‘git rebase’ then ‘git replace’ to graft an alternate history over that patch of commits. Perform your merges, then optionally remove the replacements?
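
A rough sketch of the `git replace` half of that idea, leaving out the rebase step: it grafts the re-add commit directly onto the commit just before the delete, and then removes the replacement again. The toy repository and the `blame_first_line` helper are hypothetical, and this is only one way the suggestion could be read.

```shell
#!/bin/sh
# Toy sketch: hide an accidental delete locally with a replace-ref graft.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

# Hypothetical helper: commit that blame assigns to the first line of a file.
blame_first_line() {
    git blame --porcelain "$1" | head -n 1 | cut -d' ' -f1
}

mkdir -p t
printf 'line one\nline two\n' > t/basic.ll
git add t
git commit -q -m "original tests"
orig=$(git rev-parse HEAD)

git rm -q -r t
git commit -q -m "accidental delete"

git checkout -q "$orig" -- t
git commit -q -m "re-add tests"
readd=$(git rev-parse HEAD)

before=$(blame_first_line t/basic.ll)

# Pretend (locally only) that the re-add commit's parent is the commit
# just before the delete, so history walks step over the delete.
git replace --graft "$readd" "$orig"
during=$(blame_first_line t/basic.ll)

# Optionally remove the replacement afterwards.
git replace -d "$readd" >/dev/null
after=$(blame_first_line t/basic.ll)

echo "blame before graft: $before"
echo "blame with graft:   $during"
echo "blame after removal: $after"
```

Replace refs are local unless explicitly pushed or fetched, so this does not alter the shared history. Note that while the graft is in place, any commits that landed between the delete and the re-add are also skipped by history walks that pass through the re-add commit.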

FWIW I do apologize... I'm not entirely certain how that happened
since neither the commit nor the email for the commit had 5k files
being deleted, but I can imagine it was quite inconvenient.

-eric

Hi,

Isn't this a bit of a nuisance also directly on trunk and not only for
downstream repos?

If I do e.g.

  git blame test/Transforms/LoopUnroll/basic.ll

then every line in the file is attributed to the commit where the
testcases were added back again.

With

  gitk test/Transforms/LoopUnroll/basic.ll

older commits are shown as well so it's not like all history is gone,
but at least blame is messed up.

For our downstream clone, we didn't merge from trunk to our
development-branch while the testcases were gone, so we never got any
conflicts at all, and since the tests were later put back I think the
situation for us now is the same as being directly on trunk.

/Mikael

test/Transforms svn log has been lost in the same way as git blame - if there's any way that we can get both back it'd be very useful!

@echristo not blaming you, Bad Stuff happens, I managed to delete an
entire tree in an internal repo once.

From: llvm-dev [mailto:llvm-dev-bounces@lists.llvm.org] On Behalf Of Simon
Pilgrim via llvm-dev
Sent: Thursday, April 18, 2019 5:14 AM
To: llvm-dev@lists.llvm.org
Subject: Re: [llvm-dev] How to deal with accidental directory tree
deletes, downstream?

Hi Simon,
We'd probably have to take down the repo, reconstruct the SVN history from
the git history, and then everybody in the world would have to rebase or
reclone to synchronize with the new history. I don't think one test tree
would be disastrous enough to warrant that sort of heroic measure, but it's
really up to the Foundation to figure that out.

IIUC the history is actually still there, just harder to find; you need to
check out a revision from before the mass delete, and the old history will
be there. It's just not directly accessible from HEAD.
--paulr
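
To illustrate the point that the old history is still reachable, here is a small sketch on a toy repository (the `test/Transforms/basic.ll` file and the three commits are made up for the example). It finds the commit that deleted a file and runs blame from that commit's parent instead of from HEAD.

```shell
#!/bin/sh
# Toy sketch: recover pre-delete blame by starting from before the delete.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.invalid"

mkdir -p test/Transforms
printf 'line one\nline two\n' > test/Transforms/basic.ll
git add test/Transforms
git commit -q -m "original tests"
orig=$(git rev-parse HEAD)

git rm -q -r test/Transforms
git commit -q -m "accidental delete"

git checkout -q "$orig" -- test/Transforms
git commit -q -m "re-add tests"
readd=$(git rev-parse HEAD)

# From HEAD, every line is attributed to the re-add commit.
head_blame=$(git blame --porcelain test/Transforms/basic.ll | head -n 1 | cut -d' ' -f1)

# Find the most recent commit that deleted the file ...
del=$(git log --diff-filter=D --format=%H -- test/Transforms/basic.ll | head -n 1)

# ... and blame starting from its parent, where the old history is intact.
old_blame=$(git blame --porcelain "$del^" -- test/Transforms/basic.ll | head -n 1 | cut -d' ' -f1)

echo "blame from HEAD:          $head_blame"
echo "blame from before delete: $old_blame"
```

This only recovers attribution for lines as they stood before the delete; changes made after the re-add still need a normal blame from HEAD.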