[DebugInfo] Merging instruction locations of hoisted instructions

Hi,

I’am recently studying debug location updates in LLVM optimizations. Thanks for the community’s contribution to the great document “How to Update Debug Info”. However, a counter-example presented in the section “when to merge debug locations” confuses me. It says:

“Hoisting identical instructions which appear in several successor blocks into a predecessor block (see `BranchFolding::HoistCommonCodeInSuccs). In this case there is no single merged instruction. The rule for dropping locations applies.”

I referred to BranchFolding::HoistCommonCodeInSuccs, but I didn’t see any code for applying the “drop” rule, and the code directly hoist the instructions without updating the debug locations:

MBB->splice(Loc, TBB, TBB->begin(), TIB);

FBB->erase(FBB->begin(), FIB);

Besides, I think in BranchFolding::HoistCommonCodeInSuccs, each of the hoisted instructions is the merge of the erased instructions in FBB and the hoisted instructions in TBB. So, “there is no single merged instruction” is not totally correct. Instead, in my opinion, applying the merging update for each of the hoisted instructions would be better.

Now, I am confused and wondering whether I misunderstood this example.

1 Like

I believe you’re correct about that example; when we perform similar operations elsewhere, we merge DILocations. I’m not sure the reason for that counterexample - it wasn’t challenged in the original review, but as you’ve pointed out we are hoisting duplicate instructions from true/false blocks into their predecessor and merging them, so there’s a clear target for the merged location for each hoisted instruction. Besides that rule, it’s worth keeping in mind that the document is aspirational, not enforced, so there may be many cases in LLVM where the rules say one thing but a pass does another, potentially even when that pass is used as an example by the document.

There are also some considerations that are not mentioned in that document; as @jmorse noted on your review, we may want to keep DILocations for memory instructions even when the rules would suggest they be dropped, because of the importance of identifying the source expression responsible for a crash - this conflict is a result of a fundamental limit of the expressiveness of DILocation metadata. In summary, the document is probably in need of some updating!

2 Likes

Thanks for your helpful response!

As you said, the document do need some updating. So, I’am recently working on studying and summarizing the debug location update rules, hoping to give a more formal representation of the rules. Maybe I could post the result here when it’s done to discuss it and get opinions from you guys.

The other issue you mentioned—the conflict between the proper debug location update and the backtrace quality—is a also an interesting topic that can be studied in the future. (So, my PR won’t be accept unless the conflict could be properly addressed, right? T-T)

I referred to BranchFolding::HoistCommonCodeInSuccs, but I didn’t see any code for applying the “drop” rule, and the code directly hoist the instructions without updating the debug locations :

This indeed doesn’t seem right – perhaps it only ever applies to code created by TailDuplicator? (This is a guess).

The other issue you mentioned—the conflict between the proper debug location update and the backtrace quality—is a also an interesting topic that can be studied in the future. (So, my PR won’t be accept unless the conflict could be properly addressed, right? T-T)

I think @dblaikie had a good suggestion of seeking source locations in the preheader and using those. While it might not be ideal, it’s a reasonable stepping behaviour and avoids completely dropping source location information most of the time.

Blockquote I think @dblaikie had a good suggestion of seeking source locations in the preheader and using those. While it might not be ideal, it’s a reasonable stepping behaviour and avoids completely dropping source location information most of the time.

@jmorse Thanks for the reminder! I’ll check it later.

By the way, I notice the RemoveDI project you guys are working on, which aims to remove debug intrinsics related to debug value (according to what I learned so far). I wonder whether there will be a project to rebuild or refine the debug location system, which may not be expressive enough now, in the future?

There may be some kind of extension to debug locations in future, but at the moment it’s not clear whether doing so would preserve enough additional information for the development and performance costs to be worth it - it’s something I’m currently looking into though!

If we do work out a consensus for this case, it would be great to record that as an additional example on the update debug info rules page for the next person.

If we do work out a consensus for this case, it would be great to record that as an additional example on the update debug info rules page for the next person.

I’m willing to do this. I’ll give a patch for updating the document soon.

1 Like

FWIW, at the moment - my personal perspective is that loads/stores don’t merit a special carveout away from the merging rules. The rules/reasons apply equally to them as any instruction.

The code mentioned in the original example appears (at least from the small splice+erase snippet given above) not to have made an intentional choice about updating debug info - so it doesn’t seem like a very strong signal that anyone explicitly wanted the behavior to diverge from the guidance (and even if there was such an explicit signal, I think we’d likely consider it a mistake/misfeature/misstep that maybe seemed like a good idea at the time, but in the context of our more recent understanding and formalism, one to go back and fix to be consistent with the guidance of never moving a location such that an unreachable location could become reachable)

1 Like