Debug information for unconditional branches

Hi all,

Since revisions 128471 and 128513 (March 2011), Clang no longer emits debug
information for unconditional branches, for example when
branching from the end of an else{} to the continuation block
(CodeGen/CGStmt.cpp line 459 in r177848). Why exactly is this?

I'm running into this because I'm identifying loop latches based on
their lexical scope. Normally, loop latches have debug metadata
attached to them. However, in some cases -simplifycfg can merge the loop
latch with a branch that has no debug metadata attached to it (for the
reason above), so the resulting latch has no debug metadata either.

Specifically, I have this testcase:
  while (while_test()) {
    if (if_test()) {
      foo();
    } else {
      bar();
    }
  }

Clang emits the following LLVM IR for this:
  while.cond:
    %call = call zeroext i1 (...)* @while_test(), !dbg !9
    br i1 %call, label %while.body, label %while.end, !dbg !9
  while.body:
    %call1 = call zeroext i1 (...)* @if_test(), !dbg !10
    br i1 %call1, label %if.then, label %if.else, !dbg !10
  if.then:
    call void (...)* @foo(), !dbg !12
    br label %if.end, !dbg !14
  if.else:
    call void (...)* @bar(), !dbg !15
    br label %if.end
  if.end:
    br label %while.cond, !dbg !17
  while.end:
    ret i32 0, !dbg !18

Note that the unconditional branch at the end of if.else does not have
debug metadata attached to it.

When running -simplifycfg, LLVM merges if.else and if.end as follows:
  if.else:
    call void (...)* @bar(), !dbg !15
    br label %while.cond

The end result of all this is that the loop latch does not have any
metadata attached to it. This not only breaks my loop identification,
but presumably also makes it impossible to set a breakpoint at the end
of the while body (something which was possible before running
-simplifycfg).
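
To make the loss concrete, here is a minimal C++ sketch of the folding
step. It is a toy model, not LLVM's actual -simplifycfg code: the Branch,
Block and Function types and the foldEmptyBlock and makeExample helpers
are invented for illustration. It assumes that when an empty block is
folded away, each predecessor's branch is only retargeted and keeps
whatever debug location it already had.

```cpp
#include <cassert>
#include <map>
#include <string>

const int kNoDbg = -1;  // sentinel: the branch has no !dbg attachment

// Toy stand-ins for IR constructs; these are not LLVM classes.
struct Branch {
  std::string target;  // label of the successor block
  int dbg;             // metadata id such as 17 for !17, or kNoDbg
};

struct Block {
  bool hasBody;  // true if the block holds more than its terminator
  Branch term;   // unconditional terminator
};

typedef std::map<std::string, Block> Function;

// Fold away `name`, a block holding nothing but an unconditional branch.
// Every predecessor is retargeted to the successor of `name`. The
// predecessor's own debug location (possibly none) is left untouched,
// so the location on the removed branch is dropped.
void foldEmptyBlock(Function &f, const std::string &name) {
  Function::iterator it = f.find(name);
  assert(it != f.end() && !it->second.hasBody && "expected an empty block");
  const std::string successor = it->second.term.target;
  f.erase(it);
  for (auto &entry : f) {
    Branch &br = entry.second.term;
    if (br.target == name)
      br.target = successor;
  }
}

// The if.then / if.else / if.end part of the example above.
Function makeExample() {
  Function f;
  f["if.then"] = Block{true, Branch{"if.end", 14}};
  f["if.else"] = Block{true, Branch{"if.end", kNoDbg}};
  f["if.end"] = Block{false, Branch{"while.cond", 17}};
  return f;
}
```

Calling foldEmptyBlock(f, "if.end") on the result of makeExample()
leaves if.else branching straight to while.cond with no location, while
if.then keeps !14. The location !17 that used to mark the latch is gone.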

How should this be fixed? Should -simplifycfg have support for
'properly' merging these branches, or should Clang emit debug
locations even when the branch is unconditional?

Sincerely,

I’m not necessarily against reverting r128471 and the follow-up patch that restricted it, or against identifying cases where we might like to model line info more accurately in debug metadata.

I’m curious what you’re trying to do with using line number information to identify loop latches. Can you provide a bit of background on what you’re trying to do? It might make it easier to identify things that will help.

Thanks!

-eric

Hi Eric,

> I'm curious what you're trying to do with using line number information
> to identify loop latches. Can you provide a bit of background on what
> you're trying to do? It might make it easier to identify things that
> will help.

Well, the way I'm running into this is somewhat convoluted, so bear with
me. We're using Clang and LLVM to compile code with loops to be run on a
CGRA processor. We're marking these loops using OpenMP-style #pragmas,
and since our LLVM back-end is responsible for generating CGRA code, we
attach that pragma to the loop latch and try to make sure it survives
all of opt.

In order to test whether our pragmas properly reach the back-end, we
wrote a loop-pass identifying all loops in a program and printing out
whether they are annotated or not. This way we can first list all
(regular and annotated) loops in an unoptimized program, then optimize
the hell out of it and finally check whether all marked loops reached
the back-end. Using this we can easily generate test-cases in which an
optimization (or specific combinations thereof) killed our pragma.

In order to identify loops, I look at the debug metadata of the latch. I
picked this information because ensuring its availability is quite
similar to what we have to do anyway: making sure the CGRA pragma (which
is represented by metadata as well) stays attached to the loop latch.
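
A rough sketch of the before/after comparison, in plain C++ with no LLVM
dependency. The LoopRecord and LoopListing types and the addLoop and
lostAnnotations helpers are hypothetical names, and keying a loop by the
source line of its latch's debug location is an assumption made for the
sketch, not a description of the actual pass.

```cpp
#include <cassert>
#include <map>
#include <vector>

// One record per loop, as a loop pass might report it (hypothetical).
struct LoopRecord {
  bool annotated;  // true if the pragma metadata is on the latch
};

// Loops keyed by the source line taken from the latch's debug location.
typedef std::map<int, LoopRecord> LoopListing;

// Record a loop. A latch without a debug location cannot be keyed at
// all, which is exactly why the latch has to keep its metadata.
void addLoop(LoopListing &listing, int latchLine, bool annotated) {
  assert(latchLine > 0 && "latch has no debug location");
  listing[latchLine] = LoopRecord{annotated};
}

// Return the keys of loops that were annotated before optimization but
// are missing or no longer annotated afterwards.
std::vector<int> lostAnnotations(const LoopListing &before,
                                 const LoopListing &after) {
  std::vector<int> lost;
  for (const auto &entry : before) {
    if (!entry.second.annotated)
      continue;  // regular loop, nothing to lose
    LoopListing::const_iterator it = after.find(entry.first);
    if (it == after.end() || !it->second.annotated)
      lost.push_back(entry.first);
  }
  return lost;
}
```

Any line returned by lostAnnotations points at a loop whose pragma was
dropped somewhere in the pass pipeline, which makes it a candidate for a
reduced test-case.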

However, although all this is very specific to our situation, I believe
that the result of this work (making sure loop latches retain their
metadata) can be of interest to other people.

For example, in the issue I raised in my mails yesterday, a loop latch
gets split by -simplifycfg only to be deduplicated again by
-loop-simplify, losing all metadata in the process. Currently I've fixed
this by 0) making sure Clang emits debug metadata for unconditional
branches, 1) patching -simplifycfg so it copies the latch metadata to
the new latches when removing a block, and 2) adding a check to
-loop-simplify so it copies over individual latch metadata (if
identical) to the newly created latch.
But this is not perfect either: when the metadata of a latch that is
being removed is moved to the new latches, the metadata those latches
originally carried is overwritten (but since those latches are now
actually responsible for jumping to the loop header, debug information
pointing to the end of the loop seemed more correct to me).
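
Steps 1) and 2) can be sketched as a toy C++ model. This is not the
actual patch, which works on LLVM instructions and metadata: the Branch
type, the kNoMetadata sentinel and both function names are made up for
illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

const int kNoMetadata = -1;  // sentinel: branch carries no metadata

// Toy model of an unconditional branch; not an LLVM class.
struct Branch {
  std::string target;  // successor label
  int metadata;        // metadata id, or kNoMetadata
};

// Step 1: the removed latch hands its target and its metadata to each
// predecessor branch. Whatever the predecessor carried is overwritten.
void inheritFromRemovedLatch(std::vector<Branch> &preds,
                             const Branch &latch) {
  for (Branch &br : preds) {
    br.target = latch.target;
    br.metadata = latch.metadata;
  }
}

// Step 2: when several latches are funnelled into one new latch, keep
// the metadata only if every old latch agrees on it.
int commonLatchMetadata(const std::vector<Branch> &latches) {
  assert(!latches.empty() && "a loop needs at least one latch");
  const int common = latches.front().metadata;
  for (const Branch &br : latches)
    if (br.metadata != common)
      return kNoMetadata;
  return common;
}
```

With the example from the first mail, the branches of if.then (!14) and
if.else (no metadata) both end up carrying !17 after step 1, so step 2
can put !17 on the unified latch. The cost is the one described above:
!14 is no longer attached to anything.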

If it would help I could attach the patch and/or explain it by means of
an example. However, since there might be a better way to tackle this, I
first wanted to get some feedback in before putting code on the list.

Thanks,