Debug information on multiple files

Hi,

I'm trying to compile two files together with debug information, but
it seems that LLVM is getting DW_AT_stmt_list wrong when ld
links the final executable.

Originally, I tried on ARM with Clang (+llc+gas+ld) and, in the object
files, the DW_AT_stmt_list values were null, as expected. After linking,
each should point to its own offset in the line table, but all of them are
still null, so they all point to the same line table (that of whichever
object gets linked first).

I then tested with SVN Clang+LLVM and got the same problem on x86-64.
In GDB, when you're stepping through the program, only the file that got
linked first has line information, so you can only step through the source
of the first object.

My Files:

-------------------------------- ext.c
#include <stdio.h>

int external_fn(void)
{
  int i;
  for (i = 0; i < 5; ++i) {
    printf("ext fn #%d\n", i);
  }
  return 0;
}

-------------------------------- main.c
#include <stdio.h>

int external_fn(void);

int main(int argc, char** argv)
{
  printf("TestMsg\n");
  external_fn();
  return 0;
}

$ clang -g main.c ext.c
$ gdb -q a.out
(gdb) start
Temporary breakpoint 1 at 0x4004c0: file main.c, line 7.
Starting program: /work/temp/a.out

Temporary breakpoint 1, main (argc=1, argv=0x7fffffffbf18) at main.c:7
7 printf("TestMsg\n");
(gdb) n
TestMsg
8 external_fn();
(gdb) s
ext fn #0
ext fn #1
ext fn #2
ext fn #3
ext fn #4
9 return 0;
(gdb)

If I compile ext.c first, main has no line information:

$ clang -g ext.c main.c
$ gdb a.out
(gdb) start
Temporary breakpoint 1 at 0x4004e4
Starting program: /work/temp/a.out

Temporary breakpoint 1, 0x00000000004004e4 in main (argc=0, argv=0x7fffffffbf10)
(gdb) n
Single stepping until exit from function main,
which has no line number information.
TestMsg
ext fn #0
ext fn #1
ext fn #2
ext fn #3
ext fn #4
0x00000033c5e1d974 in __libc_start_main () from /lib64/libc.so.6
(gdb)

What am I missing?

See the “DwarfDebug problem with line section” thread on llvmdev. Bottom line: we may need a target-specific patch for targets that do not follow the DWARF standard (as per my reading) in this particular case.

Hi Devang,

Ok, got the background, but will reply on this email.

As far as I understand, there is nothing wrong with the way clang
deals with stmt_list. The encoding is correct (AFAIK) and the offsets
are always zero in every object file, as there is only one line table
and one debug_info. The problem occurs at link time. GCC also
generates zero for stmt_list in the objects but links correctly in the
end.
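
To make the symptom concrete, here is a minimal sketch of what the linker does with the line tables, assuming it simply concatenates each object's .debug_line contribution in link order (padding and alignment are ignored). The function name and the sizes used in the note below are made up for illustration; this is not linker code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: given the size of each object's .debug_line
 * contribution in link order, compute the offset at which each one
 * starts in the merged section. That offset is the value the matching
 * compile unit's DW_AT_stmt_list should hold after linking.
 */
static void merged_line_offsets(const uint32_t *sizes, size_t count,
                                uint32_t *offsets)
{
    assert(sizes != NULL && offsets != NULL);
    uint32_t next = 0;
    for (size_t i = 0; i < count; ++i) {
        offsets[i] = next;   /* this object's table starts here */
        next += sizes[i];    /* the next table follows directly */
    }
}
```

With two objects whose tables are, say, 0x3a and 0x45 bytes long, the first table starts at 0 and the second at 0x3a. A stmt_list left at 0 is therefore right only for the first object linked, which matches what GDB shows.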

I still haven't figured out which type of relocation is needed. All I
know is that GCC generates .rel.debug_info relocations that clang
doesn't:

clang:

** Section #6 '.rel.debug_info' (SHT_REL)
    Size : 40 bytes (alignment 4)
    Symbol table #22 '.symtab'
    5 relocations applied to section #5 '.debug_info'

GCC:

** Section #7 '.rel.debug_info' (SHT_REL)
    Size : 184 bytes (alignment 4)
    Symbol table #22 '.symtab'
    23 relocations applied to section #6 '.debug_info'

I'll dig deeper and see if I can spot what should be done.

cheers,
--renato

I’ve also been looking at debugging with ELF and noticed the same problem as Renato. I just sent a patch to llvmcommits that fixes the problem. DW_AT_stmt_list needs to emit a label (and therefore a relocation) for the offset rather than a constant 0; then the linker can fix up the offset as it shuffles object files around.

Krister

Are you taking into account the relocation information? Use "objdump -Dr" on the object file compiled by gcc -gdwarf-2, and you will see something like this (on x86-64):

0000000000000000 <.debug_info>:
    0: aa stos %al,%es:(%rdi)
    1: 00 00 add %al,(%rax)
    3: 00 02 add %al,(%rdx)
    5: 00 00 add %al,(%rax)
                         6: R_X86_64_32 .debug_abbrev
    7: 00 00 add %al,(%rax)
    9: 00 08 add %cl,(%rax)
    b: 01 00 add %eax,(%rax)
                         c: R_X86_64_32 .debug_line

Note the R_X86_64_32 relocations. At link time, the fields these relocations refer to will be filled in with the absolute address of the symbol/section they point to (limited to 32 bits), added to the value stored in the object file.

With Clang, these relocations will be missing and hence the linker will just copy the data.
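
As a rough illustration of that fix-up, the sketch below resolves one 32-bit absolute relocation the way described above: symbol value plus the value already stored in the field. The helper names are hypothetical and this is not code from any linker. It assumes the host and target share a byte order, and it models the stored-addend (.rel) form seen in the ARM dumps; x86-64 ELF objects normally carry the addend in a .rela entry instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit field in host byte order (assumed equal to the target's). */
static uint32_t read_u32(const uint8_t *section, size_t offset)
{
    uint32_t v;
    memcpy(&v, section + offset, sizeof v);
    return v;
}

/*
 * Hypothetical helper: apply one 32-bit absolute relocation with an
 * implicit addend. The addend is whatever the compiler stored in the
 * field; the result written back is symbol_value + addend.
 * Returns 0 on success, -1 if the field is out of bounds or the result
 * does not fit in 32 bits.
 */
static int apply_abs32(uint8_t *section, size_t size, size_t offset,
                       uint64_t symbol_value)
{
    assert(section != NULL);
    if (size < 4 || offset > size - 4)
        return -1;
    uint64_t result = symbol_value + read_u32(section, offset);
    if (result > UINT32_MAX)
        return -1;
    uint32_t v = (uint32_t)result;
    memcpy(section + offset, &v, sizeof v);
    return 0;
}
```

For the field at offset 0xc in the dump above, calling apply_abs32 with the final position of this object's line table turns the stored 0 into that position. If no relocation entry exists for the field, nothing like this ever runs and the 0 is copied through as-is.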

Jonas

It's not just DW_AT_stmt_list; you have to do the same for the pointer to the abbreviation section.

Jonas

GCC seems to limit it to 32 bits as well; I guess there aren't many
cases with line tables bigger than 4G... ;)

I had problems with relocations in the past (typeinfo, exceptions) in
clang on ARM, and sent a patch to add some of it to the AsmPrinter,
but that was a dirty workaround for the issue.

What is the official mechanism in MC to treat relocations? Where
should I add target-specific relocation types and how to write them in
ASM or ELF?

cheers,
--renato

If it’s the offset in the DI header then I’m pretty sure that’s already taken care of.

Krister

Hi Krister,

We've applied your patch locally and it solved our problem, thanks!

Can somebody review/commit that before 2.8 is branched?

cheers,
--renato

Ok. I will review.

Devang

... but, as was discussed earlier, it breaks Darwin.
I applied a patch to handle DW_AT_stmt_list using a target hook (based on the original patch by Artur Pietrek): r112678. Please try it and let me know if it works. If it does, please prepare a patch for debug_range using a similar target-hook approach.

Hi Devang,

The patch works fine for us, too. Thanks!

Hi,

We're writing a new target backend (based on the C backend) for LLVM. We need to pass case-sensitive strings from the command line to the backend (Unix paths). Currently we're using the Subtarget features string to relay the information.
It turns out that LLVM applies a to_lower to this string, so Unix paths become basically useless.
Is there a better way to pass this information to the backend? If not, what are the chances that a patch that disables the conversion to lower case for this string would be accepted?

Thx,
Alex