Problem adding GDB index when building LLVM

Dear all,

I have problems trying to add a GDB index to my built Clang+LLVM binaries. This results in GDB startup being extremely slow (minutes) when trying to debug Clang.

I am building LLVM together with Clang, using the upstream LLVM source code from around January this year. I compile with the preinstalled GCC 11.2 on Ubuntu 22.04 and link with GNU gold 1.16, selected via -DLLVM_USE_LINKER=gold at CMake configuration time.

I am building with -DLLVM_BUILD_LLVM_DYLIB=On and -DLLVM_LINK_LLVM_DYLIB=On for development purposes (the problem also occurs without them). I am building in Debug mode with LLVM_USE_SPLIT_DWARF enabled.

As far as I can see from the CMake scripts in LLVM’s build system, --gdb-index is added to the linker calls automatically in my configuration, and my GNU gold also seems to support it. I can see a .gdb_index section in libLLVM.so, but it is practically empty and does not seem to contain any actual index.

Maybe this present but unusable section is why GDB startup is so slow when debugging Clang, since GDB then rebuilds the symbol tables on every startup. Running gdb-add-index prints “file in wrong format” and fails to add an index.

Strangely, everything worked nicely until I installed Ubuntu 22.04 some weeks ago. On Ubuntu 18.04, either gdb-add-index worked or the index was already present in the built binaries. If I remember correctly, at some point after merging upstream LLVM changes into our project, adding the index manually was no longer necessary because the built binaries already contained it. Maybe this was introduced by a change in LLVM’s build system.

Any hints would be greatly appreciated.

This problem sounds kind of odd. As a wild stab in the dark, try adding -gdwarf-4 to the compiler options (CMAKE_C_FLAGS and CMAKE_CXX_FLAGS).
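For reference, a minimal sketch of what that could look like at configure time. The checkout path and build directory are placeholders I made up, and the other options are simply the ones mentioned earlier in the thread; the script only prints the command unless DRY_RUN=0 is set, so it can be inspected first.

```shell
#!/bin/sh
set -eu

LLVM_SRC=${LLVM_SRC:-llvm-project/llvm}   # hypothetical checkout path
EXTRA_DEBUG_FLAGS="-gdwarf-4"             # the flag suggested above
DRY_RUN=${DRY_RUN:-1}                     # set DRY_RUN=0 to really run cmake

configure_llvm() {
  # Build the cmake command line as positional parameters.
  set -- cmake -S "$LLVM_SRC" -B build \
    -DCMAKE_BUILD_TYPE=Debug \
    -DLLVM_USE_LINKER=gold \
    -DLLVM_USE_SPLIT_DWARF=ON \
    "-DCMAKE_C_FLAGS=$EXTRA_DEBUG_FLAGS" \
    "-DCMAKE_CXX_FLAGS=$EXTRA_DEBUG_FLAGS"
  echo "+ $*"
  # Only execute when explicitly requested.
  if [ "$DRY_RUN" = 0 ]; then "$@"; fi
}

configure_llvm
```

Note that passing CMAKE_C_FLAGS on the command line replaces any value picked up from the environment, so existing custom flags would need to be appended to EXTRA_DEBUG_FLAGS.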

@dblaikie might have a clue about how the gdb-index works.

Hmm, yeah - sounds like maybe you’re not getting gnu_pubnames in the .o files, so the linker can’t generate the index from those (or you only end up with some of them/a partial index or the like).

Can you check a randomly selected .o file with objdump -h or llvm-dwarfdump -v and see if it contains the .debug_gnu_pubnames section? If possible, also check what flags are being passed to the compilation. (I sometimes do this by adding a deliberate error to a file; when cmake/ninja/etc. then fails to build that source file, it prints out the command line it was using.)
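A rough sketch of how that section check could be scripted. The helper name has_section is my own, and it assumes the section name is the second column of each row, which is how objdump -h lays out its listing; the sample row in the demo is made up for illustration.

```shell
#!/bin/sh
# Succeeds if the section header listing on stdin (as printed by
# `objdump -h`) contains a section with exactly the given name.
has_section() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

# Demo on an illustrative row; real use would look like:
#   objdump -h path/to/some.o | has_section .debug_gnu_pubnames
printf '  5 .debug_gnu_pubnames 00000021  0000000000000000  0000000000000000  000000b4  2**0\n' \
  | has_section .debug_gnu_pubnames && echo "pubnames present"
```

Running it over a handful of object files from the build tree should show quickly whether the section is missing everywhere or only in some files.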

Both clang and GCC that I’ve tested do produce .debug_gnu_pubnames when you pass -gsplit-dwarf (and -g - presumably that’s getting passed though, otherwise you wouldn’t have any debug info at all). You could try adding -ggnu-pubnames explicitly to opt into this, though I don’t have a coherent justification for it.

Thank you very much for your help! It really helped me to quickly identify the problem.

It seems that GNU gold does not understand DWARF 5 data and hence ignores the .debug_gnu_pubnames section. Using -gdwarf-4 I was able to add a correct GDB Index and everything works now.

The reason I use the gold linker is that, with the standard linker ld, building LLVM often crashes due to out-of-memory problems. I did not know that gold is no longer actively developed, so I now want to replace it in my builds. I suppose I could use lld instead? Is this what LLVM developers normally do?

Another question: I tried to use lld to build LLVM, but now the build fails for the following reason:

I have built and installed lld together with the libLLVM.so it needs. Later during the build, a call to clang-ast-dump fails: it depends on libLLVM and should be using the libLLVM produced by the current build, not the system-wide one that lld needs.

I was not yet able to fix this using tweaks to PATH and LD_LIBRARY_PATH. It seems the build is confused by the existing libLLVM. Is there a certain way one has to configure this?

I can’t say whether the following solves your problem directly, but here is what I have learned from working on LLVM and distributing an internal tool chain. These are just some reflections from similar problems:

  1. I avoid distribution packages that contain LLVM development headers and libraries, to prevent unintended mixing.
  2. I build a bootstrap tool chain with the tools I need and install it to a non-system location, so that I can control which clang/lld is invoked.
  3. I don’t build a dynamic LLVM library for my tool chain. Not only can it load the wrong library, it also hurts performance.

Hopefully some of these tips can help you out.
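As an illustration of tips 2 and 3, a two-stage setup might look roughly like the sketch below. The paths, build directory names, and the run helper are assumptions of mine, not taken from anyone's actual setup; the script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
set -eu

LLVM_SRC=${LLVM_SRC:-llvm-project/llvm}    # hypothetical checkout path
PREFIX=${PREFIX:-/opt/llvm-bootstrap}      # hypothetical non-system prefix
DRY_RUN=${DRY_RUN:-1}                      # set DRY_RUN=0 to really run

run() {
  echo "+ $*"
  if [ "$DRY_RUN" = 0 ]; then "$@"; fi
}

# Stage 1: release clang + lld, linked statically against LLVM and
# installed outside the system directories.
configure_bootstrap() {
  run cmake -S "$LLVM_SRC" -B build-bootstrap \
    -DCMAKE_BUILD_TYPE=Release \
    "-DLLVM_ENABLE_PROJECTS=clang;lld" \
    -DLLVM_BUILD_LLVM_DYLIB=OFF \
    "-DCMAKE_INSTALL_PREFIX=$PREFIX"
}

# Stage 2: the debug development build, compiled and linked by stage 1.
configure_dev() {
  run cmake -S "$LLVM_SRC" -B build-dev \
    -DCMAKE_BUILD_TYPE=Debug \
    -DLLVM_ENABLE_PROJECTS=clang \
    "-DCMAKE_C_COMPILER=$PREFIX/bin/clang" \
    "-DCMAKE_CXX_COMPILER=$PREFIX/bin/clang++" \
    -DLLVM_USE_LINKER=lld
}

configure_bootstrap
run cmake --build build-bootstrap --target install
configure_dev
```

Because the stage 1 tools do not depend on a shared libLLVM.so, they should not interfere with the libLLVM produced by the development build.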

Thanks a lot, yes I think this is a good direction to go, will try this out.

Huh, yes, using lld would help - though I’m surprised DWARFv4/v5 made a difference if there was .debug_gnu_pubnames present (since that wouldn’t have a more recent version on it, and I don’t think the linker needs to read any other section to build the index).

Yeah, seems to work for me:

$ clang++-tot test.cpp -ggnu-pubnames -gdwarf-5 -fuse-ld=gold -Wl,--gdb-index && llvm-dwarfdump-tot -gdb-index a.out
a.out:  file format elf64-x86-64

.gdb_index contents:
  Version = 7

  CU list offset = 0x18, has 1 entries:
    0: Offset = 0x0, Length = 0x37

  Types CU list offset = 0x28, has 0 entries:

  Address area offset = 0x28, has 0 entries:

  Symbol table offset = 0x28, size = 1024, filled slots:
    489: Name offset = 0x11, CU vector offset = 0x0
      String name: main, CU vector index: 0
    754: Name offset = 0x16, CU vector offset = 0x8
      String name: int, CU vector index: 1

  Constant pool offset = 0x2028, has 2 CU vectors:
    0(0x0): 0x30000000 
    1(0x8): 0x90000000 
$ clang++-tot test.cpp -gdwarf-4 -fuse-ld=gold -Wl,--gdb-index && llvm-dwarfdump-tot -gdb-index a.out
a.out:  file format elf64-x86-64

.gdb_index contents:
  Version = 7

  CU list offset = 0x18, has 1 entries:
    0: Offset = 0x0, Length = 0x4b

  Types CU list offset = 0x28, has 0 entries:

  Address area offset = 0x28, has 1 entries:
    Low/High address = [0x650, 0x658) (Size: 0x8), CU id = 0

  Symbol table offset = 0x3c, size = 1024, filled slots:
    489: Name offset = 0x11, CU vector offset = 0x0
      String name: main, CU vector index: 0
    754: Name offset = 0x16, CU vector offset = 0x8
      String name: int, CU vector index: 1

  Constant pool offset = 0x203c, has 2 CU vectors:
    0(0x0): 0x0 
    1(0x8): 0x0 
$ clang++-tot test.cpp -gdwarf-5 -fuse-ld=gold -Wl,--gdb-index && llvm-dwarfdump-tot -gdb-index a.out
a.out:  file format elf64-x86-64

.gdb_index contents:
  Version = 7

  CU list offset = 0x18, has 1 entries:
    0: Offset = 0x0, Length = 0x37

  Types CU list offset = 0x28, has 0 entries:

  Address area offset = 0x28, has 0 entries:

  Symbol table offset = 0x28, size = 0, filled slots:

  Constant pool offset = 0x28, has 0 CU vectors:

So, yeah, without -ggnu-pubnames and with -gdwarf-5 (the default) gold will have trouble building an index.
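To spot an empty index without reading the whole dump, the "Symbol table offset" line can be checked mechanically. This is only a sketch: the helper names are made up, and it assumes the dump keeps the line format shown in the output above.

```shell
#!/bin/sh
# Print the symbol table size from `llvm-dwarfdump -gdb-index` output on stdin.
gdb_index_symbol_slots() {
  sed -n 's/^ *Symbol table offset = [^,]*, size = \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Exit status 0 if the dump on stdin reports a non-empty symbol table.
gdb_index_is_populated() {
  slots=$(gdb_index_symbol_slots)
  [ -n "$slots" ] && [ "$slots" -gt 0 ]
}

# Demo on two lines copied from the dumps above; real use would look like:
#   llvm-dwarfdump -gdb-index libLLVM.so | gdb_index_is_populated
printf '  Symbol table offset = 0x28, size = 1024, filled slots:\n' \
  | gdb_index_is_populated && echo "index populated"
printf '  Symbol table offset = 0x28, size = 0, filled slots:\n' \
  | gdb_index_is_populated || echo "index empty"
```

The demo prints "index populated" for the first line and "index empty" for the second.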

But generally, use -ggnu-pubnames so that the linker doesn’t have to parse all the DWARF to make the index. Parsing it all shouldn’t even be possible with Split DWARF anyway: the linker won’t go off and read .dwo files, I don’t think…

Thanks a lot for your detailed help. Since I don’t have deep knowledge of this topic, it is really helpful for me.

While I could keep using gold with the flags mentioned above, I have now migrated successfully to LLD, since I think it will be the better solution in the long run. Since gold is not actively developed anymore, I expect more problems would appear over time, and this way my linker stays up to date with the rest of the code.