Problem with large sym files

Hello everyone,

I’ve just recently started to convert our codebase to LLVM 3.0, and also tried debugging with LLDB. That’s where I ran into a serious wall: lldb is unable to load the sym files for our main executable (not to mention the numerous shared libraries it uses). I’m using LLDB-75 on Mac OS X 10.6.8 on a Core i7 MacBook Pro with 8 GB of memory; the sym format is DWARF-2.

Well, it does load the sym files after 15 minutes or so, but then setting a breakpoint took another 10 minutes, and stepping one line during debugging took another 15.

At the end of the process Xcode used 35 GB of virtual and 6.5 GB of real memory (according to the Activity Monitor).

Is there anything I can do about it?

GDB handles this situation quite gracefully…

Thanks,

Ákos Somorjai
Developer Support Manager

GRAPHISOFT | Graphisoft Park 1. Budapest 1031 Hungary | +36 1 437-3000 | asomorjai@graphisoft.com

I made some large memory improvements a few days ago which should help with this issue, so the problem is actively being worked on.

One thing you can do that might improve things for debugging with LLDB is to create dSYM files for any binaries that you debug. With the DWARF left in the .o files, we end up with a ton of debug info and a large memory footprint. I submitted a fix for this a few days ago, but it hasn’t made it into a build yet.

So try making dSYM files and see if this helps. There is also a compounding issue going on in the Xcode builds where Xcode holds onto references to our symbol files between builds (we have a radar for this) so each time you modify and rebuild and debug again, we still have a copy of the old stuff in memory, so you end up with two copies of the LLVM/Clang executables and debug info. This can only be cured by quitting and restarting Xcode and we do have a fix on the way in the next release.

So for now try:

  • making dSYM files for all executables you want to debug
  • quit Xcode if you end up rebuilding and linking after a few builds
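A minimal sketch of the first step, assuming dsymutil is on the PATH. The helper name `make_dsym` and the `DSYMUTIL` override variable are hypothetical and only there for illustration; the `dsymutil <binary> -o <output>` form is the same one used later in this thread.

```shell
# Hypothetical helper: link the debug info of one binary into a dSYM bundle.
# DSYMUTIL is an assumed override variable; it defaults to plain "dsymutil".
make_dsym() {
    binary=$1
    # Default output: the binary path with ".dSYM" appended.
    output=${2:-$binary.dSYM}
    # Quote both paths so names containing spaces survive word splitting.
    "${DSYMUTIL:-dsymutil}" "$binary" -o "$output"
}

# Example call (paths are placeholders):
#   make_dsym "build/My App.app/Contents/MacOS/My App" "build/My App.dSYM"
```

Run it once per executable, framework, or plugin bundle you want to debug, after the final link of that binary.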

We also have new accelerator tables we are building into the compilers and into the dSYM files that will greatly help with this issue (speed and memory).

Greg Clayton

Hello Greg,

Thanks for the useful information!

Moving to dSYM files really helped, now I can debug ArchiCAD on my machine. The dSYM option was already set for the distribution build, so it was an easy change. Is there any other way besides using dsymutil to produce those files?

I’m also looking forward to the large memory improvements you mentioned; shall I check out the trunk and compile lldb from source?

We are using an external make file system (Perforce’s Jam based), so building in Xcode is not an issue. Though it seems that even in this case Xcode holds onto the loaded symbols files between debugging sessions.

Another question: are the planned improvements making their way into the Xcode 4.2 GM?

I’m very pleased that you are addressing the large application issues.

Best, Akos

Hello Greg,

I’m running into another problem with dsymutil; it can’t extract the dsym information from our main executable (> 200 MB in size) when building the debug version repeatedly. I get the following error message:

error: unable to restore file position to 0x00000c58 for section __DWARF.__debug_info

Any ideas?

Thanks,

Ákos Somorjai

Hello Greg,

Further info after running dwarfdump on the partially written dSYM file:

dwarfdump /Volumes/Work/D-084/Bin.Mactel/Programs_GCC4_2_64_LLVM.dev/Support/MAPs/ArchiCAD\ 16.dsym

Looks like dsymutil bailed when it couldn't write the mach-o dSYM file.

Are you invoking dsymutil yourself multiple times? If you have a fat file, you probably want to wait until you have your universal final binary, and run dsymutil on it.

Let me know what your invocation of dsymutil looks like and how/where it is used in your builds (on each individual arch slice, or on the universal result, or only on one arch)?

We only have an x64 architecture; we invoke dsymutil several times for the
frameworks and the plugin bundles, but only once for the main executable,
like this

dsymutil /Volumes/Work/D-084/Bin.Mactel/Programs_GCC4_2_64_LLVM.dev/ArchiCAD\ 16.app/Contents/MacOS/ArchiCAD -o /Volumes/Work/D-084/Bin.Mactel/Programs_GCC4_2_64_LLVM.dev/Support/MAPs/ArchiCAD 16.dsym

Is it a problem that the .app has a different name than the executable? Or
the dSYM file is created in a separate folder?

If I do a clean debug build, then the dsym file is created fine. If I do
an incremental build, then the dsym file is created only partially, and I
cannot see the source when debugging the main executable's code.
Breakpoints in the other frameworks work fine; I can step through the
source.
I also tried to create a flat dSYM file, to no avail.
With verbose logging turned on, I got a bit more than 11 000 lines like
this: "address information attributes will be removed because there is no
relocation address"

Thanks,

Ákos Somorjai

We only have an x64 architecture; we invoke dsymutil several times for the
frameworks and the plugin bundles, but only once for the main executable,
like this

dsymutil /Volumes/Work/D-084/Bin.Mactel/Programs_GCC4_2_64_LLVM.dev/ArchiCAD\ 16.app/Contents/MacOS/ArchiCAD -o /Volumes/Work/D-084/Bin.Mactel/Programs_GCC4_2_64_LLVM.dev/Support/MAPs/ArchiCAD 16.dsym

I take it you have quotes around your second path with the space in it? You have a backslash before the space for the app, but not for the dSYM above. The proper case for the extension is "dSYM" if you want it to look like what Xcode produces. This likely won't matter on case-insensitive file systems, but it would be good to change just in case.

Is it a problem that the .app has a different name than the executable? Or
the dSYM file is created in a separate folder?

No, as long as spotlight can see the folder, you should easily be able to locate your dSYM file. If you want to make sure spotlight can see it, do this:

% mdls "/Volumes/Work/D-084/Bin.Mactel/Programs_GCC4_2_64_LLVM.dev/Support/MAPs/ArchiCAD 16.dsym"

You should see some UUID key/value pairs:

% mdls ~/Documents/src/attach/a.out.dSYM
com_apple_xcode_dsym_paths = (
    "Contents/Resources/DWARF/a.out"
)
com_apple_xcode_dsym_uuids = (
    "FDB4DE40-DC70-3E53-907A-442A6E379283"
)

If you see these, the debuggers will be able to find your dSYM file no matter what the name.
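Beyond the Spotlight check, the UUID in the dSYM can be compared directly with the one in the executable. The sketch below assumes `dwarfdump --uuid` is available and prints lines of the form `UUID: <uuid> (<arch>) <path>`; the function names and the `DWARFDUMP` override variable are hypothetical.

```shell
# Print the UUID(s) dwarfdump reports for a Mach-O file or dSYM bundle.
# Assumes output lines shaped like: "UUID: <uuid> (<arch>) <path>".
macho_uuids() {
    "${DWARFDUMP:-dwarfdump}" --uuid "$1" | awk '$1 == "UUID:" { print $2 }' | sort
}

# Succeed only if both files report the same, non-empty set of UUIDs.
dsym_matches() {
    binary_uuids=$(macho_uuids "$1")
    dsym_uuids=$(macho_uuids "$2")
    [ -n "$binary_uuids" ] && [ "$binary_uuids" = "$dsym_uuids" ]
}

# Example call (paths are placeholders):
#   dsym_matches "My App.app/Contents/MacOS/My App" "My App.dSYM" && echo "dSYM matches"
```

A mismatch usually means the dSYM was produced from a different link of the executable than the one being debugged.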

If I do a clean debug build, then the dsym file is created fine. If I do
an incremental build, then the dsym file is created only partially, and I
cannot see the source when debugging the main executable's code.

Sounds like you have a stripping issue. Don't strip your executable except on the install phase of your build. Why? Because in a build like:

- build all .o files
- link executable
- make dSYM
- strip executable

you now have an executable that is newer than your dSYM file. If you use a Makefile, it will notice that your dSYM file depends on your executable, and it will try to rebuild your dSYM file which will now use a stripped executable which won't have any of the needed debug map entries that dsymutil uses to link the dSYM files and all debug info will be lost.
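One way to keep the build product untouched is to strip only a copy at install time. This is a sketch under that assumption, not a description of any particular build system; `install_stripped` and the `STRIP` override variable are hypothetical names.

```shell
# Hypothetical install step: strip a copy, never the build product itself.
# The linked executable keeps its debug map and its timestamp, so a
# dependency-driven build has no reason to regenerate the dSYM from a
# stripped binary.
install_stripped() {
    built=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    cp -p "$built" "$dest_dir/" || return 1
    # Only the installed copy is modified.
    "${STRIP:-strip}" "$dest_dir/$(basename "$built")"
}

# Example call (paths are placeholders):
#   install_stripped "build/MyApp" "install/bin"
```

With this order the sequence becomes: build the .o files, link, make the dSYM, then install and strip the copy.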

Breakpoints in the other frameworks work fine; I can step through the
source.
I also tried to create a flat dSYM file, to no avail.
With verbose logging turned on, I got a bit more than 11 000 lines like
this: "address information attributes will be removed because there is no
relocation address"

That is fine. We dead strip the DWARF and remove functions, types and other info that didn't make it into the final linked executable. If you indeed are stripping your executable, this will happen for just about everything which you don't want. So try to not do any stripping as part of your normal compile/link/debug/fix cycles.

Does that make sense and/or help?

Greg,

Another issue is dsymutil's speed; it takes about 10 minutes to run it on
our main executable, which is more than unbearable. Is there anything we
can do to reduce that time?

Thanks,

Akos

We only have an x64 architecture; we invoke dsymutil several times for the
frameworks and the plugin bundles, but only once for the main executable,
like this

dsymutil /Volumes/Work/D-084/Bin.Mactel/Programs_GCC4_2_64_LLVM.dev/ArchiCAD\ 16.app/Contents/MacOS/ArchiCAD -o /Volumes/Work/D-084/Bin.Mactel/Programs_GCC4_2_64_LLVM.dev/Support/MAPs/ArchiCAD 16.dsym

I take it you have quotes around your second path with the space in it?
You have a backslash before the space for the app, but not for the dSYM
above. The proper case for the extension is "dSYM" if you want it to look
like what Xcode produces. This likely won't matter on case-insensitive
file systems, but it would be good to change just in case.

Yeah, you're right, I do have quotes around the paths, I was just typing
the path directly instead of copying it from the Terminal; sorry.

Is it a problem that the .app has a different name than the executable? Or
the dSYM file is created in a separate folder?

No, as long as spotlight can see the folder, you should easily be able to
locate your dSYM file. If you want to make sure spotlight can see it, do
this:

% mdls "/Volumes/Work/D-084/Bin.Mactel/Programs_GCC4_2_64_LLVM.dev/Support/MAPs/ArchiCAD 16.dsym"

You should see some UUID key/value pairs:

% mdls ~/Documents/src/attach/a.out.dSYM
com_apple_xcode_dsym_paths = (
   "Contents/Resources/DWARF/a.out"
)
com_apple_xcode_dsym_uuids = (
   "FDB4DE40-DC70-3E53-907A-442A6E379283"
)

If you see these, the debuggers will be able to find your dSYM file no
matter what the name.

I see this kind of output.

If I do a clean debug build, then the dsym file is created fine. If I do
an incremental build, then the dsym file is created only partially, and I
cannot see the source when debugging the main executable's code.

Sounds like you have a stripping issue. Don't strip your executable
except on the install phase of your build. Why? Because in a build like:

- build all .o files
- link executable
- make dSYM
- strip executable

you now have an executable that is newer than your dSYM file. If you use
a Makefile, it will notice that your dSYM file depends on your
executable, and it will try to rebuild your dSYM file which will now use
a stripped executable which won't have any of the needed debug map
entries that dsymutil uses to link the dSYM files and all debug info will
be lost.

Good point! A stripping step was included after creating the dsym file;
the reason is that we used this mechanism only for the final release
builds. I had to turn it on to be able to debug with LLDB, but I missed
the stripping.

Breakpoints in the other frameworks work fine; I can step through the
source.
I also tried to create a flat dSYM file, to no avail.
With verbose logging turned on, I got a bit more than 11 000 lines like
this: "address information attributes will be removed because there is no
relocation address"

That is fine. We dead strip the DWARF and remove functions, types and
other info that didn't make it into the final linked executable. If you
indeed are stripping your executable, this will happen for just about
everything which you don't want. So try to not do any stripping as part
of your normal compile/link/debug/fix cycles.

Does that make sense and/or help?

Thanks, yes.
The problem is that we're back at square number one, I still get an error
message from dsymutil while extracting the symbols from the main
executable:

  error: unable to restore file position to 0x00000c58 for section
__DWARF.__debug_info

Any other ideas?

Thanks a lot,

Akos

If you are linking a single architecture, unfortunately there isn't much you can do. Linking DWARF is a serialized process where we take DWARF from a bunch of .o files and then make a single output file. All DWARF sections have to be appended to one another and they have interdependencies, so even though dsymutil does use multi-threading to parse 8 .o files ahead, it doesn't help in the long run. clang and llvm-gcc make debug info that is around 4 times bigger than gcc's, so this probably makes things worse for you guys.

How big is your dSYM file?

It's at least 39 MB; I can't tell exactly because dsymutil bails out (the original problem).
I have a feeling that we are going in circles :)

Best, Ákos

On 2011.09.26, at 18:57, "Greg Clayton" <gclayton@apple.com> wrote:

The problem is that we're back at square number one, I still get an error
message from dsymutil while extracting the symbols from the main
executable:

  error: unable to restore file position to 0x00000c58 for section
__DWARF.__debug_info

Any other ideas?

Does this happen every time? Or just some of the time? Mach-O files have a 4 GB limit on section size; I hope you aren't getting close to that limit?

dwarfdump has a really goofy option I added to track such issues where you can display file data for your executable and all .o files. You run this on your executable (not your dSYM):

% dwarfdump --show-children --file-stats a.out
File symtab strtab code data DWARF debug STABS debug other file
--------- ----------------- ----------------- ----------------- ----------------- ----------------- ----------------- ---------------------------------------
      9104 192 2.11% 166 1.82% 1187 13.04% 0 0.00% 202 2.22% 7357 80.81% a.out (i386)
      4040 84 2.08% 73 1.81% 915 22.65% 917 22.70% 0 0.00% 2051 50.77% /tmp/main.o (i386)
========= ================= ================= ================= ================= ================= ================= =======================================
     13144 276 2.10% 239 1.82% 2102 15.99% 917 6.98% 202 1.54% 9408 71.58%

It basically looks into your main executable and finds all of the .o files from the debug map and prints out stats at the end. You should run this on your main executable and see what the totals come out to.

If you are able to get a dSYM file, you can view its section sizes:

% dwarfdump -R /tmp/large.dSYM
...
__debug_abbrev __DWARF 011ac000 00002278 00379000 00000000 00000000 00000000 00000000 00000000 00000000 8.62K 0.00%
__debug_aranges __DWARF 011ae278 0004b098 0037b278 00000000 00000000 00000000 00000000 00000000 00000000 300.15K 0.12%
__debug_frame __DWARF 011f9310 0014fd24 003c6310 00000000 00000000 00000000 00000000 00000000 00000000 1.31M 0.52%
__debug_info __DWARF 01349034 0bc4a338 00516034 00000000 00000000 00000000 00000000 00000000 00000000 188.29M 75.15%
__debug_inlined __DWARF 0cf9336c 00229b4f 0c16036c 00000000 00000000 00000000 00000000 00000000 00000000 2.16M 0.86%
__debug_line __DWARF 0d1bcebb 00822243 0c389ebb 00000000 00000000 00000000 00000000 00000000 00000000 8.13M 3.25%
__debug_loc __DWARF 0d9df0fe 0068d8f4 0cbac0fe 00000000 00000000 00000000 00000000 00000000 00000000 6.55M 2.62%
__debug_pubnames __DWARF 0e06c9f2 002b42e1 0d2399f2 00000000 00000000 00000000 00000000 00000000 00000000 2.70M 1.08%
__debug_pubtypes __DWARF 0e320cd3 00eec229 0d4edcd3 00000000 00000000 00000000 00000000 00000000 00000000 14.92M 5.96%
__debug_ranges __DWARF 0f20cefc 00484c78 0e3d9efc 00000000 00000000 00000000 00000000 00000000 00000000 4.52M 1.80%
__debug_str __DWARF 0f691b74 00cee6d0 0e85eb74 00000000 00000000 00000000 00000000 00000000 00000000 12.93M 5.16%
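To compare a section against the 4 GB limit mentioned above, the hex size field can be converted to bytes. In the listing, the second hex field on the `__debug_info` line (`0bc4a338`) corresponds to the 188.29M shown at the end of that line. The sketch below assumes the limit is 2^32 bytes; both function names are hypothetical.

```shell
# Convert a hex size field (without the 0x prefix) to a decimal byte count.
hex_to_bytes() {
    echo "$((0x$1))"
}

# Succeed if a section of the given hex size stays below 2^32 bytes
# (assumed here to be the 4 GB section-size limit).
section_fits() {
    [ "$(( 0x$1 < 4294967296 ))" -eq 1 ]
}

# The __debug_info size from the listing above:
hex_to_bytes 0bc4a338
section_fits 0bc4a338 && echo "__debug_info is under the limit"
```

For the sample dSYM this is about 188 MB, far below the limit; a section approaching `ffffffff` would be the warning sign.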

Let me know how big things are on your end when you get the chance.

Greg Clayton

Ok, run the commands on your main executable that looks at all .o files and let me know what sizes it comes up with. I am guessing you probably can't file a bug and attach your executable and .o files? (no source would be needed). If I can reproduce the issue, I can fix it.

Greg Clayton

Here are the 3 outputs:
1. dwarfdump --file-stats
2. dsymutil --verbose
3. dwarfdump -R

I couldn't fully generate the dSYM file; the usual error message
comes up. Still, I logged the dwarfdump -R output.

From the numbers I think we're way over that 4 GB limit...

Best,

Ákos

Dumps.zip (188 KB)

Just a faint idea, but would it be possible to trick the linker into
producing the output file (so that we wouldn't have a post-process)? It
parses the object files anyway.

Also, I think compiling dsymutil with llvm would give it a slight speed
boost (8-10% on our code base in general, hence my guess).

Best,

Ákos

One more thing: the .o files for the main executable alone are nearly 8 GB
in size; am I allowed to upload such an amount to radar?

Best,

Ákos

No, I know why things are failing: we are hitting the limits of the mach-o file format. There isn't anything we can do other than try the:

-flimit-debuginfo

compiler flag. Add this to your builds (if you are using clang or llvm-gcc) and let me know how things work.

Greg

Make that

-flimit-debug-info
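As a rough sketch, the flag could be added next to `-g` in the compile flags. This assumes a build that reads CFLAGS and CXXFLAGS from the environment, which may not hold for a Jam-based setup; there the flag would go wherever the compiler options are defined.

```shell
# Assumption: the build system picks up CFLAGS and CXXFLAGS from the environment.
DEBUG_FLAGS="-g -flimit-debug-info"

# Append to any flags that are already set, without a leading space when empty.
CFLAGS="${CFLAGS:+$CFLAGS }$DEBUG_FLAGS"
CXXFLAGS="${CXXFLAGS:+$CXXFLAGS }$DEBUG_FLAGS"
export CFLAGS CXXFLAGS

echo "CFLAGS=$CFLAGS"
echo "CXXFLAGS=$CXXFLAGS"
```

All object files that feed the main executable need to be recompiled with the flag before running dsymutil again.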

Here are the new dumps; this time dsymutil ran without any error. The dSYM
file is 2.23 GB though...

Ákos

Dumps ((limit-debug-info).zip (187 KB)