David, (and everyone else)
I am forced to do some maintenance work on a fairly old LLVM branch
(likely based on release 3.1) that, among other issues, has a major problem
with a memory leak somewhere around DWARF debug support.
In fact, the customer is unable to build with -g at all - they simply run out
of memory on their project...
I seem to remember that there was a major fix related to this, but I am
finding it hard to pinpoint. So, an appeal to the communal memory - do you
remember if there was a single fix (since the middle of 2012 or so) that
would have addressed such an issue?
I know the code has changed drastically since then - I have seen this, for
instance:
llvm.org/viewvc/llvm-project/llvm/trunk/lib/MC/ELFObjectWriter.cpp?view=log&pathrev=205990
...but I am not sure this is what I am looking for, or whether there is
something less drastic that I can do to plug that leak.
Any clue would be very much appreciated.
Thanks.
Sergei
Are you sure it's a leak, not just a large memory footprint?
A lot of things have changed. Heck, even the bugs that we've fixed
might not have existed two years ago (I've certainly created (and
fixed) my share of bugs in debug info).
You'd best start with a leak-tracking tool, so that you can at least point
to the major leak culprit (e.g. "90% of the leaked memory was allocated
at <stack trace>").
David,
Thanks for the quick response...
No, at this point I am just getting into the issue... I assume it is a leak, but no clear proof yet. I was hoping it was an obvious thing since I recall a discussion about it a while ago... but maybe I am just confused.
Was your work for compressing DWARF data motivated by a certain inefficiency in debug info representation? Did it result in significant savings?
Thanks.
Sergei
The compression isn't related to the IR level; some data on that
compression is here:
https://gcc.gnu.org/wiki/DebugFission
The internal representation and code have changed significantly since
then and there's no real way to help much. It may be a memory leak,
but I don't recall one that bad back then. How is your customer
compiling? LTO? Something else?
-eric
Thanks Eric,
They are doing an LTO build, but with some custom modifications (think a library at a time as opposed to a whole program). I must admit, it is a rather large application as well, so as expected, any inefficiencies are multiplied greatly.
From the little that I have seen so far, it looks like debug metadata for an IR object lingers behind once the object itself is eliminated (optimized away). Rings a bell?
Thanks.
Sergei
There have been some memory leaks fixed - the discussion you might be
recalling is the frontend leaks using DIBuilder. Addressed by changes
like this: http://llvm.org/viewvc/llvm-project?rev=208055&view=rev
But I don't think that was ever a crippling bug, so far as I recall.
We only found this when using leak detectors - not when compiling vast
swathes of code, etc.
Compressing DWARF has little to do with memory leaks, etc - it's
related to the nature of DWARF on disk, that's where the compression
occurs (in the object (and dwo) files themselves). Yes, it does have
substantial savings, though I don't have the numbers at hand right
now. (string sections especially are compressed well).
- David
They're going to run into all sorts of problems. They're doomed if
it's a large C++ application (or really even if it's a C based
application).
They likely won't get this to link with LTO + debug. It's unlikely to
be a memory leak. It's just an explosion of debug info because of our
lack of uniquing for type information. You can see my EuroLLVM 2013
talk where I describe this and Manman's patches to help fix the issue
over the last couple of years.
-eric
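
A toy sketch of the effect Eric describes may help; it is not LLVM code and makes no claim about how LLVM actually stores or merges debug metadata. It assumes a hypothetical setup in which 100 translation units each describe the same 50 types from a shared header, and it compares how many type descriptions survive a merge with and without uniquing on a type identifier. The function names, the identifiers, and the counts are all invented for illustration.

```python
# Toy model only (not LLVM code): count debug-info type descriptions after
# merging translation units, with and without uniquing by type identifier.


def merge_without_uniquing(units):
    """Concatenate every unit's type descriptions; duplicates are kept."""
    merged = []
    for types in units:
        merged.extend(types)
    return merged


def merge_with_uniquing(units):
    """Keep a single description per type identifier (hypothetical key)."""
    unique = {}
    for types in units:
        for type_id, description in types:
            # The first description seen for an identifier wins.
            unique.setdefault(type_id, description)
    return list(unique.items())


# Hypothetical input: 100 translation units that all include the same
# header and therefore each carry their own copy of the same 50 types.
header_types = [(f"type{n}", f"struct T{n}") for n in range(50)]
units = [list(header_types) for _ in range(100)]

print(len(merge_without_uniquing(units)))  # 5000 descriptions
print(len(merge_with_uniquing(units)))     # 50 descriptions
```

In this model the merged size without uniquing grows with the number of translation units, which is the kind of growth that can look like a leak on a large LTO build even when nothing is actually leaked.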
David,
This certainly looks relevant - even if this is not the ultimate cause of the issue. I will try it. Thank you.
...what Eric is describing also sounds very reasonable:
"It's just an explosion of debug info because of our lack of uniquing for type information."
It looks more like the version of the tools I am dealing with was never ready to handle LTO + debug efficiently, and this would be more like backporting a major new feature to an older codebase.
This sure looks like a fun project... Maybe my time would be better spent convincing the customer to upgrade.
Thanks again for helping.
Sergei
Eric,
Let me clarify a bit... Without type uniquing for LTO + debug, will I have a highly inefficient IR representation, or incorrect debug info? If debug info for LTO is known to be useless, ambiguous, or flat wrong, there is no point in fixing its emission. Or will it still be practical, so that if I manage to improve it somewhat, the customer still gets some value from using it?
Thanks...
Sergei
In current ToT it should be correct. In 3.1? It'll be, at best,
inefficient. There's likely a big pile of bugs to fix to get that to
work, and fixing them will also fix any problems that show up.
That said, it's a really bad idea to try to do it.
-eric