How to strip all unused debugging metadata?

When I generate debug information for a source file that has a number of static functions that are unused, all of the debugging metadata that I generated for them during initial compilation remains even after the source function definitions have been stripped out of the IR. (e.g. in the MD for DW_TAG_compile_unit's list of subprograms, each of those functions' info is still in the list, and so forth). How can I remove that unused stuff? (It continues on to generated object files, which is extra undesirable when there is a lot of this.)

I would have expected that the strip-dead-debug-info pass would take care of this, but it doesn't make a difference--looking at its implementation, it just iterates over llvm.dbg.gv and llvm.dbg.sp and removes entries from there that are unused. In my case, DIBuilder doesn't seem to be declaring llvm.dbg.sp in the first place--is that a symptom of a problem?

Thanks,
-matt

> When I generate debug information for a source file that has a number of static functions that are unused, all of the debugging metadata that I generated for them during initial compilation remains even after the source function definitions have been stripped out of the IR. (e.g. in the MD for DW_TAG_compile_unit's list of subprograms, each of those functions' info is still in the list, and so forth). How can I remove that unused stuff? (It continues on to generated object files, which is extra undesirable when there is a lot of this.)

In theory, any metadata that is not attached to a real value (or to a
named metadata) is stripped out. I think that holds even if the nodes
are not in the gv/sp arrays, but I'm not sure.

Are you sure there isn't any path to a real value from the dangling
metadata? Do you create named metadata?
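
To make the reachability question above concrete, here is a minimal sketch in plain C++. It is a toy model, not LLVM's actual data structures or API: Node, reachable and exampleGraph are hypothetical names invented for illustration. It shows that a subprogram entry still listed under a compile unit stays reachable from a named-metadata root, while a node nothing refers to does not.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a metadata node (hypothetical, not an LLVM type):
// a label plus the indices of the nodes it refers to.
struct Node {
    std::string label;
    std::vector<std::size_t> operands;  // indices into the node table
};

// Return the indices of every node reachable from the given roots.
std::set<std::size_t> reachable(const std::vector<Node>& nodes,
                                const std::vector<std::size_t>& roots) {
    std::set<std::size_t> seen;
    std::vector<std::size_t> work(roots.begin(), roots.end());
    while (!work.empty()) {
        const std::size_t n = work.back();
        work.pop_back();
        if (!seen.insert(n).second) continue;  // already visited
        for (std::size_t op : nodes[n].operands) work.push_back(op);
    }
    return seen;
}

// Assumed example graph: root -> compile unit -> subprogram list -> entries.
std::vector<Node> exampleGraph() {
    return {
        {"llvm.dbg.cu", {1}},                     // 0: named metadata root
        {"DW_TAG_compile_unit", {2}},             // 1
        {"subprogram list", {3, 4}},              // 2
        {"DW_TAG_subprogram used_fn", {}},        // 3
        {"DW_TAG_subprogram unused_static", {}},  // 4: function gone from the IR
        {"orphan node", {}},                      // 5: nothing refers to it
    };
}
```

In this model, node 4 survives any reachability-based cleanup because the subprogram list still points at it, even though the function it describes no longer exists.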

> I would have expected that the strip-dead-debug-info pass would take care of this, but it doesn't make a difference--looking at its implementation, it just iterates over llvm.dbg.gv and llvm.dbg.sp and removes entries from there that are unused. In my case, DIBuilder doesn't seem to be declaring llvm.dbg.sp in the first place--is that a symptom of a problem?

I'd assume that has nothing to do with it. I'm not sure it's a strict
requirement to produce the gv/sp arrays. Have you tried producing them
and linking all your metadata into them to see if the dangling nodes
get removed?

I think that the root problem is that there is a path to the metadata (from llvm.dbg.cu to the metadata for the DW_TAG_compile_unit to the list of subprograms to the individual DW_TAG_subprogram metadata), but that in turn there’s no longer any need for the DW_TAG_subprogram metadata for all of the functions that were stripped out.

Perhaps the answer is that I just need to write the code that traverses the metadata and removes the subprogram metadata from the list of subprograms, if the corresponding function definition has been stripped from the module. I’d have guessed that the DIBuilder infrastructure would have ended up doing that automagically for me, but maybe that was an incorrect assumption?
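
A minimal sketch of that traversal, written as self-contained C++ over a toy model rather than LLVM's real debug-info classes. Subprogram, CompileUnit and pruneDeadSubprograms are hypothetical names, and the set of defined function names is assumed to have been collected from the module beforehand.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a DW_TAG_subprogram descriptor.
struct Subprogram {
    std::string name;  // name of the function this entry describes
};

// Hypothetical stand-in for a compile unit and its list of subprograms.
struct CompileUnit {
    std::vector<Subprogram> subprograms;
};

// Drop every subprogram entry whose function is no longer defined in the
// module. Returns the number of entries removed.
std::size_t pruneDeadSubprograms(CompileUnit& cu,
                                 const std::set<std::string>& definedFunctions) {
    const std::size_t before = cu.subprograms.size();
    cu.subprograms.erase(
        std::remove_if(cu.subprograms.begin(), cu.subprograms.end(),
                       [&](const Subprogram& sp) {
                           return definedFunctions.count(sp.name) == 0;
                       }),
        cu.subprograms.end());
    return before - cu.subprograms.size();
}
```

Once an entry is removed from the list, nothing refers to it any more, so whatever hangs off it alone becomes unreachable as well.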

Thanks,
-matt

You are right in assuming that unused metadata should be cleaned when
functions/variables get removed, but whether LLVM will do that or not
is another matter.

Someone might know better (it's been 6 months since I last looked at
debug metadata), but I don't think there is a clear way yet to handle
debug metadata (or any metadata for that matter), and people end up
doing it locally.

DIBuilder is not the right candidate to be used by optimisations,
though. Like IRBuilder, it's just a translation framework. I'm not sure
there is any such candidate. Maybe, if whoever removed the function
also removed just the subprogram tag, the rest would be removed from
the output.

cheers,
--renato

Yep. There's nothing that will remove them. Though, theoretically at least, if there's no function to get a range from, we could avoid emitting the subprogram DIE.

-eric