Proposal: type uniquing of debug info for LTO

The intent of this proposal is to speedup compilation of "-flto -g" for c++ programs.
This is based on discussions with Adrian, David and Eric.

Hi Manman,

The intent of this proposal is to speedup compilation of "-flto -g" for c++ programs.
This is based on discussions with Adrian, David and Eric.

Thanks for bringing this back to the list. The original thread was
getting quite long.

---------------------------
Problem:
A single class can be used in multiple source files and the DI (Debug Info) class is included in multiple bc files. The duplication of
class definitions causes blow-up in # of MDNodes, # of DIEs, leading to large memory requirement.

As an example, SPEC xalancbmk requires 7GB of memory when compiled with -flto -g.
With a preliminary implementation of type uniquing, the memory usage will be down to 2GB.

In order to unique types, we have to break cycles in the MDNodes.

A simple struct definition
struct Base {
int a;
};
can cause cycles in MDNodes:
!12 = metadata !{i32 786451, metadata !13, null, metadata !"Base", i32 1, i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!14 = metadata !{metadata !15, metadata !16}
!15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !"a", i32 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
!16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !"Base", metadata !"Base", metadata !"", i32 1, metadata !17, i1 false, i1 false, i32 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ; [ DW_TAG_subprogram ] [line 1] [Base]

Cycles: !12 -- !14 -- !15 -- !12
            !12 -- !14 -- !16 -- !12

These cycles make it hard to unique the same struct used in two bc files.

---------------------------
How to fix:

We attach a hash value to types to help type uniquing and we also replace references to types with their hash values.
For the above struct "Base", we now have the following MDNodes:
!4 = metadata !{i32 786451, metadata !5, null, metadata !"Base", i32 1, i64 32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32 915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!6 = metadata !{metadata !7, metadata !9}
!7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !"a", i32 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
!9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !"Base", metadata !"Base", metadata !"", i32 1, metadata !10, i1 false, i1 false, i32 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ; [ DW_TAG_subprogram ] [line 1] [Base]

Note that the cycles are gone and !4 has a hash value of 915398439, and the references to !4 are replaced with 915398439.
Thanks Eric for suggesting replacing MD reference with a hash value.

In particular I recommended this:

a) For C++ odr replace it with the "hash" that's just a string
representing the type name.
b) For Internal C++ types and all C types replace it with a string
that's a concatenation of the type name and the name of the compile
unit.

There are a few issues:
1> How to generate the hash for a given type?
With C++'s ODR, it should be enough by using the context and the name for non-internal c++ types.
For internal c++ types and types of other languages, hash will not be used.

Explain this?

My current implementation is to add a few static member functions in MDNode to profile DI nodes differently.
+ /// If the array of Vals is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the hash value and the declaration flag.
+ static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID &ID);

+ /// If the MDNode is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the hash value and the declaration flag.
+ static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);

+ /// Given a hash value and a flag, generate the profile for later lookup.
+ static bool profileDebugInfoNode(unsigned Hash, bool Declaration, FoldingSetNodeID &ID);

These static functions are called in Metadata.cpp:
void MDNode::Profile(FoldingSetNodeID &ID) const {
+ if (profileDebugInfoNode(this, ID))
+ return;
+

There are other examples of these in MDNode for handling of specific metadata.
/// Methods for metadata merging.
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);

Comments are welcome on whether this violates any layering rule.

As I've said many times in email, I don't think this is a good idea
and would prefer either a or b below. a) is a much simpler solution.

Other choices are:
> Keep a map in DwarfDebug
    Keep in mind that the map is used at many stages, and it has to be in sync with MDNodeSet.
> Generalize MDNode to be aware of hash (David can provide more details)
> Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll reader|writer) to be aware of DINode
    We can provide DINode::get(…) to create a DINode. DINode can have its own Profile function.
Other suggestions are welcome.

a or b please.

Transition from current DI Metadata:
To have a smooth transition, I will add a flag "-gtype-hashing" for the type uniquing work and turn it on by default when we are ready.

I'd prefer just make the change to have the front end emit the "hash"
(it's not really a hash, it's just a string).

-----------------------------
Patches:
Expect the following patches:
1> add flag -gtype-hashing
2> add hash field to DI types
3> modify DIBuilder to use hash instead of MD reference
4> related to issue 3

These can all be a single patch since it shouldn't be very large if we
go with a) above. If we go with b) then the MDNode work should be done
in isolation first and then the debug info on top of it.

5> backend change (in DwarfDebug|CompileUnit) to support types shared among compile units
    requires gdwarf-2 gdwarf-3 gdwarf-4 support for issues related to ref_addr

#5 can and should be done before the rest of them.

All changes should be local to debug info classes except patch #4.

What's patch #4?

-eric

+1 to roughly everything Eric said.

The intent of this proposal is to speedup compilation of "-flto -g" for c++ programs.
This is based on discussions with Adrian, David and Eric.

---------------------------
Problem:
A single class can be used in multiple source files and the DI (Debug Info) class is included in multiple bc files. The duplication of
class definitions causes blow-up in # of MDNodes, # of DIEs, leading to large memory requirement.

As an example, SPEC xalancbmk requires 7GB of memory when compiled with -flto -g.
With a preliminary implementation of type uniquing, the memory usage will be down to 2GB.

In order to unique types, we have to break cycles in the MDNodes.

A simple struct definition
struct Base {
int a;
};
can cause cycles in MDNodes:
!12 = metadata !{i32 786451, metadata !13, null, metadata !"Base", i32 1, i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!14 = metadata !{metadata !15, metadata !16}
!15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !"a", i32 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
!16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !"Base", metadata !"Base", metadata !"", i32 1, metadata !17, i1 false, i1 false, i32 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ; [ DW_TAG_subprogram ] [line 1] [Base]

Cycles: !12 -- !14 -- !15 -- !12
            !12 -- !14 -- !16 -- !12

These cycles make it hard to unique the same struct used in two bc files.

---------------------------
How to fix:

We attach a hash value to types to help type uniquing and we also replace references to types with their hash values.
For the above struct "Base", we now have the following MDNodes:
!4 = metadata !{i32 786451, metadata !5, null, metadata !"Base", i32 1, i64 32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32 915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!6 = metadata !{metadata !7, metadata !9}
!7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !"a", i32 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
!9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !"Base", metadata !"Base", metadata !"", i32 1, metadata !10, i1 false, i1 false, i32 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ; [ DW_TAG_subprogram ] [line 1] [Base]

Note that the cycles are gone and !4 has a hash value of 915398439, and the references to !4 are replaced with 915398439.
Thanks Eric for suggesting replacing MD reference with a hash value.

There are a few issues:
1> How to generate the hash for a given type?
With C++'s ODR, it should be enough by using the context and the name for non-internal c++ types.
For internal c++ types and types of other languages, hash will not be used.

2> Make sure the streamers (bit code or ll writer) will output all required MDNodes.
For MDNodes that are only used through hash values, we have to make sure they are dumped.

should read "to make sure they are not dumped" - just to clarify for
other readers: unnamed metadata is kept alive by being reference
(directly or indirectly) by named metadata. If we break the reference
by using a string identifier pair, the type metadata wouldn't exist as
it would be unreferenced in the IR sense.

The solution currently is to add these MDNodes to retained types of the compile unit.

So we can just keep real metadata (rather than string/hash/thingy)
references to these nodes in a flat list to ensure they don't get
dropped.

Thanks David for the suggestion.

3> To actually access the MDNode referenced via the hash value, we need to perform a lookup from the hash value to find the corresponding MDNode.
The questions are where to store this map and how to keep it up-to-date when a MDNode is replaced.
Each hash value should have at most one corresponding definition and at most one declaration.

We need the map when building DI for verification purposes, when writing the ll file to dump the comments, at llvm linking time to update the map,
at backend when generating the actual dwarf.

My current implementation is to add a few static member functions in MDNode to profile DI nodes differently.
+ /// If the array of Vals is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the hash value and the declaration flag.
+ static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID &ID);

+ /// If the MDNode is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the hash value and the declaration flag.
+ static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);

+ /// Given a hash value and a flag, generate the profile for later lookup.
+ static bool profileDebugInfoNode(unsigned Hash, bool Declaration, FoldingSetNodeID &ID);

These static functions are called in Metadata.cpp:
void MDNode::Profile(FoldingSetNodeID &ID) const {
+ if (profileDebugInfoNode(this, ID))
+ return;
+

There are other examples of these in MDNode for handling of specific metadata.
/// Methods for metadata merging.
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);

Comments are welcome on whether this violates any layering rule.

As we've already discussed off list, both Eric & I believe this isn't
an appropriate layering & that LLVM's core IR metadata APIs cannot
reasonably have special hooks for this debug info feature.

Other choices are:
> Keep a map in DwarfDebug
    Keep in mind that the map is used at many stages, and it has to be in sync with MDNodeSet.

I'm not sure what you mean by "in sync with the MDNodeSet".

> Generalize MDNode to be aware of hash (David can provide more details)

My suggestion here, which I think is the only reason to really discuss
this broadly (because it would be an LLVM metadata API feature) is
roughly as follows:

An LLVM IR feature, completely independent from debug info that would
work something like:
a) a blessed named MD node root that would contain string+node pairs
(alternating elements probably, for space conservation)
b) a means to add an operand to an MD node with an "indirect"
reference - this would cause the string to be used as the value (with
some flag that it's a 'special' string) but any access of the node
through the operand list would return the underlying MDNode from the
list (lookup via a map in LLVMContext, essentially).

I'm not at all deeply familiar with the LLVM IR metadata APIs - so
this is a strawman at best but an idea about how we could provide the
functionality that debug info could benefit from in an isolated, not
debug-specific, feature. If we're going to touch metadata APIs I think
something like this should be how we do it.

If anyone more familiar with these APIs & the direction we should and
shouldn't take them could clarify if this is
possible/reasonable/acceptable/the right tradeoff compared to (a) that
would be much appreciated. That's what I'd like to get out of this
thread.

(for some more context: (a) will be isolated to debug info code, but
will require (in my naive estimation) a fair amount of mechanical work
to perform manual lookups on types whenever we access them - at some
point we may have a patch that demonstrates just how much work that
is, but until then we can only make vague statements, and by the time
we do this by any substantial amount it will've been a fair bit of
work only to throw it away in favor of (b) if it looks too unweildy.)

> Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll reader|writer) to be aware of DINode
    We can provide DINode::get(…) to create a DINode. DINode can have its own Profile function.

As discussed offline, Eric & I really don't think this is acceptable.
Again, those who own/maintain/have opinions on LLVM metadata APIs can
weigh in if you think we're off-base in that assessment.

Other suggestions are welcome.

---------------------------
Preliminary Results:
SPEC xalancbmk: ld time down from 20+ mins to 4+ mins on my Mac Air

-----------------------------
Transition from current DI Metadata:
To have a smooth transition, I will add a flag "-gtype-hashing" for the type uniquing work and turn it on by default when we are ready.

-----------------------------
Patches:
Expect the following patches:
1> add flag -gtype-hashing
2> add hash field to DI types

If we can avoid changing the schema of DI types it's really best that
way - especially adding (or removing) fields, moreso while that's
under a flag. It'll be a real pain to update test cases & diverge the
schema making it harder to maintain good test coverage.

Essentially what we want to do is a single pass (as Eric said) - where
before there were MDNodes, there would now be strings. That means
updating DIBuilder and DwarfDebug, etc is one go, basically.

Hi Manman,

The intent of this proposal is to speedup compilation of "-flto -g" for c++ programs.
This is based on discussions with Adrian, David and Eric.

Thanks for bringing this back to the list. The original thread was
getting quite long.

---------------------------
Problem:
A single class can be used in multiple source files and the DI (Debug Info) class is included in multiple bc files. The duplication of
class definitions causes blow-up in # of MDNodes, # of DIEs, leading to large memory requirement.

As an example, SPEC xalancbmk requires 7GB of memory when compiled with -flto -g.
With a preliminary implementation of type uniquing, the memory usage will be down to 2GB.

In order to unique types, we have to break cycles in the MDNodes.

A simple struct definition
struct Base {
int a;
};
can cause cycles in MDNodes:
!12 = metadata !{i32 786451, metadata !13, null, metadata !"Base", i32 1, i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!14 = metadata !{metadata !15, metadata !16}
!15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !"a", i32 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
!16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !"Base", metadata !"Base", metadata !"", i32 1, metadata !17, i1 false, i1 false, i32 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ; [ DW_TAG_subprogram ] [line 1] [Base]

Cycles: !12 -- !14 -- !15 -- !12
           !12 -- !14 -- !16 -- !12

These cycles make it hard to unique the same struct used in two bc files.

---------------------------
How to fix:

We attach a hash value to types to help type uniquing and we also replace references to types with their hash values.
For the above struct "Base", we now have the following MDNodes:
!4 = metadata !{i32 786451, metadata !5, null, metadata !"Base", i32 1, i64 32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32 915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!6 = metadata !{metadata !7, metadata !9}
!7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !"a", i32 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
!9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !"Base", metadata !"Base", metadata !"", i32 1, metadata !10, i1 false, i1 false, i32 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ; [ DW_TAG_subprogram ] [line 1] [Base]

Note that the cycles are gone and !4 has a hash value of 915398439, and the references to !4 are replaced with 915398439.
Thanks Eric for suggesting replacing MD reference with a hash value.

In particular I recommended this:

a) For C++ odr replace it with the "hash" that's just a string
representing the type name.
b) For Internal C++ types and all C types replace it with a string
that's a concatenation of the type name and the name of the compile
unit.

Yes, that is what we agreed on over email.

There are a few issues:
1> How to generate the hash for a given type?
With C++'s ODR, it should be enough by using the context and the name for non-internal c++ types.
For internal c++ types and types of other languages, hash will not be used.

Explain this?

For a while, I am going to support both hash and MD reference, once everything is working with hash,
I will update all debug info testing cases, turn -gtype-uniquing on, and remove the other path.

For internal c++ types, initially they will follow the path of using MD references without a hash.

My current implementation is to add a few static member functions in MDNode to profile DI nodes differently.
+ /// If the array of Vals is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the hash value and the declaration flag.
+ static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID &ID);

+ /// If the MDNode is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the hash value and the declaration flag.
+ static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);

+ /// Given a hash value and a flag, generate the profile for later lookup.
+ static bool profileDebugInfoNode(unsigned Hash, bool Declaration, FoldingSetNodeID &ID);

These static functions are called in Metadata.cpp:
void MDNode::Profile(FoldingSetNodeID &ID) const {
+ if (profileDebugInfoNode(this, ID))
+ return;
+

There are other examples of these in MDNode for handling of specific metadata.
/// Methods for metadata merging.
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);

Comments are welcome on whether this violates any layering rule.

As I've said many times in email, I don't think this is a good idea
and would prefer either a or b below. a) is a much simpler solution.

Any reason that why it is not a good idea?

Other choices are:
> Keep a map in DwarfDebug
   Keep in mind that the map is used at many stages, and it has to be in sync with MDNodeSet.
> Generalize MDNode to be aware of hash (David can provide more details)
> Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll reader|writer) to be aware of DINode
   We can provide DINode::get(…) to create a DINode. DINode can have its own Profile function.
Other suggestions are welcome.

a or b please.

Option a> will require a DwarfDebug pointer in every stage of the compiler, and passing the map to the DI classes.
A rough estimation is around 100 places.
Is it reasonable to pass a DwarfDebug pointer to DIBuilder and llvm linker?
Also the map needs to be in sync with MDNodeSet, maybe using ValueHandle can solve the problem.

More details for option b from David

< The alternative I have in mind is a more complete version of what
< you're proposing - a full MD feature, not an MD feature that's just
< barely enough to support the needs of debug info. What we could do is
< allow the insertion of these MDHash things you spoke about but take it
< a step further and have MDNode::getOperand walk through the hash &
< give the value (in this way, DebugInfo wouldn't have to change at all
< to handle hashes - if the Metadata APIs are going to be aware of the
< hashes anyway, they might as well provide this convenience
< functionality) the metadata feature would also have to have some
< blessed top-level named metadata that would have a list of hash+MDNode
< to keep those MDNodes alive (so you wouldn't have to stuff all the
< types in the retained types list - metadata would provide the full
< support, not just half of it).

Transition from current DI Metadata:
To have a smooth transition, I will add a flag "-gtype-hashing" for the type uniquing work and turn it on by default when we are ready.

I'd prefer just make the change to have the front end emit the "hash"
(it's not really a hash, it's just a string).

Are you saying no transition period? A single patch to have correct handling of "hash" and to update all existing testing cases?

-----------------------------
Patches:
Expect the following patches:
1> add flag -gtype-hashing
2> add hash field to DI types
3> modify DIBuilder to use hash instead of MD reference
4> related to issue 3

These can all be a single patch since it shouldn't be very large if we
go with a) above. If we go with b) then the MDNode work should be done
in isolation first and then the debug info on top of it.

What is wrong with smaller patches?
My estimation for all the above with a) is about 30K + testing cases.

5> backend change (in DwarfDebug|CompileUnit) to support types shared among compile units
   requires gdwarf-2 gdwarf-3 gdwarf-4 support for issues related to ref_addr

#5 can and should be done before the rest of them.

I prefer to submit patches according to the flow of the compiler, starting from the frontend, then IR, then backend.
The testing cases will be added for front end, llvm-link and backend.
Any reason why #5 should be done first?

All changes should be local to debug info classes except patch #4.

What's patch #4?

Patch #4 above: related to issue 3 (changes corresponding to how to solve issue #3)

-Manman

+1 to roughly everything Eric said.

The intent of this proposal is to speedup compilation of "-flto -g" for c++ programs.
This is based on discussions with Adrian, David and Eric.

---------------------------
Problem:
A single class can be used in multiple source files and the DI (Debug Info) class is included in multiple bc files. The duplication of
class definitions causes blow-up in # of MDNodes, # of DIEs, leading to large memory requirement.

As an example, SPEC xalancbmk requires 7GB of memory when compiled with -flto -g.
With a preliminary implementation of type uniquing, the memory usage will be down to 2GB.

In order to unique types, we have to break cycles in the MDNodes.

A simple struct definition
struct Base {
int a;
};
can cause cycles in MDNodes:
!12 = metadata !{i32 786451, metadata !13, null, metadata !"Base", i32 1, i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!14 = metadata !{metadata !15, metadata !16}
!15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !"a", i32 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
!16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !"Base", metadata !"Base", metadata !"", i32 1, metadata !17, i1 false, i1 false, i32 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ; [ DW_TAG_subprogram ] [line 1] [Base]

Cycles: !12 -- !14 -- !15 -- !12
           !12 -- !14 -- !16 -- !12

These cycles make it hard to unique the same struct used in two bc files.

---------------------------
How to fix:

We attach a hash value to types to help type uniquing and we also replace references to types with their hash values.
For the above struct "Base", we now have the following MDNodes:
!4 = metadata !{i32 786451, metadata !5, null, metadata !"Base", i32 1, i64 32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32 915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!6 = metadata !{metadata !7, metadata !9}
!7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !"a", i32 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
!9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !"Base", metadata !"Base", metadata !"", i32 1, metadata !10, i1 false, i1 false, i32 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ; [ DW_TAG_subprogram ] [line 1] [Base]

Note that the cycles are gone and !4 has a hash value of 915398439, and the references to !4 are replaced with 915398439.
Thanks Eric for suggesting replacing MD reference with a hash value.

There are a few issues:
1> How to generate the hash for a given type?
With C++'s ODR, it should be enough by using the context and the name for non-internal c++ types.
For internal c++ types and types of other languages, hash will not be used.

2> Make sure the streamers (bit code or ll writer) will output all required MDNodes.
For MDNodes that are only used through hash values, we have to make sure they are dumped.

should read "to make sure they are not dumped"

I meant to say make sure the MDNodes are written out to bc or ll file.

- just to clarify for
other readers: unnamed metadata is kept alive by being reference
(directly or indirectly) by named metadata. If we break the reference
by using a string identifier pair, the type metadata wouldn't exist as
it would be unreferenced in the IR sense.

Thanks for the clarification.

The solution currently is to add these MDNodes to retained types of the compile unit.

So we can just keep real metadata (rather than string/hash/thingy)
references to these nodes in a flat list to ensure they don't get
dropped.

Thanks David for the suggestion.

3> To actually access the MDNode referenced via the hash value, we need to perform a lookup from the hash value to find the corresponding MDNode.
The questions are where to store this map and how to keep it up-to-date when a MDNode is replaced.
Each hash value should have at most one corresponding definition and at most one declaration.

We need the map when building DI for verification purposes, when writing the ll file to dump the comments, at llvm linking time to update the map,
at backend when generating the actual dwarf.

My current implementation is to add a few static member functions in MDNode to profile DI nodes differently.
+ /// If the array of Vals is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the hash value and the declaration flag.
+ static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID &ID);

+ /// If the MDNode is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the hash value and the declaration flag.
+ static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);

+ /// Given a hash value and a flag, generate the profile for later lookup.
+ static bool profileDebugInfoNode(unsigned Hash, bool Declaration, FoldingSetNodeID &ID);

These static functions are called in Metadata.cpp:
void MDNode::Profile(FoldingSetNodeID &ID) const {
+ if (profileDebugInfoNode(this, ID))
+ return;
+

There are other examples of these in MDNode for handling of specific metadata.
/// Methods for metadata merging.
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);

Comments are welcome on whether this violates any layering rule.

As we've already discussed off list, both Eric & I believe this isn't
an appropriate layering & that LLVM's core IR metadata APIs cannot
reasonably have special hooks for this debug info feature.

Understand, that is why it is an option and comments from other people are welcome.

Other choices are:
> Keep a map in DwarfDebug
   Keep in mind that the map is used at many stages, and it has to be in sync with MDNodeSet.

I'm not sure what you mean by "in sync with the MDNodeSet".

When a node is created or RAUW, the map has to be updated.
MDNodeSet keeps all live MDNodes and the map keeps part of MDNodeSet.

> Generalize MDNode to be aware of hash (David can provide more details)

My suggestion here, which I think is the only reason to really discuss
this broadly (because it would be an LLVM metadata API feature) is
roughly as follows:

An LLVM IR feature, completely independent from debug info that would
work something like:
a) a blessed named MD node root that would contain string+node pairs
(alternating elements probably, for space conservation)
b) a means to add an operand to an MD node with an "indirect"
reference - this would cause the string to be used as the value (with
some flag that it's a 'special' string) but any access of the node
through the operand list would return the underlying MDNode from the
list (lookup via a map in LLVMContext, essentially).

I'm not at all deeply familiar with the LLVM IR metadata APIs - so
this is a strawman at best but an idea about how we could provide the
functionality that debug info could benefit from in an isolated, not
debug-specific, feature. If we're going to touch metadata APIs I think
something like this should be how we do it.

If anyone more familiar with these APIs & the direction we should and
shouldn't take them could clarify if this is
possible/reasonable/acceptable/the right tradeoff compared to (a) that
would be much appreciated. That's what I'd like to get out of this
thread.

(for some more context: (a) will be isolated to debug info code, but
will require (in my naive estimation) a fair amount of mechanical work
to perform manual lookups on types whenever we access them - at some
point we may have a patch that demonstrates just how much work that
is, but until then we can only make vague statements, and by the time
we do this by any substantial amount it will've been a fair bit of
work only to throw it away in favor of (b) if it looks too unweildy.)

> Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll reader|writer) to be aware of DINode
   We can provide DINode::get(…) to create a DINode. DINode can have its own Profile function.

As discussed offline, Eric & I really don't think this is acceptable.
Again, those who own/maintain/have opinions on LLVM metadata APIs can
weigh in if you think we're off-base in that assessment.

It is an option to go in the other direction, either generalize MDNode or have a special class for DINode.

- Manman

+1 to roughly everything Eric said.

The intent of this proposal is to speedup compilation of "-flto -g" for c++ programs.
This is based on discussions with Adrian, David and Eric.

---------------------------
Problem:
A single class can be used in multiple source files and the DI (Debug Info) class is included in multiple bc files. The duplication of
class definitions causes blow-up in # of MDNodes, # of DIEs, leading to large memory requirement.

As an example, SPEC xalancbmk requires 7GB of memory when compiled with -flto -g.
With a preliminary implementation of type uniquing, the memory usage will be down to 2GB.

In order to unique types, we have to break cycles in the MDNodes.

A simple struct definition
struct Base {
int a;
};
can cause cycles in MDNodes:
!12 = metadata !{i32 786451, metadata !13, null, metadata !"Base", i32 1, i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!14 = metadata !{metadata !15, metadata !16}
!15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !"a", i32 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
!16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !"Base", metadata !"Base", metadata !"", i32 1, metadata !17, i1 false, i1 false, i32 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ; [ DW_TAG_subprogram ] [line 1] [Base]

Cycles: !12 -- !14 -- !15 -- !12
           !12 -- !14 -- !16 -- !12

These cycles make it hard to unique the same struct used in two bc files.

---------------------------
How to fix:

We attach a hash value to types to help type uniquing and we also replace references to types with their hash values.
For the above struct "Base", we now have the following MDNodes:
!4 = metadata !{i32 786451, metadata !5, null, metadata !"Base", i32 1, i64 32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32 915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!6 = metadata !{metadata !7, metadata !9}
!7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !"a", i32 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
!9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !"Base", metadata !"Base", metadata !"", i32 1, metadata !10, i1 false, i1 false, i32 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ; [ DW_TAG_subprogram ] [line 1] [Base]

Note that the cycles are gone and !4 has a hash value of 915398439, and the references to !4 are replaced with 915398439.
Thanks Eric for suggesting replacing MD reference with a hash value.

There are a few issues:
1> How to generate the hash for a given type?
With C++'s ODR, it should be enough by using the context and the name for non-internal c++ types.
For internal c++ types and types of other languages, hash will not be used.

2> Make sure the streamers (bit code or ll writer) will output all required MDNodes.
For MDNodes that are only used through hash values, we have to make sure they are dumped.

should read "to make sure they are not dumped"

I meant to say make sure the MDNodes are written out to bc or ll file.

- just to clarify for
other readers: unnamed metadata is kept alive by being reference
(directly or indirectly) by named metadata. If we break the reference
by using a string identifier pair, the type metadata wouldn't exist as
it would be unreferenced in the IR sense.

Thanks for the clarification.

The solution currently is to add these MDNodes to retained types of the compile unit.

So we can just keep real metadata (rather than string/hash/thingy)
references to these nodes in a flat list to ensure they don't get
dropped.

Thanks David for the suggestion.

3> To actually access the MDNode referenced via the hash value, we need to perform a lookup from the hash value to find the corresponding MDNode.
The questions are where to store this map and how to keep it up-to-date when a MDNode is replaced.
Each hash value should have at most one corresponding definition and at most one declaration.

We need the map when building DI for verification purposes, when writing the ll file to dump the comments, at llvm linking time to update the map,
at backend when generating the actual dwarf.

My current implementation is to add a few static member functions in MDNode to profile DI nodes differently.
+ /// If the array of Vals is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the hash value and the declaration flag.
+ static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID &ID);

+ /// If the MDNode is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the hash value and the declaration flag.
+ static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);

+ /// Given a hash value and a flag, generate the profile for later lookup.
+ static bool profileDebugInfoNode(unsigned Hash, bool Declaration, FoldingSetNodeID &ID);

These static functions are called in Metadata.cpp:
void MDNode::Profile(FoldingSetNodeID &ID) const {
+ if (profileDebugInfoNode(this, ID))
+ return;
+

There are other examples of these in MDNode for handling of specific metadata.
/// Methods for metadata merging.
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);

Comments are welcome on whether this violates any layering rule.

As we've already discussed off list, both Eric & I believe this isn't
an appropriate layering & that LLVM's core IR metadata APIs cannot
reasonably have special hooks for this debug info feature.

Understand, that is why it is an option and comments from other people are welcome.

Other choices are:
> Keep a map in DwarfDebug
   Keep in mind that the map is used at many stages, and it has to be in sync with MDNodeSet.

I'm not sure what you mean by "in sync with the MDNodeSet".

When a node is created or RAUW, the map has to be updated.
MDNodeSet keeps all live MDNodes and the map keeps part of MDNodeSet.

Presumably the map would work with ValueHandles? I don't know IR well
enough & just speculating, really, but I don't see any reason that the
map should have any relation to the MDNodeSet specifically.

> Generalize MDNode to be aware of hash (David can provide more details)

My suggestion here, which I think is the only reason to really discuss
this broadly (because it would be an LLVM metadata API feature) is
roughly as follows:

An LLVM IR feature, completely independent from debug info that would
work something like:
a) a blessed named MD node root that would contain string+node pairs
(alternating elements probably, for space conservation)
b) a means to add an operand to an MD node with an "indirect"
reference - this would cause the string to be used as the value (with
some flag that it's a 'special' string) but any access of the node
through the operand list would return the underlying MDNode from the
list (lookup via a map in LLVMContext, essentially).

I'm not at all deeply familiar with the LLVM IR metadata APIs - so
this is a strawman at best but an idea about how we could provide the
functionality that debug info could benefit from in an isolated, not
debug-specific, feature. If we're going to touch metadata APIs I think
something like this should be how we do it.

If anyone more familiar with these APIs & the direction we should and
shouldn't take them could clarify if this is
possible/reasonable/acceptable/the right tradeoff compared to (a) that
would be much appreciated. That's what I'd like to get out of this
thread.

(for some more context: (a) will be isolated to debug info code, but
will require (in my naive estimation) a fair amount of mechanical work
to perform manual lookups on types whenever we access them - at some
point we may have a patch that demonstrates just how much work that
is, but until then we can only make vague statements, and by the time
we do this by any substantial amount it will've been a fair bit of
work only to throw it away in favor of (b) if it looks too unweildy.)

> Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll reader|writer) to be aware of DINode
   We can provide DINode::get(…) to create a DINode. DINode can have its own Profile function.

As discussed offline, Eric & I really don't think this is acceptable.
Again, those who own/maintain/have opinions on LLVM metadata APIs can
weigh in if you think we're off-base in that assessment.

It is an option to go in the other direction, either generalize MDNode or have a special class for DINode.

The concern is that having a special case such as what you've
described for DINode (the naming isn't important, the concept is) is
that it's not a generally usable/complete feature to add to the
metadata APIs. If we can generalize a useful standalone feature to add
to metadata support (I believe what I've described in (b) is such a
thing) then that's one thing - but to have half a feature in there is
a bit uncomfortable.

- David

Hi Manman,

The intent of this proposal is to speedup compilation of “-flto -g” for c++ programs.
This is based on discussions with Adrian, David and Eric.

Thanks for bringing this back to the list. The original thread was
getting quite long.


Problem:
A single class can be used in multiple source files and the DI (Debug Info) class is included in multiple bc files. The duplication of
class definitions causes blow-up in # of MDNodes, # of DIEs, leading to large memory requirement.

As an example, SPEC xalancbmk requires 7GB of memory when compiled with -flto -g.
With a preliminary implementation of type uniquing, the memory usage will be down to 2GB.

In order to unique types, we have to break cycles in the MDNodes.

A simple struct definition
struct Base {
int a;
};
can cause cycles in MDNodes:
!12 = metadata !{i32 786451, metadata !13, null, metadata !“Base”, i32 1, i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!14 = metadata !{metadata !15, metadata !16}
!15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !“a”, i32 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
!16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !“Base”, metadata !“Base”, metadata !“”, i32 1, metadata !17, i1 false, i1 false, i32 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ; [ DW_TAG_subprogram ] [line 1] [Base]

Cycles: !12 – !14 – !15 – !12
!12 – !14 – !16 – !12

These cycles make it hard to unique the same struct used in two bc files.


How to fix:

We attach a hash value to types to help type uniquing and we also replace references to types with their hash values.
For the above struct “Base”, we now have the following MDNodes:
!4 = metadata !{i32 786451, metadata !5, null, metadata !“Base”, i32 1, i64 32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32 915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!6 = metadata !{metadata !7, metadata !9}
!7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !“a”, i32 2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line 2, size 32, align 32, offset 0] [from int]
!9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !“Base”, metadata !“Base”, metadata !“”, i32 1, metadata !10, i1 false, i1 false, i32 0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ; [ DW_TAG_subprogram ] [line 1] [Base]

Note that the cycles are gone and !4 has a hash value of 915398439, and the references to !4 are replaced with 915398439.
Thanks Eric for suggesting replacing MD reference with a hash value.

In particular I recommended this:

a) For C++ odr replace it with the “hash” that’s just a string
representing the type name.
b) For Internal C++ types and all C types replace it with a string
that’s a concatenation of the type name and the name of the compile
unit.

Yes, that is what we agreed on over email.

There are a few issues:
1> How to generate the hash for a given type?
With C++'s ODR, it should be enough by using the context and the name for non-internal c++ types.
For internal c++ types and types of other languages, hash will not be used.

Explain this?

For a while, I am going to support both hash and MD reference, once everything is working with hash,
I will update all debug info testing cases, turn -gtype-uniquing on, and remove the other path.

For internal c++ types, initially they will follow the path of using MD references without a hash.

My current implementation is to add a few static member functions in MDNode to profile DI nodes differently.

  • /// If the array of Vals is for debug info, profile it specially and return true.

  • /// If the DI node has a hash value, generate the profile using only the hash value and the declaration flag.

  • static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID &ID);

  • /// If the MDNode is for debug info, profile it specially and return true.

  • /// If the DI node has a hash value, generate the profile using only the hash value and the declaration flag.

  • static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);

  • /// Given a hash value and a flag, generate the profile for later lookup.

  • static bool profileDebugInfoNode(unsigned Hash, bool Declaration, FoldingSetNodeID &ID);

These static functions are called in Metadata.cpp:
void MDNode::Profile(FoldingSetNodeID &ID) const {

  • if (profileDebugInfoNode(this, ID))
  • return;

There are other examples of these in MDNode for handling of specific metadata.
/// Methods for metadata merging.
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);

Comments are welcome on whether this violates any layering rule.

As I’ve said many times in email, I don’t think this is a good idea
and would prefer either a or b below. a) is a much simpler solution.

Any reason that why it is not a good idea?

Other choices are:

Keep a map in DwarfDebug
Keep in mind that the map is used at many stages, and it has to be in sync with MDNodeSet.
Generalize MDNode to be aware of hash (David can provide more details)
Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll reader|writer) to be aware of DINode
We can provide DINode::get(…) to create a DINode. DINode can have its own Profile function.
Other suggestions are welcome.

a or b please.

Option a> will require a DwarfDebug pointer in every stage of the compiler, and passing the map to the DI classes.
A rough estimation is around 100 places.
Is it reasonable to pass a DwarfDebug pointer to DIBuilder and llvm linker?
Also the map needs to be in sync with MDNodeSet, maybe using ValueHandle can solve the problem.

What about putting the map in LLVMContextImpl?
It already has a few things specifically for debug info:
std::vector ScopeRecords;
DenseMap<std::pair<MDNode*, MDNode*>, int> ScopeInlinedAt;

I remember David mentioned it once and I forgot about the conclusion.

Thanks,
Manman

Hi Manman,

The intent of this proposal is to speedup compilation of "-flto -g" for c++
programs.
This is based on discussions with Adrian, David and Eric.

Thanks for bringing this back to the list. The original thread was
getting quite long.

---------------------------
Problem:
A single class can be used in multiple source files and the DI (Debug Info)
class is included in multiple bc files. The duplication of
class definitions causes blow-up in # of MDNodes, # of DIEs, leading to
large memory requirement.

As an example, SPEC xalancbmk requires 7GB of memory when compiled with
-flto -g.
With a preliminary implementation of type uniquing, the memory usage will be
down to 2GB.

In order to unique types, we have to break cycles in the MDNodes.

A simple struct definition
struct Base {
int a;
};
can cause cycles in MDNodes:
!12 = metadata !{i32 786451, metadata !13, null, metadata !"Base", i32 1,
i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [
DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!14 = metadata !{metadata !15, metadata !16}
!15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !"a", i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !"Base",
metadata !"Base", metadata !"", i32 1, metadata !17, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Cycles: !12 -- !14 -- !15 -- !12
          !12 -- !14 -- !16 -- !12

These cycles make it hard to unique the same struct used in two bc files.

---------------------------
How to fix:

We attach a hash value to types to help type uniquing and we also replace
references to types with their hash values.
For the above struct "Base", we now have the following MDNodes:
!4 = metadata !{i32 786451, metadata !5, null, metadata !"Base", i32 1, i64
32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32
915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32,
offset 0] [from ]
!6 = metadata !{metadata !7, metadata !9}
!7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !"a", i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !"Base",
metadata !"Base", metadata !"", i32 1, metadata !10, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Note that the cycles are gone and !4 has a hash value of 915398439, and the
references to !4 are replaced with 915398439.
Thanks Eric for suggesting replacing MD reference with a hash value.

In particular I recommended this:

a) For C++ odr replace it with the "hash" that's just a string
representing the type name.
b) For Internal C++ types and all C types replace it with a string
that's a concatenation of the type name and the name of the compile
unit.

Yes, that is what we agreed on over email.

There are a few issues:
1> How to generate the hash for a given type?
With C++'s ODR, it should be enough by using the context and the name for
non-internal c++ types.
For internal c++ types and types of other languages, hash will not be used.

Explain this?

For a while, I am going to support both hash and MD reference, once
everything is working with hash,
I will update all debug info testing cases, turn -gtype-uniquing on, and
remove the other path.

For internal c++ types, initially they will follow the path of using MD
references without a hash.

My current implementation is to add a few static member functions in MDNode
to profile DI nodes differently.
+ /// If the array of Vals is for debug info, profile it specially and
return true.
+ /// If the DI node has a hash value, generate the profile using only the
hash value and the declaration flag.
+ static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID
&ID);

+ /// If the MDNode is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the
hash value and the declaration flag.
+ static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);

+ /// Given a hash value and a flag, generate the profile for later lookup.
+ static bool profileDebugInfoNode(unsigned Hash, bool Declaration,
FoldingSetNodeID &ID);

These static functions are called in Metadata.cpp:
void MDNode::Profile(FoldingSetNodeID &ID) const {
+ if (profileDebugInfoNode(this, ID))
+ return;
+

There are other examples of these in MDNode for handling of specific
metadata.
/// Methods for metadata merging.
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);

Comments are welcome on whether this violates any layering rule.

As I've said many times in email, I don't think this is a good idea
and would prefer either a or b below. a) is a much simpler solution.

Any reason that why it is not a good idea?

Other choices are:
> Keep a map in DwarfDebug
  Keep in mind that the map is used at many stages, and it has to be in sync
with MDNodeSet.
> Generalize MDNode to be aware of hash (David can provide more details)
> Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll
reader>writer) to be aware of DINode
  We can provide DINode::get(…) to create a DINode. DINode can have its own
Profile function.
Other suggestions are welcome.

a or b please.

Option a> will require a DwarfDebug pointer in every stage of the compiler,
and passing the map to the DI classes.
A rough estimation is around 100 places.
Is it reasonable to pass a DwarfDebug pointer to DIBuilder and llvm linker?
Also the map needs to be in sync with MDNodeSet, maybe using ValueHandle can
solve the problem.

What about putting the map in LLVMContextImpl?
It already has a few things specifically for debug info:
std::vector<DebugRecVH> ScopeRecords;
DenseMap<std::pair<MDNode*, MDNode*>, int> ScopeInlinedAt;

I remember David mentioned it once and I forgot about the conclusion.

I mentioned it only as speculation as to how you were implementing it
already (but you were doing the profile-changing stuff).

I don't think it should be necessary to have the map (in option (a))
in such a central location as LLVMContext. It should be usable just
from DwarfDebug for generation, and DIBuilder can have its own,
separate map to do similar things during DI building.

Hi Manman,

The intent of this proposal is to speedup compilation of "-flto -g" for c++
programs.
This is based on discussions with Adrian, David and Eric.

Thanks for bringing this back to the list. The original thread was
getting quite long.

---------------------------
Problem:
A single class can be used in multiple source files and the DI (Debug Info)
class is included in multiple bc files. The duplication of
class definitions causes blow-up in # of MDNodes, # of DIEs, leading to
large memory requirement.

As an example, SPEC xalancbmk requires 7GB of memory when compiled with
-flto -g.
With a preliminary implementation of type uniquing, the memory usage will be
down to 2GB.

In order to unique types, we have to break cycles in the MDNodes.

A simple struct definition
struct Base {
int a;
};
can cause cycles in MDNodes:
!12 = metadata !{i32 786451, metadata !13, null, metadata !"Base", i32 1,
i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [
DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!14 = metadata !{metadata !15, metadata !16}
!15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !"a", i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !"Base",
metadata !"Base", metadata !"", i32 1, metadata !17, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Cycles: !12 -- !14 -- !15 -- !12
         !12 -- !14 -- !16 -- !12

These cycles make it hard to unique the same struct used in two bc files.

---------------------------
How to fix:

We attach a hash value to types to help type uniquing and we also replace
references to types with their hash values.
For the above struct "Base", we now have the following MDNodes:
!4 = metadata !{i32 786451, metadata !5, null, metadata !"Base", i32 1, i64
32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32
915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32,
offset 0] [from ]
!6 = metadata !{metadata !7, metadata !9}
!7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !"a", i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !"Base",
metadata !"Base", metadata !"", i32 1, metadata !10, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Note that the cycles are gone and !4 has a hash value of 915398439, and the
references to !4 are replaced with 915398439.
Thanks Eric for suggesting replacing MD reference with a hash value.

In particular I recommended this:

a) For C++ odr replace it with the "hash" that's just a string
representing the type name.
b) For Internal C++ types and all C types replace it with a string
that's a concatenation of the type name and the name of the compile
unit.

Yes, that is what we agreed on over email.

There are a few issues:
1> How to generate the hash for a given type?
With C++'s ODR, it should be enough by using the context and the name for
non-internal c++ types.
For internal c++ types and types of other languages, hash will not be used.

Explain this?

For a while, I am going to support both hash and MD reference, once
everything is working with hash,
I will update all debug info testing cases, turn -gtype-uniquing on, and
remove the other path.

For internal c++ types, initially they will follow the path of using MD
references without a hash.

My current implementation is to add a few static member functions in MDNode
to profile DI nodes differently.
+ /// If the array of Vals is for debug info, profile it specially and
return true.
+ /// If the DI node has a hash value, generate the profile using only the
hash value and the declaration flag.
+ static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID
&ID);

+ /// If the MDNode is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the
hash value and the declaration flag.
+ static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);

+ /// Given a hash value and a flag, generate the profile for later lookup.
+ static bool profileDebugInfoNode(unsigned Hash, bool Declaration,
FoldingSetNodeID &ID);

These static functions are called in Metadata.cpp:
void MDNode::Profile(FoldingSetNodeID &ID) const {
+ if (profileDebugInfoNode(this, ID))
+ return;
+

There are other examples of these in MDNode for handling of specific
metadata.
/// Methods for metadata merging.
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);

Comments are welcome on whether this violates any layering rule.

As I've said many times in email, I don't think this is a good idea
and would prefer either a or b below. a) is a much simpler solution.

Any reason that why it is not a good idea?

Other choices are:
> Keep a map in DwarfDebug
Keep in mind that the map is used at many stages, and it has to be in sync
with MDNodeSet.
> Generalize MDNode to be aware of hash (David can provide more details)
> Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll
reader>writer) to be aware of DINode
We can provide DINode::get(…) to create a DINode. DINode can have its own
Profile function.
Other suggestions are welcome.

a or b please.

Option a> will require a DwarfDebug pointer in every stage of the compiler,
and passing the map to the DI classes.
A rough estimation is around 100 places.
Is it reasonable to pass a DwarfDebug pointer to DIBuilder and llvm linker?
Also the map needs to be in sync with MDNodeSet, maybe using ValueHandle can
solve the problem.

What about putting the map in LLVMContextImpl?
It already has a few things specifically for debug info:
std::vector<DebugRecVH> ScopeRecords;
DenseMap<std::pair<MDNode*, MDNode*>, int> ScopeInlinedAt;

I remember David mentioned it once and I forgot about the conclusion.

I mentioned it only as speculation as to how you were implementing it
already (but you were doing the profile-changing stuff).

I don't think it should be necessary to have the map (in option (a))
in such a central location as LLVMContext. It should be usable just
from DwarfDebug for generation, and DIBuilder can have its own,
separate map to do similar things during DI building.

We also need the map during llvm linking since linking will create new MDNodes.
So any other opinion on putting it in LLVMContext other than it being central?

Thanks,
Manman

Hi Manman,

The intent of this proposal is to speedup compilation of "-flto -g" for c++
programs.
This is based on discussions with Adrian, David and Eric.

Thanks for bringing this back to the list. The original thread was
getting quite long.

---------------------------
Problem:
A single class can be used in multiple source files and the DI (Debug Info)
class is included in multiple bc files. The duplication of
class definitions causes blow-up in # of MDNodes, # of DIEs, leading to
large memory requirement.

As an example, SPEC xalancbmk requires 7GB of memory when compiled with
-flto -g.
With a preliminary implementation of type uniquing, the memory usage will be
down to 2GB.

In order to unique types, we have to break cycles in the MDNodes.

A simple struct definition
struct Base {
int a;
};
can cause cycles in MDNodes:
!12 = metadata !{i32 786451, metadata !13, null, metadata !"Base", i32 1,
i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [
DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!14 = metadata !{metadata !15, metadata !16}
!15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !"a", i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !"Base",
metadata !"Base", metadata !"", i32 1, metadata !17, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Cycles: !12 -- !14 -- !15 -- !12
         !12 -- !14 -- !16 -- !12

These cycles make it hard to unique the same struct used in two bc files.

---------------------------
How to fix:

We attach a hash value to types to help type uniquing and we also replace
references to types with their hash values.
For the above struct "Base", we now have the following MDNodes:
!4 = metadata !{i32 786451, metadata !5, null, metadata !"Base", i32 1, i64
32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32
915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32,
offset 0] [from ]
!6 = metadata !{metadata !7, metadata !9}
!7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !"a", i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !"Base",
metadata !"Base", metadata !"", i32 1, metadata !10, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Note that the cycles are gone and !4 has a hash value of 915398439, and the
references to !4 are replaced with 915398439.
Thanks Eric for suggesting replacing MD reference with a hash value.

In particular I recommended this:

a) For C++ odr replace it with the "hash" that's just a string
representing the type name.
b) For Internal C++ types and all C types replace it with a string
that's a concatenation of the type name and the name of the compile
unit.

Yes, that is what we agreed on over email.

There are a few issues:
1> How to generate the hash for a given type?
With C++'s ODR, it should be enough by using the context and the name for
non-internal c++ types.
For internal c++ types and types of other languages, hash will not be used.

Explain this?

For a while, I am going to support both hash and MD reference, once
everything is working with hash,
I will update all debug info testing cases, turn -gtype-uniquing on, and
remove the other path.

For internal c++ types, initially they will follow the path of using MD
references without a hash.

My current implementation is to add a few static member functions in MDNode
to profile DI nodes differently.
+ /// If the array of Vals is for debug info, profile it specially and
return true.
+ /// If the DI node has a hash value, generate the profile using only the
hash value and the declaration flag.
+ static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID
&ID);

+ /// If the MDNode is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the
hash value and the declaration flag.
+ static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);

+ /// Given a hash value and a flag, generate the profile for later lookup.
+ static bool profileDebugInfoNode(unsigned Hash, bool Declaration,
FoldingSetNodeID &ID);

These static functions are called in Metadata.cpp:
void MDNode::Profile(FoldingSetNodeID &ID) const {
+ if (profileDebugInfoNode(this, ID))
+ return;
+

There are other examples of these in MDNode for handling of specific
metadata.
/// Methods for metadata merging.
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);

Comments are welcome on whether this violates any layering rule.

As I've said many times in email, I don't think this is a good idea
and would prefer either a or b below. a) is a much simpler solution.

Any reason that why it is not a good idea?

Other choices are:
> Keep a map in DwarfDebug
Keep in mind that the map is used at many stages, and it has to be in sync
with MDNodeSet.
> Generalize MDNode to be aware of hash (David can provide more details)
> Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll
reader>writer) to be aware of DINode
We can provide DINode::get(…) to create a DINode. DINode can have its own
Profile function.
Other suggestions are welcome.

a or b please.

Option a> will require a DwarfDebug pointer in every stage of the compiler,
and passing the map to the DI classes.
A rough estimation is around 100 places.
Is it reasonable to pass a DwarfDebug pointer to DIBuilder and llvm linker?
Also the map needs to be in sync with MDNodeSet, maybe using ValueHandle can
solve the problem.

What about putting the map in LLVMContextImpl?
It already has a few things specifically for debug info:
std::vector<DebugRecVH> ScopeRecords;
DenseMap<std::pair<MDNode*, MDNode*>, int> ScopeInlinedAt;

I remember David mentioned it once and I forgot about the conclusion.

I mentioned it only as speculation as to how you were implementing it
already (but you were doing the profile-changing stuff).

I don't think it should be necessary to have the map (in option (a))
in such a central location as LLVMContext. It should be usable just
from DwarfDebug for generation, and DIBuilder can have its own,
separate map to do similar things during DI building.

We also need the map during llvm linking since linking will create new MDNodes.

I don't understand what you mean by this. IR linking shouldn't need to
do anything debug-info-specific, it should just be the normal IR
approach.

The declaration-v-definition resolution can be done at codegen-time.
We'll have to walk all the lists of retained types anyway to build the
map, so we can do declaration-v-definition (keep definitions over
declarations when we see both) at that point.

Hi Manman,

The intent of this proposal is to speedup compilation of “-flto -g” for c++
programs.
This is based on discussions with Adrian, David and Eric.

Thanks for bringing this back to the list. The original thread was
getting quite long.


Problem:
A single class can be used in multiple source files and the DI (Debug Info)
class is included in multiple bc files. The duplication of
class definitions causes blow-up in # of MDNodes, # of DIEs, leading to
large memory requirement.

As an example, SPEC xalancbmk requires 7GB of memory when compiled with
-flto -g.
With a preliminary implementation of type uniquing, the memory usage will be
down to 2GB.

In order to unique types, we have to break cycles in the MDNodes.

A simple struct definition
struct Base {
int a;
};
can cause cycles in MDNodes:
!12 = metadata !{i32 786451, metadata !13, null, metadata !“Base”, i32 1,
i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [
DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!14 = metadata !{metadata !15, metadata !16}
!15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !“a”, i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !“Base”,
metadata !“Base”, metadata !“”, i32 1, metadata !17, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Cycles: !12 – !14 – !15 – !12
!12 – !14 – !16 – !12

These cycles make it hard to unique the same struct used in two bc files.


How to fix:

We attach a hash value to types to help type uniquing and we also replace
references to types with their hash values.
For the above struct “Base”, we now have the following MDNodes:
!4 = metadata !{i32 786451, metadata !5, null, metadata !“Base”, i32 1, i64
32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32
915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32,
offset 0] [from ]
!6 = metadata !{metadata !7, metadata !9}
!7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !“a”, i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !“Base”,
metadata !“Base”, metadata !“”, i32 1, metadata !10, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Note that the cycles are gone and !4 has a hash value of 915398439, and the
references to !4 are replaced with 915398439.
Thanks Eric for suggesting replacing MD reference with a hash value.

In particular I recommended this:

a) For C++ odr replace it with the “hash” that’s just a string
representing the type name.
b) For Internal C++ types and all C types replace it with a string
that’s a concatenation of the type name and the name of the compile
unit.

Yes, that is what we agreed on over email.

There are a few issues:
1> How to generate the hash for a given type?
With C++'s ODR, it should be enough by using the context and the name for
non-internal c++ types.
For internal c++ types and types of other languages, hash will not be used.

Explain this?

For a while, I am going to support both hash and MD reference, once
everything is working with hash,
I will update all debug info testing cases, turn -gtype-uniquing on, and
remove the other path.

For internal c++ types, initially they will follow the path of using MD
references without a hash.

My current implementation is to add a few static member functions in MDNode
to profile DI nodes differently.

  • /// If the array of Vals is for debug info, profile it specially and
    return true.

  • /// If the DI node has a hash value, generate the profile using only the
    hash value and the declaration flag.

  • static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID
    &ID);

  • /// If the MDNode is for debug info, profile it specially and return true.

  • /// If the DI node has a hash value, generate the profile using only the
    hash value and the declaration flag.

  • static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);

  • /// Given a hash value and a flag, generate the profile for later lookup.

  • static bool profileDebugInfoNode(unsigned Hash, bool Declaration,
    FoldingSetNodeID &ID);

These static functions are called in Metadata.cpp:
void MDNode::Profile(FoldingSetNodeID &ID) const {

  • if (profileDebugInfoNode(this, ID))
  • return;

There are other examples of these in MDNode for handling of specific
metadata.
/// Methods for metadata merging.
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);

Comments are welcome on whether this violates any layering rule.

As I’ve said many times in email, I don’t think this is a good idea
and would prefer either a or b below. a) is a much simpler solution.

Any reason that why it is not a good idea?

Other choices are:

Keep a map in DwarfDebug
Keep in mind that the map is used at many stages, and it has to be in sync
with MDNodeSet.
Generalize MDNode to be aware of hash (David can provide more details)
Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll
reader>writer) to be aware of DINode
We can provide DINode::get(…) to create a DINode. DINode can have its own
Profile function.
Other suggestions are welcome.

a or b please.

Option a> will require a DwarfDebug pointer in every stage of the compiler,
and passing the map to the DI classes.
A rough estimation is around 100 places.
Is it reasonable to pass a DwarfDebug pointer to DIBuilder and llvm linker?
Also the map needs to be in sync with MDNodeSet, maybe using ValueHandle can
solve the problem.

What about putting the map in LLVMContextImpl?
It already has a few things specifically for debug info:
std::vector ScopeRecords;
DenseMap<std::pair<MDNode*, MDNode*>, int> ScopeInlinedAt;

I remember David mentioned it once and I forgot about the conclusion.

I mentioned it only as speculation as to how you were implementing it
already (but you were doing the profile-changing stuff).

I don’t think it should be necessary to have the map (in option (a))
in such a central location as LLVMContext. It should be usable just
from DwarfDebug for generation, and DIBuilder can have its own,
separate map to do similar things during DI building.

We also need the map during llvm linking since linking will create new MDNodes.

I don’t understand what you mean by this. IR linking shouldn’t need to
do anything debug-info-specific, it should just be the normal IR
approach.

The declaration-v-definition resolution can be done at codegen-time.
We’ll have to walk all the lists of retained types anyway to build the
map, so we can do declaration-v-definition (keep definitions over
declarations when we see both) at that point.

Yes, you are right that at codegen-time, we can generate the map from the lists
of retained types.

But dumping the linked ll file requires the map when outputting comments of the MDNode :]

Thanks,
Manman

Hi Manman,

The intent of this proposal is to speedup compilation of "-flto -g" for c++
programs.
This is based on discussions with Adrian, David and Eric.

Thanks for bringing this back to the list. The original thread was
getting quite long.

---------------------------
Problem:
A single class can be used in multiple source files and the DI (Debug Info)
class is included in multiple bc files. The duplication of
class definitions causes blow-up in # of MDNodes, # of DIEs, leading to
large memory requirement.

As an example, SPEC xalancbmk requires 7GB of memory when compiled with
-flto -g.
With a preliminary implementation of type uniquing, the memory usage will be
down to 2GB.

In order to unique types, we have to break cycles in the MDNodes.

A simple struct definition
struct Base {
int a;
};
can cause cycles in MDNodes:
!12 = metadata !{i32 786451, metadata !13, null, metadata !"Base", i32 1,
i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [
DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!14 = metadata !{metadata !15, metadata !16}
!15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !"a", i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !"Base",
metadata !"Base", metadata !"", i32 1, metadata !17, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Cycles: !12 -- !14 -- !15 -- !12
        !12 -- !14 -- !16 -- !12

These cycles make it hard to unique the same struct used in two bc files.

---------------------------
How to fix:

We attach a hash value to types to help type uniquing and we also replace
references to types with their hash values.
For the above struct "Base", we now have the following MDNodes:
!4 = metadata !{i32 786451, metadata !5, null, metadata !"Base", i32 1, i64
32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32
915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32,
offset 0] [from ]
!6 = metadata !{metadata !7, metadata !9}
!7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !"a", i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !"Base",
metadata !"Base", metadata !"", i32 1, metadata !10, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Note that the cycles are gone and !4 has a hash value of 915398439, and the
references to !4 are replaced with 915398439.
Thanks Eric for suggesting replacing MD reference with a hash value.

In particular I recommended this:

a) For C++ odr replace it with the "hash" that's just a string
representing the type name.
b) For Internal C++ types and all C types replace it with a string
that's a concatenation of the type name and the name of the compile
unit.

Yes, that is what we agreed on over email.

There are a few issues:
1> How to generate the hash for a given type?
With C++'s ODR, it should be enough by using the context and the name for
non-internal c++ types.
For internal c++ types and types of other languages, hash will not be used.

Explain this?

For a while, I am going to support both hash and MD reference, once
everything is working with hash,
I will update all debug info testing cases, turn -gtype-uniquing on, and
remove the other path.

For internal c++ types, initially they will follow the path of using MD
references without a hash.

My current implementation is to add a few static member functions in MDNode
to profile DI nodes differently.
+ /// If the array of Vals is for debug info, profile it specially and
return true.
+ /// If the DI node has a hash value, generate the profile using only the
hash value and the declaration flag.
+ static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID
&ID);

+ /// If the MDNode is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the
hash value and the declaration flag.
+ static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);

+ /// Given a hash value and a flag, generate the profile for later lookup.
+ static bool profileDebugInfoNode(unsigned Hash, bool Declaration,
FoldingSetNodeID &ID);

These static functions are called in Metadata.cpp:
void MDNode::Profile(FoldingSetNodeID &ID) const {
+ if (profileDebugInfoNode(this, ID))
+ return;
+

There are other examples of these in MDNode for handling of specific
metadata.
/// Methods for metadata merging.
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);

Comments are welcome on whether this violates any layering rule.

As I've said many times in email, I don't think this is a good idea
and would prefer either a or b below. a) is a much simpler solution.

Any reason that why it is not a good idea?

Other choices are:
> Keep a map in DwarfDebug
Keep in mind that the map is used at many stages, and it has to be in sync
with MDNodeSet.
> Generalize MDNode to be aware of hash (David can provide more details)
> Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll
reader>writer) to be aware of DINode
We can provide DINode::get(…) to create a DINode. DINode can have its own
Profile function.
Other suggestions are welcome.

a or b please.

Option a> will require a DwarfDebug pointer in every stage of the compiler,
and passing the map to the DI classes.
A rough estimation is around 100 places.
Is it reasonable to pass a DwarfDebug pointer to DIBuilder and llvm linker?
Also the map needs to be in sync with MDNodeSet, maybe using ValueHandle can
solve the problem.

What about putting the map in LLVMContextImpl?
It already has a few things specifically for debug info:
std::vector<DebugRecVH> ScopeRecords;
DenseMap<std::pair<MDNode*, MDNode*>, int> ScopeInlinedAt;

I remember David mentioned it once and I forgot about the conclusion.

I mentioned it only as speculation as to how you were implementing it
already (but you were doing the profile-changing stuff).

I don't think it should be necessary to have the map (in option (a))
in such a central location as LLVMContext. It should be usable just
from DwarfDebug for generation, and DIBuilder can have its own,
separate map to do similar things during DI building.

We also need the map during llvm linking since linking will create new
MDNodes.

I don't understand what you mean by this. IR linking shouldn't need to
do anything debug-info-specific, it should just be the normal IR
approach.

The declaration-v-definition resolution can be done at codegen-time.
We'll have to walk all the lists of retained types anyway to build the
map, so we can do declaration-v-definition (keep definitions over
declarations when we see both) at that point.

Yes, you are right that at codegen-time, we can generate the map from the
lists
of retained types.

But dumping the linked ll file requires the map when outputting comments of
the MDNode :]

Depending on which things we print out, but yes, in some cases
(derived types) we do print out the type referenced. I assume the
AsmPrinter can build such a map, then. (in fact, with a few clients
like this, it might be nice to build a bit of an abstraction around it
rather than just using a raw map - something that has a ctor (or I
suppose it could be a factory function) that reads in the right
metadata, walks the compile units, etc, and builds the mapping)

Hi Manman,

The intent of this proposal is to speedup compilation of "-flto -g" for c++
programs.
This is based on discussions with Adrian, David and Eric.

Thanks for bringing this back to the list. The original thread was
getting quite long.

---------------------------
Problem:
A single class can be used in multiple source files and the DI (Debug Info)
class is included in multiple bc files. The duplication of
class definitions causes blow-up in # of MDNodes, # of DIEs, leading to
large memory requirement.

As an example, SPEC xalancbmk requires 7GB of memory when compiled with
-flto -g.
With a preliminary implementation of type uniquing, the memory usage will be
down to 2GB.

In order to unique types, we have to break cycles in the MDNodes.

A simple struct definition
struct Base {
int a;
};
can cause cycles in MDNodes:
!12 = metadata !{i32 786451, metadata !13, null, metadata !"Base", i32 1,
i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [
DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!14 = metadata !{metadata !15, metadata !16}
!15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !"a", i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !"Base",
metadata !"Base", metadata !"", i32 1, metadata !17, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Cycles: !12 -- !14 -- !15 -- !12
       !12 -- !14 -- !16 -- !12

These cycles make it hard to unique the same struct used in two bc files.

---------------------------
How to fix:

We attach a hash value to types to help type uniquing and we also replace
references to types with their hash values.
For the above struct "Base", we now have the following MDNodes:
!4 = metadata !{i32 786451, metadata !5, null, metadata !"Base", i32 1, i64
32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32
915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32,
offset 0] [from ]
!6 = metadata !{metadata !7, metadata !9}
!7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !"a", i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !"Base",
metadata !"Base", metadata !"", i32 1, metadata !10, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Note that the cycles are gone and !4 has a hash value of 915398439, and the
references to !4 are replaced with 915398439.
Thanks Eric for suggesting replacing MD reference with a hash value.

In particular I recommended this:

a) For C++ odr replace it with the "hash" that's just a string
representing the type name.
b) For Internal C++ types and all C types replace it with a string
that's a concatenation of the type name and the name of the compile
unit.

Yes, that is what we agreed on over email.

There are a few issues:
1> How to generate the hash for a given type?
With C++'s ODR, it should be enough by using the context and the name for
non-internal c++ types.
For internal c++ types and types of other languages, hash will not be used.

Explain this?

For a while, I am going to support both hash and MD reference, once
everything is working with hash,
I will update all debug info testing cases, turn -gtype-uniquing on, and
remove the other path.

For internal c++ types, initially they will follow the path of using MD
references without a hash.

My current implementation is to add a few static member functions in MDNode
to profile DI nodes differently.
+ /// If the array of Vals is for debug info, profile it specially and
return true.
+ /// If the DI node has a hash value, generate the profile using only the
hash value and the declaration flag.
+ static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID
&ID);

+ /// If the MDNode is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the
hash value and the declaration flag.
+ static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);

+ /// Given a hash value and a flag, generate the profile for later lookup.
+ static bool profileDebugInfoNode(unsigned Hash, bool Declaration,
FoldingSetNodeID &ID);

These static functions are called in Metadata.cpp:
void MDNode::Profile(FoldingSetNodeID &ID) const {
+ if (profileDebugInfoNode(this, ID))
+ return;
+

There are other examples of these in MDNode for handling of specific
metadata.
/// Methods for metadata merging.
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);

Comments are welcome on whether this violates any layering rule.

As I've said many times in email, I don't think this is a good idea
and would prefer either a or b below. a) is a much simpler solution.

Any reason that why it is not a good idea?

Other choices are:
> Keep a map in DwarfDebug
Keep in mind that the map is used at many stages, and it has to be in sync
with MDNodeSet.
> Generalize MDNode to be aware of hash (David can provide more details)
> Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll
reader>writer) to be aware of DINode
We can provide DINode::get(…) to create a DINode. DINode can have its own
Profile function.
Other suggestions are welcome.

a or b please.

Option a> will require a DwarfDebug pointer in every stage of the compiler,
and passing the map to the DI classes.
A rough estimation is around 100 places.
Is it reasonable to pass a DwarfDebug pointer to DIBuilder and llvm linker?
Also the map needs to be in sync with MDNodeSet, maybe using ValueHandle can
solve the problem.

What about putting the map in LLVMContextImpl?
It already has a few things specifically for debug info:
std::vector<DebugRecVH> ScopeRecords;
DenseMap<std::pair<MDNode*, MDNode*>, int> ScopeInlinedAt;

I remember David mentioned it once and I forgot about the conclusion.

I mentioned it only as speculation as to how you were implementing it
already (but you were doing the profile-changing stuff).

I don't think it should be necessary to have the map (in option (a))
in such a central location as LLVMContext. It should be usable just
from DwarfDebug for generation, and DIBuilder can have its own,
separate map to do similar things during DI building.

We also need the map during llvm linking since linking will create new
MDNodes.

I don't understand what you mean by this. IR linking shouldn't need to
do anything debug-info-specific, it should just be the normal IR
approach.

The declaration-v-definition resolution can be done at codegen-time.
We'll have to walk all the lists of retained types anyway to build the
map, so we can do declaration-v-definition (keep definitions over
declarations when we see both) at that point.

Yes, you are right that at codegen-time, we can generate the map from the
lists
of retained types.

But dumping the linked ll file requires the map when outputting comments of
the MDNode :]

Depending on which things we print out, but yes, in some cases
(derived types) we do print out the type referenced. I assume the
AsmPrinter can build such a map, then. (in fact, with a few clients
like this, it might be nice to build a bit of an abstraction around it
rather than just using a raw map - something that has a ctor (or I
suppose it could be a factory function) that reads in the right
metadata, walks the compile units, etc, and builds the mapping)

A call to MDNode::dump() will require the map to print out the comments.
To continue supporting dump(), we need to have the map at almost all times.

Thanks,
Manman

Hi Manman,

The intent of this proposal is to speedup compilation of "-flto -g" for c++
programs.
This is based on discussions with Adrian, David and Eric.

Thanks for bringing this back to the list. The original thread was
getting quite long.

---------------------------
Problem:
A single class can be used in multiple source files and the DI (Debug Info)
class is included in multiple bc files. The duplication of
class definitions causes blow-up in # of MDNodes, # of DIEs, leading to
large memory requirement.

As an example, SPEC xalancbmk requires 7GB of memory when compiled with
-flto -g.
With a preliminary implementation of type uniquing, the memory usage will be
down to 2GB.

In order to unique types, we have to break cycles in the MDNodes.

A simple struct definition
struct Base {
int a;
};
can cause cycles in MDNodes:
!12 = metadata !{i32 786451, metadata !13, null, metadata !"Base", i32 1,
i64 32, i64 32, i32 0, i32 0, null, metadata !14, i32 0, null, null} ; [
DW_TAG_structure_type ] [Base] [line 1, size 32, align 32, offset 0] [from ]
!14 = metadata !{metadata !15, metadata !16}
!15 = metadata !{i32 786445, metadata !13, metadata !12, metadata !"a", i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!16 = metadata !{i32 786478, metadata !13, metadata !12, metadata !"Base",
metadata !"Base", metadata !"", i32 1, metadata !17, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !20, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Cycles: !12 -- !14 -- !15 -- !12
       !12 -- !14 -- !16 -- !12

These cycles make it hard to unique the same struct used in two bc files.

---------------------------
How to fix:

We attach a hash value to types to help type uniquing and we also replace
references to types with their hash values.
For the above struct "Base", we now have the following MDNodes:
!4 = metadata !{i32 786451, metadata !5, null, metadata !"Base", i32 1, i64
32, i64 32, i32 0, i32 0, null, metadata !6, i32 0, i32 0, null, i32
915398439} ; [ DW_TAG_structure_type ] [Base] [line 1, size 32, align 32,
offset 0] [from ]
!6 = metadata !{metadata !7, metadata !9}
!7 = metadata !{i32 786445, metadata !5, i32 915398439, metadata !"a", i32
2, i64 32, i64 32, i64 0, i32 0, metadata !8} ; [ DW_TAG_member ] [a] [line
2, size 32, align 32, offset 0] [from int]
!9 = metadata !{i32 786478, metadata !5, i32 915398439, metadata !"Base",
metadata !"Base", metadata !"", i32 1, metadata !10, i1 false, i1 false, i32
0, i32 0, null, i32 320, i1 false, null, null, i32 0, metadata !13, i32 1} ;
[ DW_TAG_subprogram ] [line 1] [Base]

Note that the cycles are gone and !4 has a hash value of 915398439, and the
references to !4 are replaced with 915398439.
Thanks Eric for suggesting replacing MD reference with a hash value.

In particular I recommended this:

a) For C++ odr replace it with the "hash" that's just a string
representing the type name.
b) For Internal C++ types and all C types replace it with a string
that's a concatenation of the type name and the name of the compile
unit.

Yes, that is what we agreed on over email.

There are a few issues:
1> How to generate the hash for a given type?
With C++'s ODR, it should be enough by using the context and the name for
non-internal c++ types.
For internal c++ types and types of other languages, hash will not be used.

Explain this?

For a while, I am going to support both hash and MD reference, once
everything is working with hash,
I will update all debug info testing cases, turn -gtype-uniquing on, and
remove the other path.

For internal c++ types, initially they will follow the path of using MD
references without a hash.

My current implementation is to add a few static member functions in MDNode
to profile DI nodes differently.
+ /// If the array of Vals is for debug info, profile it specially and
return true.
+ /// If the DI node has a hash value, generate the profile using only the
hash value and the declaration flag.
+ static bool profileDebugInfoNode(ArrayRef<Value*> Vals, FoldingSetNodeID
&ID);

+ /// If the MDNode is for debug info, profile it specially and return true.
+ /// If the DI node has a hash value, generate the profile using only the
hash value and the declaration flag.
+ static bool profileDebugInfoNode(const MDNode *M, FoldingSetNodeID &ID);

+ /// Given a hash value and a flag, generate the profile for later lookup.
+ static bool profileDebugInfoNode(unsigned Hash, bool Declaration,
FoldingSetNodeID &ID);

These static functions are called in Metadata.cpp:
void MDNode::Profile(FoldingSetNodeID &ID) const {
+ if (profileDebugInfoNode(this, ID))
+ return;
+

There are other examples of these in MDNode for handling of specific
metadata.
/// Methods for metadata merging.
static MDNode *getMostGenericTBAA(MDNode *A, MDNode *B);
static MDNode *getMostGenericFPMath(MDNode *A, MDNode *B);
static MDNode *getMostGenericRange(MDNode *A, MDNode *B);

Comments are welcome on whether this violates any layering rule.

As I've said many times in email, I don't think this is a good idea
and would prefer either a or b below. a) is a much simpler solution.

Any reason that why it is not a good idea?

Other choices are:
> Keep a map in DwarfDebug
Keep in mind that the map is used at many stages, and it has to be in sync
with MDNodeSet.
> Generalize MDNode to be aware of hash (David can provide more details)
> Extend MDNode to DINode and modify streamers (bitcode reader|writer, ll
reader>writer) to be aware of DINode
We can provide DINode::get(…) to create a DINode. DINode can have its own
Profile function.
Other suggestions are welcome.

a or b please.

Option a> will require a DwarfDebug pointer in every stage of the compiler,
and passing the map to the DI classes.
A rough estimation is around 100 places.
Is it reasonable to pass a DwarfDebug pointer to DIBuilder and llvm linker?
Also the map needs to be in sync with MDNodeSet, maybe using ValueHandle can
solve the problem.

What about putting the map in LLVMContextImpl?
It already has a few things specifically for debug info:
std::vector<DebugRecVH> ScopeRecords;
DenseMap<std::pair<MDNode*, MDNode*>, int> ScopeInlinedAt;

I remember David mentioned it once and I forgot about the conclusion.

I mentioned it only as speculation as to how you were implementing it
already (but you were doing the profile-changing stuff).

I don't think it should be necessary to have the map (in option (a))
in such a central location as LLVMContext. It should be usable just
from DwarfDebug for generation, and DIBuilder can have its own,
separate map to do similar things during DI building.

We also need the map during llvm linking since linking will create new
MDNodes.

I don't understand what you mean by this. IR linking shouldn't need to
do anything debug-info-specific, it should just be the normal IR
approach.

The declaration-v-definition resolution can be done at codegen-time.
We'll have to walk all the lists of retained types anyway to build the
map, so we can do declaration-v-definition (keep definitions over
declarations when we see both) at that point.

Yes, you are right that at codegen-time, we can generate the map from the
lists
of retained types.

But dumping the linked ll file requires the map when outputting comments of
the MDNode :]

Depending on which things we print out, but yes, in some cases
(derived types) we do print out the type referenced. I assume the
AsmPrinter can build such a map, then. (in fact, with a few clients
like this, it might be nice to build a bit of an abstraction around it
rather than just using a raw map - something that has a ctor (or I
suppose it could be a factory function) that reads in the right
metadata, walks the compile units, etc, and builds the mapping)

A call to MDNode::dump() will require the map to print out the comments.
To continue supporting dump(), we need to have the map at almost all times.

To continue supporting dump exactly as it works today we would, but
given that the type identifiers are human readable & fairly
self-explanatory, probably just printing out that string would suffice
in the few cases where we navigate across type references (& the only
one I can think of is for DIDerivedTypes where we print "[from foo]"
for the type it's derived from (& even that's not very smart - doesn't
work at all for subroutine types, nested derived types, etc))

A summary of options for issue #3:
3> To actually access the MDNode referenced via the hash value, we need to perform a lookup from the hash value to find the corresponding MDNode.
The questions are where to store this map and how to keep it up-to-date when a MDNode is replaced.

Wouldn’t it be simpler to name the metadata based on the hash of the content? Then you could use a normal reference to that metadata without needing to create a new type or teach the rest of llvm how to use it…

Wouldn’t it be simpler to name the metadata based on the hash of the content? Then you could use a normal reference to that metadata without needing to create a new type or teach the rest of llvm how to use it…

Sounds reasonable (probably with some nice prefix to get these things a namespace). Though this applies to both (a) and (b) equally.

More details please :]What do you mean by “name the metadata”? Are you referring to the name field of the MDNode?

Thanks,
Manman

More details please :]
What do you mean by "name the metadata"? Are you referring to the name field
of the MDNode?

Using named metadata rather than unnamed metadata.

rather than having:

!llvm.hardref = !{metadata !"foo.h::myClass", !3, metadata
!"bar.h:myEnum", !4} ...
!3 = ...;
!4 = ...;

we could simply have:

!llvm.hardref.foo.h.myClass = ...;
!llvm.hardref.bar.h.myClass = ...;

or something like that.

(assuming solution (b), in solution (a) it'd look more like
"llvm.dbg.type.foo.h.myClass" - but the same idea (then we could skip
using the retained types list, avoid adding a new 'name' member to
DITypes or changing the schema of the retained types list to be
string/mdnode pairs))

I like this idea! It's much easier to read than metadata fields, but will
it need to have mangled names that today, are quoted?

Do we still have the notion that named metadata cannot be deleted?

cheers,
--renato