Huge mangled names are causing long delays when loading symbol table symbols

I have an issue where I am debugging a C++ binary that is around 250MB in size. It contains some mangled names that are crazy:

_ZNK3shk6detail17CallbackPublisherIZNS_5ThrowERKNSt15__exception_ptr13exception_ptrEEUlOT_E_E9SubscribeINS0_9ConcatMapINS0_18CallbackSubscriberIZNS_6GetAllIiNS1_IZZNS_9ConcatMapIZNS_6ConcatIJNS1_IZZNS_3MapIZZNS_7IfEmptyIS9_EEDaS7_ENKUlS6_E_clINS1_IZZNS_4TakeIiEESI_S7_ENKUlS6_E_clINS1_IZZNS_6FilterIZNS_9ElementAtEmEUlS7_E_EESI_S7_ENKUlS6_E_clINS1_IZZNSL_ImEESI_S7_ENKUlS6_E_clINS1_IZNS_4FromINS0_22InfiniteRangeContainerIiEEEESI_S7_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EESI_S7_ENKUlS6_E_clIS14_EESI_S6_EUlS7_E_EERNS1_IZZNSH_IS9_EESI_S7_ENKSK_IS14_EESI_S6_EUlS7_E0_EEEEESI_DpOT_EUlS7_E_EESI_S7_ENKUlS6_E_clINS1_IZNS_5StartIJZNS_4JustIJS19_S1C_EEESI_S1F_EUlvE_ZNS1K_IJS19_S1C_EEESI_S1F_EUlvE0_EEESI_S1F_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESt6vectorIS6_SaIS6_EERKT0_NS_12ElementCountEbEUlS7_E_ZNSD_IiS1Q_EES1T_S1W_S1X_bEUlOS3_E_ZNSD_IiS1Q_EES1T_S1W_S1X_bEUlvE_EES1G_S1O_E25ConcatMapValuesSubscriberEEEDaS7_

This de-mangles to something that is 72MB in size and takes 280 seconds (try running "time c++filt -n" on the above string).

There are probably many symbols like this in this binary. Currently lldb will de-mangle all names in the symbol table so that we can chop up the names: we extract function base names, and we may be able to classify a base name as a method or function for breakpoint categorization.

My question is: how do we work around such issues in LLDB? A few solutions I can think of:
1 - time each name demangle and if it takes too long somehow stop de-mangling similar symbols or symbols over a certain length?
2 - allow a setting that says "don't de-mangle names that start with..." and the setting has a list of prefixes.
3 - have a setting that turns off de-mangling symbols over a certain length all of the time with a default of something like 256 or 512
4 - modify our FastDemangler to abort if the de-mangled string goes over a certain limit to avoid bad cases like this...

#1 would still mean we get a huge delay (like 280 seconds) when starting to debug this binary, but might prevent multiple symbols from adding to that delay...

#2 would require debugging once and then knowing which symbols took a while to de-mangle. If we time each de-mangle, we can warn that there are large mangled names and print the mangled name so the user knows?

#3 would disable de-mangling of long names at the risk of not de-mangling names that are close to the limit

#4 requires that our FastDemangle code can decode the mangled string. The fast de-mangler currently aborts on tricky manglings and we fall back onto cxa_demangle from the C++ library, which does not have a cutoff on length...

Can anyone else think of any other solutions?

Greg Clayton

What about doing a partial demangle? Take at most 1024 (for example) characters from the mangled name, demangle that, and then display … at the end.

If you just cut off the string, it might fail to demangle if you truncate the mangled string at the wrong point…

That’s true, but shouldn’t it be possible to demangle up until the last point you got something meaningful? (I don’t know the details of itanium mangling, just assuming this is possible)


Anywhere you cut the string, many things can go wrong. I think this would fall under "start to demangle the string and, if the output buffer goes over a certain length, abort the demangling", which is solution #4 from my original email.

The mangled name length threshold would be the easiest to implement.
However, I fear we may not be able to find a good cutoff length,
because it's not the length of it that matters, but the number (and
recursiveness) of back-references. For example, I was able to find a
mangled name of 757 characters in lldb:
_ZN12lldb_private23ScriptInterpreterPython21InitializeInterpreterEPFvvEPFbPKcS4_RKSt10shared_ptrINS_10StackFrameEERKS5_INS_18BreakpointLocationEEEPFbS4_S4_S9_RKS5_INS_10WatchpointEEEPFbS4_PvRKNS_10SharingPtrINS_11ValueObjectEEEPSM_RKS5_INS_18TypeSummaryOptionsEERNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEEPFSM_S4_S4_SR_EPFSM_S4_S4_S5_INS_8DebuggerEEEPFmSM_jEPFSM_SM_jEPFiSM_S4_EPFSM_SM_EPFSP_SM_EPFbSM_ES1N_S1J_PFbS4_S4_RS19_S4_RNS_19CommandReturnObjectES5_INS_19ExecutionContextRefEEEPFbSM_S1O_S4_S1Q_S1S_EPFbS4_S4_S1O_EPFSM_S4_S4_RKS5_INS_7ProcessEEEPFbS4_S4_RS20_S13_EPFbS4_S4_RS5_INS_6ThreadEES13_EPFbS4_S4_RS5_INS_6TargetEES13_EPFbS4_S4_RS7_S13_EPFbS4_S4_RSP_S13_EPFSM_SM_S4_RKS2E_EPFSM_S4_S4_RKS5_INS_10ThreadPlanEEEPFbSM_S4_PNS_5EventERbE

This demangles to a string of length 2534, and I think it would be good to
handle it. On the other hand, I was able to produce a mangled name of
only 168 characters:
_ZN1BIS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IiiES0_ES1_ES2_ES3_ES4_ES5_ES6_ES7_ES8_ES9_ESA_ESB_ESC_ESD_ESE_ESF_ESG_ESH_ESI_ESJ_ESK_ESL_E1fEv
which demangles to a 70MB string. (It takes about 3 seconds to compile
a file with this symbol and 0.8s to demangle it).

So we may need a limit on the output buffer size instead, but this
will require cooperation from the demangling library. Fortunately, all
targets nowadays use either the "fast" demangler or
llvm::itaniumDemangle by default, which we can modify to add a
threshold like this.

pl

Hi,
I'm not at all familiar with LLDB, but I've been doing some work on the demangler in libcxxabi. It's still a work in progress and I haven't yet copied the changes over to ItaniumDemangle, which AFAIK is what lldb uses. The demangler in libcxxabi now demangles the symbol you attached in 3.31 seconds, instead of 223.54 on my machine. I posted an RFC on my work here (http://lists.llvm.org/pipermail/llvm-dev/2017-June/114448.html), but basically the new demangler just produces an AST then traverses it to print the demangled name.

I think a good way of making this even faster is to have LLDB consume the AST the demangler produces directly. The AST is a better representation of the information that LLDB wants, and finishing the demangle and then fishing out that information from the output string is unfortunate. From the AST, it would be really straightforward to just individually print all the components of the name that LLDB wants.

Most of the time it takes to demangle these "symbols from hell" is during the printing, after the AST has been parsed, because the demangler has to flatten out all the potentially nested back references. Just parsing to an AST should be about proportional to the strlen of the mangled name. Since (AFAIK) LLDB doesn't use some sections of the demangled name often (such as parameters), from the AST LLDB could lazily decide not to even bother fully demangling some sections of the name, then if it ever needs them it could parse a new AST and get them from there. I think this would largely fix the issue, as most of the time these crazy expansions don't occur in the name itself, but in the parameters or return type. Even when they do appear in the name, it would be possible to do some simple name classification (ie, does this symbol refer to a function) or pull out the basename quickly without expanding anything at all.

Any thoughts? I'm really not at all familiar with LLDB, so I could have this all wrong!

Thanks,
Erik

I must admit I've never played around with C++ demangling, but I wonder if our purposes in demangling might inform how we do this?

We use demangled names for a couple of purposes. One is to print names in backtraces and thread reporting when we stop. For the most part, the requests we've gotten for this are that the full demangled names are too noisy and impossible to read, and we need to cut them down for usability's sake. For instance, we added a display mode to the swift demangler so that backtraces were actually useful. But in any case, this part can be done lazily when a name shows up in a backtrace, and so is not so performance sensitive.

The other reason we use them is to allow the various name lookups to work with human-level names (often partially specialized) and find their way to the actual symbols. This is generally why we have to do mass demangling of symbols when we read in a module. Having a full demangled name here does allow folks to specify a particular overload (for setting breakpoints, etc.) but that part of our symbol lookups is more frustrating than helpful because you have to know pretty much exactly how the compiler spelled the demangled name, at which point it's generally easier just to use the mangled name.

So I wonder if it wouldn't be possible to make a demangler that doesn't attempt full fidelity, but rather is crafted to pick out the pieces that we actually need to do heuristic name matches, and then we could use the faithful demangler when we are intentionally presenting a name - at which point the speed will be much less important.

I'm probably missing some uses of demangled names that might not make this possible, but it seems worth considering.

Jim

specialized -> specified

Jim

That's along the same lines as what I was thinking. We really don't need to print all these names, and in fact the complicated ones are not useful for printing and certainly there are few times where you want to use them in their explicit forms. We really just want to pick out pieces to put in our names tables for lookup. So if we can get them in some kind of node form and then pull the bits we want out that might be a better way to go.

Jim

Hi,
I'm not at all familiar with LLDB, but I've been doing some work on the demangler in libcxxabi. It's still a work in progress and I haven't yet copied the changes over to ItaniumDemangle, which AFAIK is what lldb uses. The demangler in libcxxabi now demangles the symbol you attached in 3.31 seconds, instead of 223.54 on my machine. I posted a RFC on my work here ([llvm-dev] RFC: Cleaning up the Itanium demangler), but basically the new demangler just produces an AST then traverses it to print the demangled name.

Great to hear the huge speedup in demangling! LLDB actually has two demanglers: a fast one that can demangle 99% of names, and we fall back to ItaniumDemangle which can do all names but is really slow. It would be fun to compare your new demangler with the fast one and see if we can get rid of the fast demangler now.

I think a good way of making this even faster is to have LLDB consume the AST the demangler produces directly. The AST is a better representation of the information that LLDB wants, and finishing the demangle and then fishing out that information from the output string is unfortunate. From the AST, it would be really straightforward to just individually print all the components of the name that LLDB wants.

This would help us to grab the important bits out of the mangled name as well. We chop up a demangled name to find the base name ("string" for "std::string") and the containing context ("std::" for "std::string"), and we check whether we can tell if the function is a method (look for a trailing "const" modifier on the function) versus a top level function, since the mangling doesn't fully specify what is a namespace and what is a class (in "foo::bar::baz()" we don't know if "foo" or "bar" are classes or namespaces). So the AST would be great as long as it is fast.

Most of the time it takes to demangle these "symbols from hell" is during the printing, after the AST has been parsed, because the demangler has to flatten out all the potentially nested back references. Just parsing to an AST should be about proportional to the strlen of the mangled name. Since (AFAIK) LLDB doesn't use some sections of the demangled name often (such as parameters), from the AST LLDB could lazily decide not to even bother fully demangling some sections of the name, then if it ever needs them it could parse a new AST and get them from there. I think this would largely fix the issue, as most of the time these crazy expansions don't occur in the name itself, but in the parameters or return type. Even when they do appear in the name, it would be possible to do some simple name classification (ie, does this symbol refer to a function) or pull out the basename quickly without expanding anything at all.

Any thoughts? I'm really not at all familiar with LLDB, so I could have this all wrong!

AST sounds great. We can put this into the class we use to chop up C++ names, as that is really our goal.

So it would be great to do a speed comparison between our fast demangler in LLDB (in FastDemangle.cpp/.h) and your updated libcxxabi version. If yours is faster, remove FastDemangle and then update llvm::itaniumDemangle() to use your new code.

ASTs would be great for the C++ name parser,

Let us know what you are thinking,

Greg


Hi Greg,

I'm almost finished with my work on the demangler, hopefully I'll be done within a few weeks. Once that's all finished I'll look into exporting the AST and comparing it to FastDemangle. I was thinking about adding a version of llvm::itaniumDemangle() that returns an opaque handle to the AST and defining some functions on the LLVM side that take that handle and return some extra information. I'd be happy to help out with the LLDB side of things too, although it might be better if someone more experienced with LLDB did this.

I'll ping this thread when I'm finished with the demangler, then we can hopefully work out what a good API for LLDB would be.

Thanks,
Erik

That's great to hear. Not having 3 different demanglers scattered
between lldb and llvm will be a big win for everybody.

It's not just reduction of the number of demanglers we have to support, however.

Greg and I both got excited by this proposal because we've had to maintain these ad hoc name choppers in lldb ourselves.

Having a structured representation of mangled names will be much more appropriate for the tasks lldb has to do with mangled names - for instance matching incomplete, human-typed names - e.g. Class::method to match a method which is actually Namespace::Class::method. At present, we end up losing all the semantic information the demangler had when it parsed the mangled name, then trying to recreate that by hand to pick out the pieces of interest.

Greg did an experiment early on in lldb of having a node tree representation of mangled names, but it was too slow when you have to use it on every symbol in a module. That's an important thing to remember for the debugger's use of the demangler. Since we need to quickly find Namespace::Class::method when somebody types Class::method, we have to build up lookup tables up front for those pieces, and we don't always have debug information from which to grab the base name. So whatever demangler we use has to survive getting passed all the C++ symbols in the libraries loaded by a normal program.

Another bonus of this work: we have the problem that a 700 character demangled name is just not useful in a backtrace. If you have 20 frames of this one after another the display is really just noise... We do some truncation of names, but figuring out how to truncate a name while preserving the parts that are actually useful to people is hard to do well if you don't understand the semantics of the name. Erik Eckstein added a "display mode" to the swift demangler which only renders the most salient parts of the name. The swift demangler does parse into a node tree so this was doable. That made a big difference in the readability of backtraces in swift. This is plumbed through the generic parts of lldb (Symbol::GetDisplayName & Mangled::GetDisplayDemangledName) but for C++ GetDisplayDemangledName just calls GetDemangledName. It would be great to implement some reasonable version of this for C++ names as well.

Jim


Hi Greg,

I’m almost finished with my work on the demangler, hopefully I’ll be done within a few weeks. Once that’s all finished I’ll look into exporting the AST and comparing it to FastDemangle. I was thinking about adding a version of llvm::itaniumDemangle() that returns an opaque handle to the AST and defining some functions on the LLVM side that take that handle and return some extra information. I’d be happy to help out with the LLDB side of things too, although it might be better if someone more experienced with LLDB did this.

Can’t wait! The only reason we switched away from the libcxxabi demangler in the first place was the poor performance. GDB’s demangler was 3x faster. Our FastDemangler got us back to the speed of the GDB demangler. But it will be great to get back to one fast demangler.

It would be great if there were some way to implement a demangled-name size cutoff in the demangler, where if the demangled name goes over some max size we can just stop demangling. No one needs to see a 72MB string, nor would anyone ever type in that name.

If you can get the new demangler features (AST + demangling) into llvm::itaniumDemangle() I will be happy to do the LLDB side of the work.

I’ll ping this thread when I’m finished with the demangler, then we can hopefully work out what a good API for LLDB would be.

It would be great to put all the functionality into LLVM and test it in llvm tests. Then I will port it over to LLDB as needed. As Jim said, we want to know the function basename, and whether a function is a C++ method or just a top level function, or possibly both (we often don’t know just from the mangling whether foo::bar() is a method or a function, since we don’t know if “foo” is a namespace, but if we have “foo::bar() const”, then we know it is a method).

Look forward to seeing what you come up with!

Greg

Greg, sorry; I have been trying to work out my idea, and to be honest the demangler is complicated.

The core of my idea is to not fully expand the mangled name.

The key benefit in switching to the “Itanium” mangling scheme was the ability to reduce the amount
of totally redundant information.

I started with thinking about the shorter string:

_ZN1BIS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IS_IiiES0_ES1_ES2_ES3_ES4_ES5_ES6_ES7_ES8_ES9_ESA_ESB_ESC_ESD_ESE_ESF_ESG_ESH_ESI_ESJ_ESK_ESL_E1fEv

which shows the exponential expansion problem.

This problem has been pointed out with good detail here: Demangling C++ Symbols

My thinking is to show the user this redundancy in a more explicit form, something like:

B<$22={B<$21={B<$20={B<$19={B<$18={B<$17={B<$16={B<$15={B<$14={B<$13={B<$12={B<$11={B<$10={B<$9={B<$8={B<$7={B<$6={B<$5={B<$4={B<$3={B<$2={B<$1={B<int, int>}, $1>}, $2>}, $3>}, $4>}, $5>}, $6>}, $7>}, $8>}, $9>}, $10>}, $11>}, $12>}, $13>}, $14>}, $15>}, $16>}, $17>}, $18>}, $19>}, $20>}, $21>}, $22>::f()

Doing this by hand is time consuming and error prone, so I went looking for a demangler I could quickly modify;
I don’t have access to the one I wrote at Metrowerks any more, and I ended up porting the one from LLVM
to Swift (my port was a bit of a mess, but I wanted more Perl-like string functions and never really learned
libc++ at that level).

While doing that I have seen Erik’s AST changes land, and they are really nice; they make the demangler much
easier to understand. I switched to building an AST and would suggest a few improvements:

1 - introduce DEF/USE nodes into the AST tree

This allows while walking the AST tree to print indirections instead of fully expanding them.

2 - use back patching (via DEF/USE nodes) to avoid double parsing the string

In some cases “T” substitutions are referenced before their expansion is defined;
being able to see the use node at the point the def is actually created allows
me to patch up the use with the actual expansion.

3 - modify the AST so that all nodes have an array of children (which could be empty)

This makes walking the AST simpler, as we don’t have to customize each ASTNode subclass
to allow walking the tree.

I did a few other things (add a StringNode so I don’t need a NodeOrStringNode), make a node which
holds an array of nodes… tweaks in the way ParameterPackExpansion works…

Once 1 & 2 are done it is possible to experiment with heuristics for the expansion:

First is printing out all DEF nodes and USE nodes:

$23=$0=B<$T0.2=$22=$0<$T0.3=$21=$0<$T0.4=$20=$0<$T0.5=$19=$0<$T0.6=$18=$0<$T0.7=$17=$0<$T0.8=$16=$0<$T0.9=$15=$0<$T0.10=$14=$0<$T0.11=$13=$0<$T0.12=$12=$0<$T0.13=$11=$0<$T0.14=$10=$0<$T0.15=$9=$0<$T0.16=$8=$0<$T0.17=$7=$0<$T0.18=$6=$0<$T0.19=$5=$0<$T0.20=$4=$0<$T0.21=$3=$0<$T0.22=$2=$0<$T0.23=$1=$0<$T0.24=int, $T1.24=int>, $T1.23=$1>, $T1.22=$2>, $T1.21=$3>, $T1.20=$4>, $T1.19=$5>, $T1.18=$6>, $T1.17=$7>, $T1.16=$8>, $T1.15=$9>, $T1.14=$10>, $T1.13=$11>, $T1.12=$12>, $T1.11=$13>, $T1.10=$14>, $T1.9=$15>, $T1.8=$16>, $T1.7=$17>, $T1.6=$18>, $T1.5=$19>, $T1.4=$20>, $T1.3=$21>, $T1.2=$22>::f()

Clearly this is a mess but helped me a bit when debugging, and I learned:

S substitutions are global
T substitutions are scoped (so I added a generation count; they currently print as $T<id>.<gen>, perhaps I would change to $T<gen>.<id>)

First improvement is to only print a def that has at least one use:

$0={B}<$22={$0<$21={$0<$20={$0<$19={$0<$18={$0<$17={$0<$16={$0<$15={$0<$14={$0<$13={$0<$12={$0<$11={$0<$10={$0<$9={$0<$8={$0<$7={$0<$6={$0<$5={$0<$4={$0<$3={$0<$2={$0<$1={$0<int, int>}, $1>}, $2>}, $3>}, $4>}, $5>}, $6>}, $7>}, $8>}, $9>}, $10>}, $11>}, $12>}, $13>}, $14>}, $15>}, $16>}, $17>}, $18>}, $19>}, $20>}, $21>}, $22>::f()

This is better but I don’t like all of the “B”s being replaced with “$0”

Second improvement is to only print a def which has a use nested in its definition (this is like what I showed above):

B<$22={B<$21={B<$20={B<$19={B<$18={B<$17={B<$16={B<$15={B<$14={B<$13={B<$12={B<$11={B<$10={B<$9={B<$8={B<$7={B<$6={B<$5={B<$4={B<$3={B<$2={B<$1={B<int, int>}, $1>}, $2>}, $3>}, $4>}, $5>}, $6>}, $7>}, $8>}, $9>}, $10>}, $11>}, $12>}, $13>}, $14>}, $15>}, $16>}, $17>}, $18>}, $19>}, $20>}, $21>}, $22>::f()

Third improvement is to pull the DEFs out and put them in a <def, def…> list ahead of the symbol, like this:

<$1=B<int, int>, $2=B<$1, $1>, $3=B<$2, $2>, $4=B<$3, $3>, $5=B<$4, $4>, $6=B<$5, $5>, $7=B<$6, $6>, $8=B<$7, $7>, $9=B<$8, $8>, $10=B<$9, $9>, $11=B<$10, $10>, $12=B<$11, $11>, $13=B<$12, $12>, $14=B<$13, $13>, $15=B<$14, $14>, $16=B<$15, $15>, $17=B<$16, $16>, $18=B<$17, $17>, $19=B<$18, $18>, $20=B<$19, $19>, $21=B<$20, $20>, $22=B<$21, $21>> B<$22, $22>::f()

This is better as I can see the symbol is B<T, T>::f(). I find multiline easier to read in this case:

$1=B<int, int>
$2=B<$1, $1>
$3=B<$2, $2>
$4=B<$3, $3>
$5=B<$4, $4>
$6=B<$5, $5>
$7=B<$6, $6>
$8=B<$7, $7>
$9=B<$8, $8>
$10=B<$9, $9>
$11=B<$10, $10>
$12=B<$11, $11>
$13=B<$12, $12>
$14=B<$13, $13>
$15=B<$14, $14>
$16=B<$15, $15>
$17=B<$16, $16>
$18=B<$17, $17>
$19=B<$18, $18>
$20=B<$19, $19>
$21=B<$20, $20>
$22=B<$21, $21>
B<$22, $22>::f()

But this may have issues, for example in stack unwinding.

This does, however, make the recursive nature of the type definition really clear to the user.

I can apply this to the original string, and as long as I don’t try to fully expand it, it runs fairly fast (even in unoptimized swift)

_ZNK3shk6detail17CallbackPublisherIZNS_5ThrowERKNSt15__exception_ptr13exception_ptrEEUlOT_E_E9SubscribeINS0_9ConcatMapINS0_18CallbackSubscriberIZNS_6GetAllIiNS1_IZZNS_9ConcatMapIZNS_6ConcatIJNS1_IZZNS_3MapIZZNS_7IfEmptyIS9_EEDaS7_ENKUlS6_E_clINS1_IZZNS_4TakeIiEESI_S7_ENKUlS6_E_clINS1_IZZNS_6FilterIZNS_9ElementAtEmEUlS7_E_EESI_S7_ENKUlS6_E_clINS1_IZZNSL_ImEESI_S7_ENKUlS6_E_clINS1_IZNS_4FromINS0_22InfiniteRangeContainerIiEEEESI_S7_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESI_S6_EUlS7_E_EESI_S7_ENKUlS6_E_clIS14_EESI_S6_EUlS7_E_EERNS1_IZZNSH_IS9_EESI_S7_ENKSK_IS14_EESI_S6_EUlS7_E0_EEEEESI_DpOT_EUlS7_E_EESI_S7_ENKUlS6_E_clINS1_IZNS_5StartIJZNS_4JustIJS19_S1C_EEESI_S1F_EUlvE_ZNS1K_IJS19_S1C_EEESI_S1F_EUlvE0_EEESI_S1F_EUlS7_E_EEEESI_S6_EUlS7_E_EEEESt6vectorIS6_SaIS6_EERKT0_NS_12ElementCountEbEUlS7_E_ZNSD_IiS1Q_EES1T_S1W_S1X_bEUlOS3_E_ZNSD_IiS1Q_EES1T_S1W_S1X_bEUlvE_EES1G_S1O_E25ConcatMapValuesSubscriberEEEDaS7_

Showing the DEFs inline:

auto $10=$2=$1=shk::detail::CallbackPublisher<$T0.2=shk::Throw($4=std::__exception_ptr::exception_ptr const&)::'lambda'($8=$7=$T0.2&&)>::Subscribe<$1::ConcatMap<$1::CallbackSubscriber<$70=std::vector<$7, std::allocator<$7> > $14=shk::GetAll<int, $T1.6=$2<$61=$19 $19 shk::ConcatMap<$52=$19 shk::Concat<$T0.9=$2<$19 $19 shk::Map<$41=$19 auto $18=shk::IfEmpty<$10>($8)::$19='lambda'($7)::operator()<$2<$19 $19 $21=shk::Take<int>($8)::$22='lambda'($7)::operator()<$2<$19 $19 shk::Filter<shk::ElementAt(unsigned long)::'lambda'($8)>($8)::'lambda'($7)::operator()<$2<$19 $19 $22<unsigned long>($8)::'lambda'($7)::operator()<$2<$19 shk::From<$1::InfiniteRangeContainer<int> >($8)::'lambda'($8)> >($7) const::'lambda'($8)> >($7) const::'lambda'($8)> >($7) const::'lambda'($8)> >($7) const::'lambda'($8)>($8)::'lambda'($7)::operator()<$41>($7) const::'lambda'($8)>, $2<$46=$19 $19 $18<$10>($8)::$21<$41>($7) const::'lambda0'($8)>&>($49=$2<$19 $19 shk::Map<$41=$19 auto $18=shk::IfEmpty<$10>($8)::$19='lambda'($7)::operator()<$2<$19 $19 $21=shk::Take<int>($8)::$22='lambda'($7)::operator()<$2<$19 $19 shk::Filter<shk::ElementAt(unsigned long)::'lambda'($8)>($8)::'lambda'($7)::operator()<$2<$19 $19 $22<unsigned long>($8)::'lambda'($7)::operator()<$2<$19 shk::From<$1::InfiniteRangeContainer<int> >($8)::'lambda'($8)> >($7) const::'lambda'($8)> >($7) const::'lambda'($8)> >($7) const::'lambda'($8)> >($7) const::'lambda'($8)>($8)::'lambda'($7)::operator()<$41>($7) const::'lambda'($8)>&&, $49=$2<$46=$19 $19 $18<$10>($8)::$21<$41>($7) const::'lambda0'($8)>&&&)::'lambda'($8)>($8)::$53='lambda'($7)::operator()<$2<$19 shk::Start<$57=$19 shk::Just<$46, $49>($52)::'lambda'(), $19 $57<$46, $49>($52)::'lambda0'()>($52)::'lambda'($8)> >($7) const::'lambda'($8)> >($66=$T1.6 const&, $69=shk::ElementCount, bool)::'lambda'($8), $66 $14<int, std::vector>($69, $70, bool)::'lambda'($4&&), $66 $14<int, std::vector>($69, $70, bool)::'lambda'()>, $53, $61>::ConcatMapValuesSubscriber>($8) const

Showing the DEFs first:

$T0.2=shk::Throw($4 const&)::'lambda'($8),
$T0.9=$2<$19 $19 shk::Map<$41>($8)::'lambda'($7)::operator()<$41>($7) const::'lambda'($8)>, $2<$46>&,
$T1.6=$2<$61>,
$1=shk::detail,
$2=$1::CallbackPublisher,
$4=std::__exception_ptr::exception_ptr,
$7=$T0.2,
$8=$7&&,
$10=$2<$T0.2>,
$14=shk::GetAll,
$18=shk::IfEmpty,
$19='lambda'($7),
$21=shk::Take,
$22='lambda'($7),
$41=$19 auto $18<$10>($8)::$19::operator()<$2<$19 $19 $21<int>($8)::$22::operator()<$2<$19 $19 shk::Filter<shk::ElementAt(unsigned long)::'lambda'($8)>($8)::'lambda'($7)::operator()<$2<$19 $19 $22<unsigned long>($8)::'lambda'($7)::operator()<$2<$19 shk::From<$1::InfiniteRangeContainer<int> >($8)::'lambda'($8)> >($7) const::'lambda'($8)> >($7) const::'lambda'($8)> >($7) const::'lambda'($8)> >($7) const::'lambda'($8),
$46=$19 $19 $18<$10>($8)::$21<$41>($7) const::'lambda0'($8),
$49=$2<$46>&,
$52=$19 shk::Concat<$T0.9>($49&&, $49&&)::'lambda'($8),
$53='lambda'($7),
$57=$19 shk::Just<$46, $49>($52)::'lambda'(),
$61=$19 $19 shk::ConcatMap<$52>($8)::$53::operator()<$2<$19 shk::Start<$57, $19 $57<$46, $49>($52)::'lambda0'()>($52)::'lambda'($8)> >($7) const::'lambda'($8),
$66=$T1.6,
$69=shk::ElementCount,
$70=std::vector<$7, std::allocator<$7> > $14<int, $T1.6>($66 const&, $69, bool)::'lambda'($8)
auto $10::Subscribe<$1::ConcatMap<$1::CallbackSubscriber<$70, $66 $14<int, std::vector>($69, $70, bool)::'lambda'($4&&), $66 $14<int, std::vector>($69, $70, bool)::'lambda'()>, $53, $61>::ConcatMapValuesSubscriber>($8) const

At this point I don’t know what the function actually was, so I don’t know whether this translation is even valid.

My implementation seems to have a recursive cycle in it:

$T0.2=shk::Throw($4 const&)::'lambda'($8),
$8=$7&&,
$7=$T0.2,
$T0.2=shk::Throw($4 const&)::'lambda'($8),

I have not yet gotten to the bottom of whether that cycle is really there or whether there is a bug in
how I am constructing the AST in this case.
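The cycle above can be checked for mechanically. Here is a minimal sketch (Python, purely illustrative and not LLDB or libcxxabi code; the token regex and the `expand` helper are my own names) that expands $-substitutions recursively while keeping an "active" set, so a definition cycle like $T0.2 → $8 → $7 → $T0.2 is reported instead of recursing forever:

```python
# Illustrative sketch only: detect cycles while expanding $-substitutions.
import re

# Matches tokens like $7, $46, $T0.2 (longest match wins for $46 vs $4).
TOKEN = re.compile(r"\$(?:T\d+\.\d+|\d+)")

def expand(name, defs, active=None):
    """Recursively expand $-substitutions; raise on a definition cycle."""
    if active is None:
        active = set()

    def repl(match):
        tok = match.group(0)
        if tok not in defs:
            return tok                       # undefined token: leave as-is
        if tok in active:
            raise ValueError(f"cycle through {tok}")
        active.add(tok)
        try:
            return expand(defs[tok], defs, active)
        finally:
            active.remove(tok)

    return TOKEN.sub(repl, name)

# The definitions from the cycle above.
defs = {
    "$T0.2": "shk::Throw($4 const&)::'lambda'($8)",
    "$4": "std::__exception_ptr::exception_ptr",
    "$7": "$T0.2",
    "$8": "$7&&",
}

try:
    expand("$T0.2", defs)
except ValueError as e:
    print(e)   # the $T0.2 -> $8 -> $7 -> $T0.2 cycle is reported
```

If the cycle really is present in the substitution table (rather than in the AST construction), a check like this would at least turn an infinite expansion into a diagnosable error.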

Looking at some of the test cases, there are some outputs I would like to handle better, like this one:

_ZZN4NIds4NStr14TCStrAggregateINS0_13TCTCStrTraitsINS0_11TCStrTraitsIcNS0_17CDefaultStrParamsEEENS0_14TCStrImp_FixedIS5_Lx256EEEEEE21f_AddFromIteratorUTF8INS0_16CStrIteratorUTF8EEEvRxRKT_ENKSA_ISB_EUlmE0_clEm:

void $11=$1=NIds::NStr::TCStrAggregate<$1::TCTCStrTraits<$6=$1::TCStrTraits<char, $1::CDefaultStrParams>, $1::TCStrImp_Fixed<$6, 256ll> > >::f_AddFromIteratorUTF8<$T0.6=$12=$1::CStrIteratorUTF8>(long long&, $T0.6 const&)::$11<$12>::'lambda0'(unsigned long)::operator()(unsigned long) const

$T0.6=$12
$1=NIds::NStr
$6=$1::TCStrTraits<char, $1::CDefaultStrParams>
$11=$1::TCStrAggregate<$1::TCTCStrTraits<$6, $1::TCStrImp_Fixed<$6, 256ll> > >::f_AddFromIteratorUTF8
$12=$1::CStrIteratorUTF8
void $11<$T0.6>(long long&, $T0.6 const&)::$11<$12>::'lambda0'(unsigned long)::operator()(unsigned long) const

I would prefer to expand out NIds::NStr so the output was:

$T0.6=$12
$6=NIds::NStr::TCStrTraits<char, $1::CDefaultStrParams>
$11=NIds::NStr::TCStrAggregate<$1::TCTCStrTraits<$6, NIds::NStr::TCStrImp_Fixed<$6, 256ll> > >::f_AddFromIteratorUTF8
$12=NIds::NStr::CStrIteratorUTF8
void $11<$T0.6>(long long&, $T0.6 const&)::$11<$12>::'lambda0'(unsigned long)::operator()(unsigned long) const

Anyway, I think that converting to a non-recursive expansion would take the problem cases from quadratic, O(n²), down to O(n).
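One way to get that behavior is to memoize: fully expand each definition exactly once and cache the result, so shared subterms are not re-expanded at every use site. A minimal sketch (again Python and purely illustrative; `expand_all` is my own name), using the NIds::NStr definitions from the example above:

```python
# Illustrative sketch only: memoized expansion so each definition is
# expanded once, instead of re-expanding shared subterms quadratically.
import re

TOKEN = re.compile(r"\$(?:T\d+\.\d+|\d+)")

def expand_all(defs):
    cache = {}

    def expand(tok, stack):
        if tok in cache:
            return cache[tok]                 # already fully expanded
        if tok in stack:
            raise ValueError(f"cycle through {tok}")
        stack.add(tok)
        body = TOKEN.sub(
            lambda m: expand(m.group(0), stack) if m.group(0) in defs
            else m.group(0),
            defs[tok])
        stack.remove(tok)
        cache[tok] = body
        return body

    return {tok: expand(tok, set()) for tok in defs}

defs = {
    "$1": "NIds::NStr",
    "$6": "$1::TCStrTraits<char, $1::CDefaultStrParams>",
    "$12": "$1::CStrIteratorUTF8",
}
print(expand_all(defs)["$6"])
# NIds::NStr::TCStrTraits<char, NIds::NStr::CDefaultStrParams>
```

This also gives the "expand out the NIds::NStr" behavior for free: short definitions like $1 get inlined everywhere they appear.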

Some thoughts:

Could the debugger keep track of symbol definitions it has seen? This may be useful in stack crawls: as it demangles the symbols,
it could share a substitution table with the demangler, so substitutions have the same name between stack frames.
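As a rough sketch of that idea (hypothetical, not LLDB's actual API; `DemangleCache` and the fake demangler are my own illustration), the debugger could memoize demangling per mangled name, so a symbol that appears in several frames is demangled once and its substitution numbering stays stable across frames:

```python
# Illustrative sketch only: share demangling work across stack frames.
class DemangleCache:
    def __init__(self, demangler):
        self._demangle = demangler   # e.g. a wrapper around the real demangler
        self._cache = {}

    def lookup(self, mangled):
        if mangled not in self._cache:
            self._cache[mangled] = self._demangle(mangled)
        return self._cache[mangled]

calls = []
def fake_demangler(name):
    calls.append(name)               # record how often real work happens
    return f"demangled({name})"

cache = DemangleCache(fake_demangler)
frames = ["_Z3foov", "_Z3barv", "_Z3foov"]   # foo appears in two frames
names = [cache.lookup(m) for m in frames]
print(len(calls))   # 2: foo was only demangled once
```

For the pathological symbols in this bug, caching by mangled name would at least bound the cost to one demangle per unique symbol rather than one per frame or per lookup.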

Could the compiler (if it does not already) record type aliases so the demangler could translate back to the user's name?

typedef vector<int> myIntBefore

This is of course problematic, as the user may have multiple types all mapping to vector<int>.

Ideally the output should be the same as it is now for simple names.

If anyone wants my Swift implementation, I can send it. It was only intended as a way to play around
with the output heuristic (and, to be honest, I spent most of the time trying to pass the first 180 test cases).
I was going to put it on GitHub, but I would need to create an account and would then lose days cleaning up code
that was only intended as a quick prototype, and I don't really want to publish until I pass all of the tests :-)

Bob

I’ve put a WIP patch up here:

Sorry for the delay!

Erik