[Templight] Templight "v2" with gdb-style debugger

Hi all,

I recently picked up the Templight code (see http://plc.inf.elte.hu/templight/) because I love the idea of having proper debugging and profiling tools for template meta-programming in C++. As admitted by the original authors (Zoltán Borók-Nagy, Zoltán Porkoláb and József Mihalicza), the code was rather crude, but a great first step.

– Templight “version 2” – (patch: templight_v2_clang35.diff)

So, I refactored the code to make it a bit more modular and less intrusive on the Sema class. Here is an outline of the main changes included in the attached patch (from Clang r202671):

Changes to existing Clang code-base:

(1) Moved the “ActiveTemplateInstantiation” class out of the Sema class scope, and into the clang namespace-scope. This was done to make the templight code independent from the Sema class by allowing a forward-declaration of ActiveTemplateInstantiation (can’t be done if it is nested in Sema).

(2) Minor updates to reflect the above change in scope of the ActiveTemplateInstantiation class.

(3) Created a “TemplateInstantiationObserver” base-class which bundles callback functions that are called during the template instantiations (and when leaving them). This class also supports linked-list-style chaining of additional observers.

(4) Added an OwningPtr<TemplateInstantiationObserver> in the Sema class to hold the chain of template instantiation observers.

(5) Added callbacks for begin / end of template instantiations. These were added in all the places where the original Templight code had templight calls. This replaces the original templight tracing calls.

(6) Added calls to initialize and finalize the template instantiation observers in the ParseAST code (see issues below).

(7) Added front-end options for Templight (mostly the same as in the original templight patch).
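The observer base class with linked-list-style chaining described in (3) to (5) could look roughly like the sketch below. This is not the code from the patch: `InstantiationInfo` is a hypothetical stand-in for `ActiveTemplateInstantiation`, `std::unique_ptr` replaces `OwningPtr`, and the member names (`appendTo`, `notifyBegin`, `notifyEnd`, `atTemplateBegin`, `atTemplateEnd`) are assumptions made for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for clang::ActiveTemplateInstantiation.
struct InstantiationInfo {
  std::string Name;
};

// Base class bundling the callbacks; observers form a singly linked chain.
class TemplateInstantiationObserver {
public:
  virtual ~TemplateInstantiationObserver() = default;

  // Append NewObs at the end of the chain that starts at Head.
  static void appendTo(std::unique_ptr<TemplateInstantiationObserver> &Head,
                       std::unique_ptr<TemplateInstantiationObserver> NewObs) {
    assert(NewObs && "cannot chain a null observer");
    std::unique_ptr<TemplateInstantiationObserver> *Slot = &Head;
    while (*Slot)
      Slot = &(*Slot)->Next;
    *Slot = std::move(NewObs);
  }

  // Notify every observer in the chain, in insertion order.
  static void notifyBegin(TemplateInstantiationObserver *Obs,
                          const InstantiationInfo &Inst) {
    for (; Obs; Obs = Obs->Next.get())
      Obs->atTemplateBegin(Inst);
  }
  static void notifyEnd(TemplateInstantiationObserver *Obs,
                        const InstantiationInfo &Inst) {
    for (; Obs; Obs = Obs->Next.get())
      Obs->atTemplateEnd(Inst);
  }

protected:
  virtual void atTemplateBegin(const InstantiationInfo &Inst) = 0;
  virtual void atTemplateEnd(const InstantiationInfo &Inst) = 0;

private:
  std::unique_ptr<TemplateInstantiationObserver> Next;
};

// Example observer that records each callback into a shared log.
class LoggingObserver : public TemplateInstantiationObserver {
public:
  LoggingObserver(std::vector<std::string> &Log, std::string Tag)
      : Log(Log), Tag(std::move(Tag)) {}

protected:
  void atTemplateBegin(const InstantiationInfo &Inst) override {
    Log.push_back(Tag + ":begin:" + Inst.Name);
  }
  void atTemplateEnd(const InstantiationInfo &Inst) override {
    Log.push_back(Tag + ":end:" + Inst.Name);
  }

private:
  std::vector<std::string> &Log;
  std::string Tag;
};
```

With this layout, the owner (Sema in the patch) only needs to hold the head pointer and call `notifyBegin` / `notifyEnd` at the instantiation sites; a tracer and a debugger can then be chained without knowing about each other.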

Changes to the Templight implementation: (unless noted here, the behavior is the same)
N.B.: The output (yaml, xml, txt) is identical to the original Templight code.

(8) All the templight code was moved out of the Sema class and into its own cpp-file. This was done mainly to avoid additional pollution in the Sema class, and to isolate the templight code’s implementation dependencies (llvm support libs, yaml output code, etc.).

(9) Created the TemplightTracer class (derived from TemplateInstantiationObserver) to expose the functionality of Templight. This class is PImpl’d.

(10) Removed the “TemplightFlag” in favor of having created the tracer or not.

(11) Removed the “-templight-capacity” option, because I changed the non-safe mode to record the instantiation traces until the instantiations return to the top-level context, at which point I dump the traces. This does not affect the profiling, because at the top-level context there is no pending timing or memory measurement to be recorded. With this change, the number of recorded traces becomes proportional to the maximum instantiation depth (which is limited by the compiler), making the templight-capacity option redundant.

(12) Changed the recording of the traces to be stored in a std::vector, as opposed to a bare dynamic array.

(13) Added the “-templight-ignore-system” option to ignore (i.e., not trace) any template instantiation coming from a system-include header file.

(14) Updated the “creation” code where front-end options are pulled and used to create the tracer.
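The buffering scheme from (11) and (12) can be illustrated with a small sketch: entries accumulate in a std::vector while instantiations are nested, and the buffer is dumped once the nesting depth returns to zero. The names `TraceEntry` and `TraceBuffer`, and the use of a second vector as the output sink, are assumptions for illustration and are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical trace record: one entry per begin or end event.
struct TraceEntry {
  bool IsBegin;
  std::string Name;
};

// Buffers entries while instantiations are nested, and dumps them to the
// sink once control returns to the top-level context (depth zero).
class TraceBuffer {
public:
  explicit TraceBuffer(std::vector<TraceEntry> &Sink) : Sink(Sink) {}

  void begin(const std::string &Name) {
    Pending.push_back({true, Name});
    ++Depth;
  }

  void end(const std::string &Name) {
    assert(Depth > 0 && "end event without a matching begin");
    Pending.push_back({false, Name});
    if (--Depth == 0)
      flush();
  }

  std::size_t pending() const { return Pending.size(); }

private:
  void flush() {
    Sink.insert(Sink.end(), Pending.begin(), Pending.end());
    Pending.clear();
  }

  std::vector<TraceEntry> &Sink;
  std::vector<TraceEntry> Pending;
  std::size_t Depth = 0;
};
```

Because the flush only happens at depth zero, no timing or memory measurement is in flight when the output is written, which is the property the email relies on.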

– Templight GDB-style Debugger – (patch: templight_v2_clang35_with_debugger.diff)

Here is the really sweet part. When I realized how easy it would be to create a gdb-style debugger with the new layout that I described above, I just had to give it a go, and the result is pretty nice, albeit still crude. I welcome you to try it!

I added the option “-templight-debugger” which turns on an interactive debugging session during the template instantiations. This is implemented in the TemplightDebugger class (also PImpl’d, and also deriving from TemplateInstantiationObserver). The debugger respects the relevant templight options (memory and ignore-system) and can be used in parallel with the templight tracer (although timing values will obviously be useless).

The debugger implements most of the basic gdb commands and reproduces its behavior as much as possible. You can set breakpoints (by complete name of template instantiation), you can “step”, “next”, “run”, “kill”, etc… with the same behavior as gdb. You can print back-traces. And each time a template instantiation is entered or left, there is a structured print out similar to gdb, making it usable (I hope) in a GUI front-end similar to the various GUI front-ends to gdb.

This is really just a first draft, and it is quite crude. I use C-style functions for the console input to avoid pulling in the C++ iostream monster (as required by LLVM's coding standards). And, as usual with interactive programs, it’s hard to make sure that it is robust to people entering random nonsense.
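For readers wondering what C-style console input with hand-written tokenizing might look like, here is a minimal sketch. It is not the debugger's actual code: the function names `tokenize` and `readCommand` and the 512-byte line buffer are assumptions for illustration.

```cpp
#include <cctype>
#include <cstdio>
#include <string>
#include <vector>

// Split a command line into whitespace-separated tokens.
std::vector<std::string> tokenize(const char *Line) {
  std::vector<std::string> Tokens;
  const char *P = Line;
  while (*P) {
    // Skip leading whitespace (including the trailing newline from fgets).
    while (*P && std::isspace(static_cast<unsigned char>(*P)))
      ++P;
    const char *Start = P;
    // Consume one token.
    while (*P && !std::isspace(static_cast<unsigned char>(*P)))
      ++P;
    if (P != Start)
      Tokens.emplace_back(Start, P);
  }
  return Tokens;
}

// Read one command line from In using <cstdio> only (no iostream).
// Returns false on end-of-file or read error.
bool readCommand(std::FILE *In, std::vector<std::string> &Tokens) {
  char Buf[512];
  if (!std::fgets(Buf, sizeof(Buf), In))
    return false;
  Tokens = tokenize(Buf);
  return true;
}
```

A naive whitespace split like this breaks template names that contain spaces (for example `foo<int, double>`), so a real command parser would need to treat everything after the command word as a single argument or track angle-bracket nesting.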

Here are a few issues I’d like some feedback on:

(15) I feel that the XML output of the templight tracer could easily be nested, as opposed to having matched “Begin” and “End” blocks.

(16) I feel a bit uneasy about the place where the tracer / debugger get created, initialized and finalized. Currently, this is done in the “createSema” function and in the ParseAST code. It works this way, but it feels out of place. I don’t know the Clang code-base well enough (in fact, barely at all) to know where this should go. Ideally, I would also like to get rid of the initialize / finalize functions in the TemplateInstantiationObserver class, but that is impossible as long as the Sema object and most of Clang’s main structures are left to leak (always-on “-disable-free” option), since RAII is then no longer available as a mechanism for handling the clean-up. If anyone who understands Clang’s main structures could weigh in on this, I’d be glad to hear suggestions.

(17) Is there anything in LLVM/Clang that can help with console input? Currently, I use C-style input functions and hand-written “tokenizing”.

(18) I feel like the template debugger should be a separate entity (not an option to the main clang compiler), but at the same time, it needs the entire compilation process to be running, and so, if separate, it would have to be a duplicate of clang with only small additions (in terms of code).

(19) Finally, I’d like to have ideas about what kind of diagnostic information could be added to the debugger. Currently, it only prints out the name of the template instantiation, the location, and the total memory usage. It would be nice to have features similar to GDB’s “print” commands, like being able to print the list of template arguments and their values within the current context, and things like that. I simply don’t have enough knowledge of Clang’s code to know how to tap into that kind of information.

I know this was a long email, but I hope you had the interest to read up to here, and I hope to hear back.

Cheers,
Mikael.

templight_v2_clang35.diff (50.6 KB)

templight_v2_clang35_with_debugger.diff (62.8 KB)

> Hi all,
>
> I recently picked up the Templight code (see
> http://plc.inf.elte.hu/templight/) because I love the idea of having
> proper debugging and profiling tools for template meta-programming in C++.
> As admitted by the original authors (Zoltán Borók-Nagy, Zoltán Porkoláb
> and József Mihalicza), the code was rather crude, but a great first step.
>
> -- Templight "version 2" -- (patch: templight_v2_clang35.diff)
>
> So, I refactored the code to make it a bit more modular and less intrusive
> on the Sema class. Here is an outline of the main changes included in the
> attached patch (from Clang r202671):
Hi Mikael,

Sorry to lead with this, but it's important. The llvm project requires that
patches be submitted by their author (or authors). If you grabbed the
original templight code and refactored it, then it may be coauthored
between you and the original authors. Our policy requires that an author
submit their own code, not mail in someone else's, see
http://llvm.org/docs/DeveloperPolicy.html#attribution-of-changes for
specifics.

> Changes to existing Clang code-base:
>
> (1) Moved the "ActiveTemplateInstantiation" class out of the Sema class
> scope, and into the clang namespace-scope. This was done to make the
> templight code independent from the Sema class by allowing a
> forward-declaration of ActiveTemplateInstantiation (can't be done if it is
> nested in Sema).

Great! I think that ActiveTemplateInstantiation should be changed to a form
of InstantiationRecord and held from the relevant AST nodes.

> (2) Minor updates to reflect the above change in scope of
> the ActiveTemplateInstantiation class.

The changes to ATI to pull it out of Sema and make it part of the AST are
complicated enough, I think you should send that out as a first patch. It
may not be much code, but for review we'll want to make sure that the
representation is accurate to what the standard says, efficient in memory
usage and has fast accessors.

There's also template instantiation for different reasons which are not
currently recorded (ie., due to overload resolution, due to virtual
methods). Since that's a change to ATI, that would logically go next.

> (3) Created a "TemplateInstantiationObserver" base-class which bundles
> callback functions that are called during the template instantiations (and
> when leaving them). This class also supports linked-list-style chaining of
> additional observers.

I think we usually use "Callbacks" instead of "Observer", similar to
PPCallbacks?

> (4) Added an OwningPtr<TemplateInstantiationObserver> in the Sema class to
> hold the chain of template instantiation observers.
>
> (5) Added callbacks for begin / end of template instantiations. These
> were added in all the places where the original Templight code had
> templight calls. This replaces the original templight tracing calls.

At a high level, I think that what we currently do in Sema by tracking
template instantiations as a stack (Sema::ActiveTemplateInstantiations)
should be available in the AST after instantiation is complete. Then we
could query the AST to ask why something was instantiated instead of
tracing the act of instantiation. Is there any use case for templight where
this wouldn't work?

Nick

Hi all,

First of all, Nick: many thanks for helping us. Mikael did a huge amount of work refactoring templight, and now there are even clients eager to use it:
https://github.com/sabel83/metashell

> Hi Mikael,
>
> Sorry to lead with this, but it's important. The llvm project requires that
> patches be submitted by their author (or authors). If you grabbed the
> original templight code and refactored it, then it may be coauthored between
> you and the original authors. Our policy requires that an author submit
> their own code, not mail in someone else's,
> see http://llvm.org/docs/DeveloperPolicy.html#attribution-of-changes for
> specifics.

We completely support Mikael's work on refactoring the original templight. Naturally, we are happy to adopt Mikael as a co-author or give him any kind of
authorization to submit this patch.

> > Changes to existing Clang code-base:
> >
> > [...]
> >
> > (5) Added callbacks for begin / end of template instantiations. These
> > were added in all the places where the original Templight code had
> > templight calls. This replaces the original templight tracing calls.
>
> At a high level, I think that what we currently do in Sema by tracking
> template instantiations as a stack (Sema::ActiveTemplateInstantiations)
> should be available in the AST after instantiation is complete. Then we
> could query the AST to ask why something was instantiated instead of tracing
> the act of instantiation. Is there any use case for templight where this
> wouldn't work?

Don't forget, this is a template debugger and profiler function.

As a debugger, we want to follow the chain of template-related events even if the instantiation did not finish successfully. Unpaired begin-end instantiation events may happen and are reported in templight. Moreover, this strategy allows templight to work in "safe mode": emitting events immediately as they happen. This way we have the last events even if the compiler crashes, which unfortunately may happen for template metaprograms.

As a profiler templight assigns timestamps for beginning/end events to measure the time clang has spent on that instantiation.
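The two points above (events emitted immediately so that they survive a crash, and timestamps attached to begin/end events) can be sketched as follows. This is an illustration only: the class name `SafeModeTracer`, the plain-text line format, and the choice of `std::chrono::steady_clock` are assumptions, and the real templight output formats (yaml, xml, txt) are not reproduced here.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#include <string>

// Writes every event to the output stream as soon as it happens and flushes
// right away, so the last events are on disk even if the process later dies.
class SafeModeTracer {
  using Clock = std::chrono::steady_clock;

public:
  explicit SafeModeTracer(std::FILE *Out) : Out(Out), Start(Clock::now()) {
    assert(Out && "tracer needs a valid output stream");
  }

  // Each call returns the timestamp (seconds since tracer creation) that was
  // written for the event; the time spent in an instantiation is the
  // difference between its end and begin timestamps.
  double begin(const std::string &Name) { return emit("Begin", Name); }
  double end(const std::string &Name) { return emit("End", Name); }

private:
  double emit(const char *Kind, const std::string &Name) {
    double T = std::chrono::duration<double>(Clock::now() - Start).count();
    std::fprintf(Out, "%s %s %.9f\n", Kind, Name.c_str(), T);
    std::fflush(Out); // safe mode: do not buffer across events
    return T;
  }

  std::FILE *Out;
  Clock::time_point Start;
};
```

An unpaired `Begin` line at the end of the output then marks the instantiation that was in progress when compilation failed or crashed.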

Thanks again for helping us,

Zoltan

Hi Nick,

Glad to finally get a sign of life from the clang dev team!

> The llvm project requires that patches be submitted by their author (or
> authors). If you grabbed the original templight code and refactored it,
> then it may be coauthored between you and the original authors. Our policy
> requires that an author submit their own code, not mail in someone else's

The patches from this email are very out-dated (and I was completely
unfamiliar with the process of patch submissions... well, I still am to be
honest ;) ). I submitted a much smaller patch after successfully managing
to extract all the templight code out of the clang source, to make it a
separate tool instead, leaving only very minor "hooks" to be added to Clang
itself. That patch can be found at:

http://reviews.llvm.org/D5767

As for the authorship, there are a few things to notice.

First of all, in the new patch that I submitted, all of the code is my own
code, because all of the original templight code has been essentially
factored out from the original patch, i.e., the "real" templight code now
sits in a separate project at:

https://github.com/mikael-s-persson/templight

There is also a more up-to-date clang patch on there, which I will
re-submit eventually to Clang when there is more sign of life from the
community (I don't like wasting my valuable time throwing patch-updates
into a bottomless pit).

The original patch also lacked any specific authorship notice and
licensing. But I'm more than happy to add any co-author that is needed to
satisfy your submission requirements. All I want is the patch to go
through, and I think all other interested parties and original authors feel
the same. And after all, we are only talking about a few dozen trivial
lines of code here.

And for the record, I'm a bit annoyed by the "mail in someone else's code",
as it kind of belittles the massive refactoring work I did on the original
patch, and the fact that the majority of what it is now is my own code.

> I think that ActiveTemplateInstantiation should be changed to a form of
> InstantiationRecord and held from the relevant AST nodes.

Maybe so, I don't know, and frankly I don't care. Well, I do care for
having good template instantiation code in Clang, but I cannot afford to
care too much about design decisions that are not mine to take.

> The changes to ATI to pull it out of Sema and make it part of the AST are
> complicated enough, I think you should send that out as a first patch.

I think there is a confusion here. I never pulled the ATI into the AST, I
only pulled it out of the Sema class. It has literally just gone from being
a nested class within Sema to being a class at global scope (clang
namespace, of course!), that's all. There is no change whatsoever in the
code except for a change in scope and header-file placement (to avoid
including the Sema header, and just because nested classes are a bad idea
(I come from Boost library development, the design guidelines are more
strict over there)).
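To illustrate the point about scope: a class at namespace scope can be forward-declared, so an observer interface can refer to it without including the header that defines it, whereas a class nested inside Sema cannot be declared without the full Sema definition. The sketch below uses a `sketch` namespace and a made-up `EntityName` member; neither is taken from the Clang sources.

```cpp
#include <string>
#include <vector>

namespace sketch {

// Forward declaration: legal because the class lives at namespace scope.
// A nested class (e.g. Sema::ActiveTemplateInstantiation) could not be
// declared here without the complete definition of the enclosing class.
class ActiveTemplateInstantiation;

// The observer interface only needs the forward declaration, since it takes
// the instantiation record by reference.
class TemplateInstantiationObserver {
public:
  virtual ~TemplateInstantiationObserver() = default;
  virtual void atTemplateBegin(const ActiveTemplateInstantiation &Inst) = 0;
};

// Full definition; in a real code-base this would live in its own header,
// included only by code that looks inside the record.
class ActiveTemplateInstantiation {
public:
  std::string EntityName; // hypothetical member, for illustration only
};

// A concrete observer that needs the full definition.
class NameRecorder : public TemplateInstantiationObserver {
public:
  void atTemplateBegin(const ActiveTemplateInstantiation &Inst) override {
    Names.push_back(Inst.EntityName);
  }
  std::vector<std::string> Names;
};

} // namespace sketch
```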

If you want to work on making ATIs a part of the AST, you're welcome to try
/ propose it. It is none of my concern, and in fact I would object to it
(see below).

> There's also template instantiation for different reasons which are not
> currently recorded (ie., due to overload resolution, due to virtual
> methods). Since that's a change to ATI, that would logically go next.

You're right, the categories of template instantiations are too broad. That
is something I would be interested in discussing and intend to bring up
after my patch has gone through. The cases of overload resolution and
virtual methods are not exactly the first ones that come to mind for me
though. The additional categories I was going to propose are for
differentiating class template instantiations, function declaration
instantiations and function definition instantiations. The lack of
categorization of those things currently causes templight traces to have
ambiguous entries, which I would like to avoid.

> I think we usually use "Callbacks" instead of "Observer", similar to
> PPCallbacks?

Fine. When we get closer to getting this (D5767) patch through, that can
easily be changed, no problem.

> At a high level, I think that what we currently do in Sema by tracking
> template instantiations as a stack should be available in the AST after
> instantiation is complete.

I'm not sure if this makes sense in terms of the AST, after all, the stack
of template instantiations are compilation steps, not actual entities in
the code. My understanding is that the AST is only there to record all the
entities in the code. I see this as a case of semantic pollution. What
other records of compilation steps are recorded in the AST?

Also, if the AST is to keep a record of all the template instantiations
that occurred, this could significantly increase memory consumption (and
thus, speed as well). It is a good thing that the compiler throws away this
information because it would and could amount to a lot if it kept it all (I
would say approximately 5-10% increase in memory consumption, from my
experience). This would be a lot of passive overhead (and therefore, would
have to be disabled by default, of course). I certainly would not want such
overhead by default in any compiler that I use (and I'm not sure that
people who use libclang would want this either).

> Then we could query the AST to ask why something was instantiated instead
> of tracing the act of instantiation. Is there any use case for templight
> where this wouldn't work?

Yes, pretty much all of templight as a profiler would not work that way,
unless the AST also held records of time- and memory- consumption during
those instantiation steps. The AST is useful to the templight debugger and
is also used in conjunction with the templight traces by the metashell
utilities (that Zoltan linked to). The AST describes the code, the
templight traces describe the compilation process, they are entirely
orthogonal features (yet complementary), as far as I'm concerned.

To me, it is important to stress that, although the debugging and
interactive querying of the AST is a neat feature, the real
"production-critical" feature here is the profiling. This is my primary
interest, and those of my peers, i.e., the people who write production
template-heavy code that need to optimize the compilation time and memory.
Focusing too much on the debugging and AST querying aspects could impede
that primary goal, as far as I'm concerned.

And as Zoltan mentioned, there could be issues with the safe-mode and
dealing with failures to instantiate the templates if we are going to rely
on "successful" AST nodes to describe the compilation process.

There are basically three use-case categories for templight (from most
"novice" to most "professional"): learning, debugging and profiling.
Learning and debugging imply that things might be failing a lot, and
profiling implies that there is no need to understand the process but a need
to identify performance bottlenecks. Your approach does not serve any of
these domains particularly well.

Thanks,
Mikael.

> Hi all,
>
> First of all, Nick: many thanks for helping us. Mikael did a huge amount of
> work refactoring templight, and now there are even clients eager to use it:
> https://github.com/sabel83/metashell
>
> > Hi Mikael,
> >
> > Sorry to lead with this, but it's important. The llvm project requires
> > that patches be submitted by their author (or authors). If you grabbed the
> > original templight code and refactored it, then it may be coauthored
> > between you and the original authors. Our policy requires that an author
> > submit their own code, not mail in someone else's,
> > see http://llvm.org/docs/DeveloperPolicy.html#attribution-of-changes for
> > specifics.
>
> We completely support Mikael's work on refactoring the original templight.
> Naturally, we are happy to adopt Mikael as a co-author or give him any kind
> of authorization to submit this patch.

That is good to hear!

> > > Changes to existing Clang code-base:
> > >
> > > [...]
> > >
> > > (5) Added callbacks for begin / end of template instantiations. These
> > > were added in all the places where the original Templight code had
> > > templight calls. This replaces the original templight tracing calls.
> >
> > At a high level, I think that what we currently do in Sema by tracking
> > template instantiations as a stack (Sema::ActiveTemplateInstantiations)
> > should be available in the AST after instantiation is complete. Then we
> > could query the AST to ask why something was instantiated instead of
> > tracing the act of instantiation. Is there any use case for templight
> > where this wouldn't work?
>
> Don't forget, this is a template debugger and profiler function.

Got it. I did forget about profiling.

> As a debugger we want to follow the chain of template related events even
> if the instantiation was not successfully finished. Unpaired begin-end
> instantiation events may happen and are reported in templight.

I'm not sure what you're describing. Do these ever occur in a way that is
not a bug in clang?

> Moreover, this strategy allows templight to work in "safe mode": emitting
> events immediately as they happened. This way we have the last events even
> if the compiler crashes - which unfortunately may happen for template
> metaprograms.
>
> As a profiler templight assigns timestamps for beginning/end events to
> measure the time clang has spent on that instantiation.

I think a template debugger would be a great new major feature for clang,
and I've heard requests for it in person. However, it seems that different
people have different ideas of what exactly "template debugger" means. I
don't know of any effort which is farther along than templight, which
suggests that you have real uses for template debugging rather than
hypothetical users.

Nick

> Hi Nick,
>
> Glad to finally get a sign of life from the clang dev team!

I'm more on the LLVM side. While I'm willing to make some changes to clang
I'll sometimes conclude that I don't know an area of clang well enough to
approve a patch, on a case by case basis.

> > The llvm project requires that patches be submitted by their author (or
> > authors). If you grabbed the original templight code and refactored it,
> > then it may be coauthored between you and the original authors. Our policy
> > requires that an author submit their own code, not mail in someone else's
>
> The patches from this email are very out-dated (and I was completely
> unfamiliar with the process of patch submissions... well, I still am to be
> honest ;) ). I submitted a much smaller patch after successfully managing
> to extract all the templight code out of the clang source, to make it a
> separate tool instead, leaving only very minor "hooks" to be added to Clang
> itself. That patch can be found at:
>
> http://reviews.llvm.org/D5767
>
> As for the authorship, there are a few things to notice.
>
> First of all, in the new patch that I submitted, all of the code is my own
> code, because all of the original templight code has been essentially
> factored out from the original patch, i.e., the "real" templight code now
> sits in a separate project at:
>
> https://github.com/mikael-s-persson/templight
>
> There is also a more up-to-date clang patch on there, which I will
> re-submit eventually to Clang when there is more sign of life from the
> community (I don't like wasting my valuable time throwing patch-updates
> into a bottomless pit).
>
> The original patch also lacked any specific authorship notice and
> licensing. But I'm more than happy to add any co-author that is needed to
> satisfy your submission requirements. All I want is the patch to go
> through, and I think all other interested parties and original authors feel
> the same. And after all, we are only talking about a few dozen trivial
> lines of code here.
>
> And for the record, I'm a bit annoyed by the "mail in someone else's
> code", as it kind of belittles the massive refactoring work I did on the
> original patch, and the fact that the majority of what it is now is my own
> code.

My sincere apologies, I did not intend to belittle. I was only summarizing
what the policy says, not making any claim about the work you have done. If
you tell me that the code meets the requirements in the developer's policy,
that is enough for me.

> > I think that ActiveTemplateInstantiation should be changed to a form of
> > InstantiationRecord and held from the relevant AST nodes.
>
> Maybe so, I don't know, and frankly I don't care. Well, I do care for
> having good template instantiation code in Clang, but I cannot afford to
> care too much about design decisions that are not mine to take.

Ok. It matters to me due to code refactoring tools.

> > The changes to ATI to pull it out of Sema and make it part of the AST are
> > complicated enough, I think you should send that out as a first patch.
>
> I think there is a confusion here. I never pulled the ATI into the AST, I
> only pulled it out of the Sema class. It has literally just gone from being
> a nested class within Sema to being a class at global scope (clang
> namespace, of course!), that's all. There is no change whatsoever in the
> code except for a change in scope and header-file placement (to avoid
> including the Sema header, and just because nested classes are a bad idea
> (I come from Boost library development, the design guidelines are more
> strict over there)).

My mistake, I was thinking that you pulled it out of Sema-the-library, not
Sema-the-class. In particular, it still uses sema::TemplateDeductionInfo.

> If you want to work on making ATIs a part of the AST, you're welcome to try
> / propose it. It is none of my concern, and in fact I would object to it
> (see below).
>
> > There's also template instantiation for different reasons which are not
> > currently recorded (ie., due to overload resolution, due to virtual
> > methods). Since that's a change to ATI, that would logically go next.
>
> You're right, the categories of template instantiations are too broad.
> That is something I would be interested in discussing and intend to bring
> up after my patch has gone through. The cases of overload resolution and
> virtual methods are not exactly the first ones that come to mind for me
> though. The additional categories I was going to propose are for
> differentiating class template instantiations, function declaration
> instantiations and function definition instantiations. The lack of
> categorization of those things currently causes templight traces to have
> ambiguous entries, which I would like to avoid.
>
> > I think we usually use "Callbacks" instead of "Observer", similar to
> > PPCallbacks?
>
> Fine. When we get closer to getting this (D5767) patch through, that can
> easily be changed, no problem.
>
> > At a high level, I think that what we currently do in Sema by tracking
> > template instantiations as a stack should be available in the AST after
> > instantiation is complete.
>
> I'm not sure if this makes sense in terms of the AST, after all, the stack
> of template instantiations are compilation steps, not actual entities in
> the code. My understanding is that the AST is only there to record all the
> entities in the code. I see this as a case of semantic pollution. What
> other records of compilation steps are recorded in the AST?

Implicit casts, for one. Expressions in the C++ standard have associated
grammar therefore implicit casts are not expressions, even though we call
them ImplicitCastExpr in clang. We could remove them and have the consumers
of the AST imagine them as needed, saving memory and possibly compilation
time. Or possibly costing compile time, if removing them would make Clang's
own code more complicated or force us to check more conditions and
recompute properties more often.

> Also, if the AST is to keep a record of all the template instantiations
> that occurred, this could significantly increase memory consumption (and
> thus, speed as well).

Yes. I am also concerned about this. We had the same concern about keeping
around source locations and macro expansions, but it turns out that clang
was able to make it work efficiently. Similar efficiency would be a
requirement, and no, I don't know how to do it. I'm assuming we would
figure it out.

> It is a good thing that the compiler throws away this information because
> it would and could amount to a lot if it kept it all (I would say
> approximately 5-10% increase in memory consumption, from my experience).
> This would be a lot of passive overhead (and therefore, would have to be
> disabled by default, of course). I certainly would not want such overhead
> by default in any compiler that I use (and I'm not sure that people who use
> libclang would want this either).

Part of the question is whether we could use it to simplify code in Sema
itself. I don't think anyone wants it to be an optional feature, it should
either be efficient enough to be always-on, or not there at all.

> > Then we could query the AST to ask why something was instantiated instead
> > of tracing the act of instantiation. Is there any use case for templight
> > where this wouldn't work?
>
> Yes, pretty much all of templight as a profiler would not work that way,
> unless the AST also held records of time- and memory- consumption during
> those instantiation steps. The AST is useful to the templight debugger and
> is also used in conjunction with the templight traces by the metashell
> utilities (that Zoltan linked to). The AST describes the code, the
> templight traces describe the compilation process, they are entirely
> orthogonal features (yet complementary), as far as I'm concerned.
>
> To me, it is important to stress that, although the debugging and
> interactive querying of the AST is a neat feature, the real
> "production-critical" feature here is the profiling. This is my primary
> interest, and those of my peers, i.e., the people who write production
> template-heavy code that need to optimize the compilation time and memory.
> Focusing too much on the debugging and AST querying aspects could impede
> that primary goal, as far as I'm concerned.
>
> And as Zoltan mentioned, there could be issues with the safe-mode and
> dealing with failures to instantiate the templates if we are going to rely
> on "successful" AST nodes to describe the compilation process.
>
> There are basically three use-case categories for templight (from most
> "novice" to most "professional"): learning, debugging and profiling.
> Learning and debugging imply that things might be failing a lot, and
> profiling implies that there is no need to understand the process but a need
> to identify performance bottlenecks. Your approach does not serve any of
> these domains particularly well.

This is really enlightening, thanks.

I'm convinced that my idea of moving template instantiation information
into the AST is orthogonal to the needs of templight. Also, I think the
observer you want to add is a fine idea and we should add that.

Nick