Switching terminology from 'instantiation' to 'expansion' for macros?

This came up some time ago on IRC, and several of us (myself, John McCall, and maybe others) really liked the idea: systematically switch from ‘instantiation’ to ‘expansion’ in terminology relating to macros and the preprocessor. (Is there a better word than ‘expand’? I couldn’t come up with one. The following argument should be independent of the wording actually chosen, so long as it differs from ‘instantiation’.)

The reasoning is twofold:

  1. It more accurately describes the underlying process (token expansion), including the fact that the process happens on each use of a macro, not just on the first use with a particular set of arguments.

  2. It helps users (and likely developers) distinguish between diagnostics and systems relating to macros vs. templates.
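To make the first point concrete, here is a small sketch. It is not taken from Clang, and the names `shared_counter` and `PER_USE_COUNTER` are made up for illustration. It relies on the fact that every lambda expression has its own closure type: each use of the macro is expanded into a fresh lambda with its own static, while all uses of the template with the same argument share a single instantiation.

```cpp
#include <cassert>

// Function template: every use of shared_counter<int>() refers to the one
// instantiation for T = int, so all uses share a single static variable.
template <typename T>
int &shared_counter() {
  static int n = 0;
  return n;
}

// Macro: each use is expanded into a fresh copy of these tokens, so each use
// site gets its own lambda (a distinct closure type) with its own static.
#define PER_USE_COUNTER() ([]() -> int & { static int n = 0; return n; }())

// Reports how far two uses of shared_counter<int>() moved the shared static.
int template_delta() {
  int before = shared_counter<int>();
  ++shared_counter<int>();
  ++shared_counter<int>();
  return shared_counter<int>() - before;
}

// Increments two separate expansions, then reads a third one, whose own
// static has never been touched.
int macro_third_use() {
  ++PER_USE_COUNTER();
  ++PER_USE_COUNTER();
  return PER_USE_COUNTER();
}

void demo() {
  assert(template_delta() == 2);   // one instantiation, shared state
  assert(macro_third_use() == 0);  // three expansions, three separate statics
}
```

Calling `demo()` should pass both assertions, which is the distinction the word ‘expansion’ is meant to capture.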

I think the second point is perhaps the more important one here. As a bit of an extreme case, consider the following code:

% cat t5.cc
struct S1;
struct S2;
template <typename T> struct X1 { T value; };
template <typename T> struct X2 { X1<T> value; };
#define M0 X2<S1>().value;
#define M1(x, y, z) X2<S2> foo = y
#define M2(x, y, z) M1(x, y, z)
M2(1, M0, 3);

% ./bin/clang -fsyntax-only t5.cc
t5.cc:3:37: error: field has incomplete type 'S1'
template <typename T> struct X1 { T value; };
                                    ^
t5.cc:4:41: note: in instantiation of template class 'X1<S1>' requested here
template <typename T> struct X2 { X1<T> value; };
                                        ^
t5.cc:8:7: note: in instantiation of template class 'X2<S1>' requested here
M2(1, M0, 3);
      ^
t5.cc:5:12: note: instantiated from:
#define M0 X2<S1>().value;
           ^
t5.cc:7:27: note: instantiated from:
#define M2(x, y, z) M1(x, y, z)
                          ^
t5.cc:6:34: note: instantiated from:
#define M1(x, y, z) X2<S2> foo = y
                                 ^

I think it’s very nice to have different wording for the macro backtraces. As a bit of a straw man, I’ve just replaced ‘instantiated’ with ‘expanded’, and I already find the result somewhat better:

% ./bin/clang -fsyntax-only t5.cc
t5.cc:3:37: error: field has incomplete type 'S1'
template <typename T> struct X1 { T value; };
                                    ^
t5.cc:4:41: note: in instantiation of template class 'X1<S1>' requested here
template <typename T> struct X2 { X1<T> value; };
                                        ^
t5.cc:8:7: note: in instantiation of template class 'X2<S1>' requested here
M2(1, M0, 3);
      ^
t5.cc:5:12: note: expanded from:
#define M0 X2<S1>().value;
           ^
t5.cc:7:27: note: expanded from:
#define M2(x, y, z) M1(x, y, z)
                          ^
t5.cc:6:34: note: expanded from:
#define M1(x, y, z) X2<S2> foo = y
                                 ^

I’d like to gradually move the note text toward this instead:

% ./bin/clang -fsyntax-only t5.cc
t5.cc:3:37: error: field has incomplete type 'S1'
template <typename T> struct X1 { T value; };
                                    ^
t5.cc:4:41: note: in instantiation of template class 'X1<S1>' requested here
template <typename T> struct X2 { X1<T> value; };
                                        ^
t5.cc:8:7: note: in instantiation of template class 'X2<S1>' requested here
M2(1, M0, 3);
      ^
t5.cc:5:12: note: 'M0' expanded from the macro defined here:
#define M0 X2<S1>().value;
           ^
t5.cc:7:27: note: 'M2(...)' expanded from the macro defined here:
#define M2(x, y, z) M1(x, y, z)
                          ^
t5.cc:6:34: note: 'M1(...)' expanded from the macro defined here:
#define M1(x, y, z) X2<S2> foo = y
                                 ^

What do folks think? Would it be OK to work toward this in increments, starting with ‘expanded from the macro defined here’, and then working on a patch to add the name of the macro that we’re showing the definition for?

As a related but not strictly necessary additional step, how do folks feel about moving the APIs, comments, etc. inside Clang’s code itself to the same nomenclature? I’d really like to see this, as it would make reading these parts of the codebase easier in the same way I think it makes reading the diagnostics above easier. Thoughts?

I like all of it. I do have a question: is the term "expansion" also used in another case involving templates, like parameter packs? Should we try to use a third term for macros in that case, like "substitution" (which is also used for templates (think "SFINAE"); maybe there is no winning this battle)?

- John

I like all of it. I do have a question: is the term “expansion” also used in another case involving templates, like parameter packs?

Oof, yea, it is.

On the good side, the standard uses ‘expansion’ in that context almost exclusively as ‘pack expansion’, a term defined in the standard. I suspect any use we made of it in diagnostics could also be profitably tied to the word ‘pack’.
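For anyone who wants the two senses side by side, here is a minimal sketch (the names `sum` and `TWICE` are invented for this example). The `rest...` in the template is a pack expansion in the standard's sense, while `TWICE` is ordinary macro expansion done by the preprocessor:

```cpp
// Base case for the recursion below.
constexpr int sum() { return 0; }

template <typename T, typename... Rest>
constexpr int sum(T first, Rest... rest) {
  // `rest...` is a pack expansion: it yields one argument per pack element.
  return first + sum(rest...);
}

// Macro expansion: plain token replacement, with the argument tokens
// appearing twice in the result.
#define TWICE(x) ((x) + (x))

static_assert(sum(1, 2, 3) == 6, "pack expansion forwards 2 and 3");
static_assert(TWICE(sum(1, 2)) == 6, "macro expansion duplicates the tokens");
```

Because the template case is consistently spelled ‘pack expansion’, a diagnostic that says ‘pack’ should stay distinguishable from a macro note.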

Also, both the C and C++ standards refer to ‘macro expansion’ to mean this.

Should we try to use a third term for macros in that case, like “substitution” (which is also used for templates (think “SFINAE”);

Good idea. The C standard uses ‘substitution’ to describe the process by which arguments to function-like macros are substituted into the macro body (C99 6.10.3.1). We should at least try to use ‘substitution’ in the code surrounding this process in Clang, IMO.
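As a quick illustration of where the two steps differ (my own sketch, using the usual stringification idiom; `VALUE`, `STR` and `XSTR` are names made up for the example): an argument is normally fully macro-expanded before it is substituted into the body, except when the parameter is an operand of `#` or `##`.

```cpp
#define VALUE 42

// `x` is the operand of `#`, so the argument is substituted as spelled,
// without being macro-expanded first.
#define STR(x) #x

// Here `x` is an ordinary use of the parameter, so the argument is fully
// macro-expanded before being substituted into the body.
#define XSTR(x) STR(x)

static_assert(sizeof(STR(VALUE)) == sizeof("VALUE"),
              "argument substituted as written");
static_assert(sizeof(XSTR(VALUE)) == sizeof("42"),
              "argument expanded, then substituted");
```

So `STR(VALUE)` yields the string "VALUE" while `XSTR(VALUE)` yields "42": substitution alone in the first case, expansion followed by substitution in the second.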

That said, I don’t think there are many contexts where we need or want to distinguish between macro expansion and argument substitution, but I’d not be opposed to using those two compound terms if the situation arises.

maybe there is no winning this battle)?

I’m fairly certain there is no perfect word. ;] That said, I like that both macro expansion and argument substitution are mentioned in the standard.

The only other term I’ve thought about is ‘replacement’, as that’s what the standard uses in most cases. For diagnostics and for thinking about how the preprocessor works, though, I actually find ‘expansion’ much more helpful. Still, if others like the C standard’s ‘replacement’ terminology, I’d be down with that. =]

I like 'expansion'. Specifically, I like "expanded from macro here", or something like that. I wouldn't try to call out specific cases of where the text came from; there's a lot of value in having a consistent message for these, because users can recognize the shape of a macro-expansion note at a glance without having to actually read it.

It is not worth trying to avoid the name collision with variadic template parameter packs.

Thanks for doing this!

John.

John and I chatted on IRC about various wordings for these notes. He was particularly concerned about keeping them short and concise, which I agree with, and hopefully the macro name itself won’t negatively impact that too much. Based on your suggestion my two candidate wordings would be:

“note: expanded from macro ‘foo’:”

and

“note: expanded from macro ‘foo’ defined here:”

The latter is a bit wordy, but helps give context to the snippet that follows, and has good structural similarity to the template message:
“note: in instantiation of template class ‘foo’ requested here:”

One concern with just removing “defined” is the ambiguity of “here” referring to either where the expansion occurs or the definition. Without the “here”, the snippet would hopefully associate with the “macro ‘foo’”, but it’s still a touch vague…

More thoughts on wording welcome!

Agreed on all points.

  - Doug

+1

-Chris

Thoughts on my second question (that of the names used in the actual code of Clang)? I’ll start on fixing the diagnostics based on this plan today.

I like the whole proposal, and having the name change reflected in clang source code would be ideal, thanks for your work!

Cool, here is the first patch which just changes the word in the diagnostics (none of the fancy new diagnostics I’d like to get to yet…) and the first 4 patches in moving the code over. This moves essentially all of the code local to the preprocessor and the diagnostic printer over. How is this looking?

Can I rely on post-commit review for the rest of the terminology switch? That would be really useful as several of those are going to be massive, but entirely mechanical patches (SourceManager cleanup… ugh…).

Do we want to add the new terms to libclang’s C interface? (they’re currently only used for the names of enums, so we should be able to leave the old names in with the same value for compatibility)

Finally, the change to introduce the more detailed note diagnostic text will require a bit of re-working the infrastructure in TextDiagnosticPrinter. I’d like to generally refactor how some of that file is implemented. Can I just commit those refactorings, and then the changes to implement the new functionality, or would you like pre-commit review there? (I’m fine either way, just trying to plan out my work.)

Thanks!

0001-switch-diagnostic-messages.patch (8.4 KB)

0002-switch-test-diagnostic-printer.patch (5.04 KB)

0003-switch-token-lexer.patch (9.56 KB)

0004-switch-lexer.patch (13.2 KB)

0005-switch-lex.patch (23.9 KB)

Yes, it makes sense to update the names used in Clang’s code as well as the diagnostics. Thanks for doing this!

  - Doug

Cool, here is the first patch which just changes the word in the diagnostics (none of the fancy new diagnostics I’d like to get to yet…) and the first 4 patches in moving the code over. This moves essentially all of the code local to the preprocessor and the diagnostic printer over. How is this looking?

They all look good to me.

Can I rely on post-commit review for the rest of the terminology switch? That would be really useful as several of those are going to be massive, but entirely mechanical patches (SourceManager cleanup… ugh…).

Yes, post-commit review is fine.

Do we want to add the new terms to libclang’s C interface? (they’re currently only used for the names of enums, so we should be able to leave the old names in with the same value for compatibility)

Yes, it makes sense to add these new terms to libclang’s C interface, so long as we keep the old ones in place.
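Roughly, the compatibility scheme being agreed on here looks like the sketch below. The enum and enumerator names are hypothetical stand-ins, not the actual libclang declarations, and the values are arbitrary; the only point is that the old spelling stays as an alias for the new one with the same value, so existing client code keeps compiling and comparing equal.

```cpp
// Hypothetical enum standing in for the real one in the C interface.
enum ExampleCursorKind {
  Example_MacroDefinition = 1,
  // New spelling.
  Example_MacroExpansion = 2,
  // Old spelling, kept as an alias with the same value for compatibility.
  Example_MacroInstantiation = Example_MacroExpansion
};

// Code written against the new name also accepts values produced by
// callers that still use the old one.
constexpr bool is_macro_expansion(ExampleCursorKind kind) {
  return kind == Example_MacroExpansion;
}

static_assert(Example_MacroInstantiation == Example_MacroExpansion,
              "old and new names share one value");
static_assert(is_macro_expansion(Example_MacroInstantiation),
              "old name still works with new code");
```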

Finally, the change to introduce the more detailed note diagnostic text will require a bit of re-working the infrastructure in TextDiagnosticPrinter. I’d like to generally refactor how some of that file is implemented. Can I just commit those refactorings, and then the changes to implement the new functionality, or would you like pre-commit review there? (I’m fine either way, just trying to plan out my work.)

Refactoring in TextDiagnosticPrinter seems like something that can be post-commit reviewed.

  - Doug