ASTMatchers.h and its internal linkage variable definitions

Hey Manuel,

In an effort to apply Modular Code Generation to Clang internally, I came across the fact that this header contains a bunch of internal linkage variables. (now, maybe modular code generation should be compatible with internal linkage namespace-scoped variables… I’m looking into that)

These variables technically create ODR violations any time they’re ODR-used in another inline function used in more than one translation unit.

Is there a good way we could fix this header so it doesn’t create ODR violations (hopefully by not containing internal linkage entities)? I mean one simple solution is to declare all these things in a header and define them out of line. Since they have no actual state, that’s probably pretty low-cost except for a lot of duplication. I’m happy to use a .def/.inc file to help remove a bunch of that duplication, if useful.

Open to ideas,

  • Dave

Hey Manuel,

In an effort to apply Modular Code Generation to Clang internally, I came across the fact that this header contains a bunch of internal linkage variables. (now, maybe modular code generation should be compatible with internal linkage namespace-scoped variables… I’m looking into that)

These variables technically create ODR violations any time they’re ODR-used in another inline function used in more than one translation unit.

Is there a good way we could fix this header so it doesn’t create ODR violations (hopefully by not containing internal linkage entities)? I mean one simple solution is to declare all these things in a header and define them out of line. Since they have no actual state, that’s probably pretty low-cost except for a lot of duplication. I’m happy to use a .def/.inc file to help remove a bunch of that duplication, if useful.

I remember having that argument a long time ago with somebody (perhaps Sam (cc’ed)?)

Ping?

Richard & I decided that supporting this particular case of namespace-scope static variables in modular headers using modular codegen. (specifically the case where the namespace-scope static variable is referenced from an inline function - thus making a true ODR violation (in classic C++ this’d be an ODR violation if more than one TU used that header - so it seems OK to not support this in modular codegen))

If the namespace-scoped static variable is unreferenced from inline functions, that’s fine - such as the iostreams global initializer.

I’ll send a CR to move these definitions out of line.

Ping?

Richard & I decided that supporting this particular case of namespace-scope static variables in modular headers using modular codegen.

Part of the sentence is missing?

(specifically the case where the namespace-scope static variable is referenced from an inline function - thus making a true ODR violation (in classic C++ this’d be an ODR violation if more than one TU used that header - so it seems OK to not support this in modular codegen))

Calling is not an odr use, right? And I thought referencing alone doesn’t lead to an odr violation?

If the namespace-scoped static variable is unreferenced from inline functions, that’s fine - such as the iostreams global initializer.

I’ll send a CR to move these definitions out of line.

I’m happy with that, too. I assume once c++17 hits the shelves we’ll be able to replace it anyway?

Ping?

Richard & I decided that supporting this particular case of namespace-scope static variables in modular headers using modular codegen.

Part of the sentence is missing?

Huh, right you are…

What I was going to say (but lost track through all the caveats/criteria): Richard & I decided that is invalid (it’s a legit ODR violation) & doesn’t need to be supported.

(though there may still be issues with the non-ODR violation - such as the use of internal linkage namespace-scope reference variables… :confused: )

(specifically the case where the namespace-scope static variable is referenced from an inline function - thus making a true ODR violation (in classic C++ this’d be an ODR violation if more than one TU used that header - so it seems OK to not support this in modular codegen))

Calling is not an odr use, right? And I thought referencing alone doesn’t lead to an odr violation?

This is what the ASTMatcher code boils down to:

struct foo {
foo();
};
const foo f; // this has internal linkage
const foo &x() { return f; }

if ‘x’ appears in more than one TU, it’s an ODR violation because ‘f’ resolves to different entities in each TU so there are multiple distinct definitions of ‘x’.

In this case, for example, ‘f’ is ‘callExpr’ (ASTMatchers.h:1138) and ‘x’ is the AST_POLYMORPHIC_MATCHER_P2 on ASTMatchers.h:3491 that uses ‘callExpr’ in an inline function in the header.

If the namespace-scoped static variable is unreferenced from inline functions, that’s fine - such as the iostreams global initializer.

I’ll send a CR to move these definitions out of line.

I’m happy with that, too. I assume once c++17 hits the shelves we’ll be able to replace it anyway?

With inline variables? Yeah, I think so. Though I should test/figure out whether/how modular codegen will work with inline variables…

Ping?

Richard & I decided that supporting this particular case of namespace-scope static variables in modular headers using modular codegen.

Part of the sentence is missing?

Huh, right you are…

What I was going to say (but lost track through all the caveats/criteria): Richard & I decided that is invalid (it’s a legit ODR violation) & doesn’t need to be supported.

(though there may still be issues with the non-ODR violation - such as the use of internal linkage namespace-scope reference variables… :confused: )

(specifically the case where the namespace-scope static variable is referenced from an inline function - thus making a true ODR violation (in classic C++ this’d be an ODR violation if more than one TU used that header - so it seems OK to not support this in modular codegen))

Calling is not an odr use, right? And I thought referencing alone doesn’t lead to an odr violation?

This is what the ASTMatcher code boils down to:

struct foo {
foo();
};
const foo f; // this has internal linkage
const foo &x() { return f; }

Where do we return matcher reference types?
This should just return foo by value?

if ‘x’ appears in more than one TU, it’s an ODR violation because ‘f’ resolves to different entities in each TU so there are multiple distinct definitions of ‘x’.

In this case, for example, ‘f’ is ‘callExpr’ (ASTMatchers.h:1138) and ‘x’ is the AST_POLYMORPHIC_MATCHER_P2 on ASTMatchers.h:3491 that uses ‘callExpr’ in an inline function in the header.

That just calls callExpr() though, and never ODR uses it?

Ping?

Richard & I decided that supporting this particular case of namespace-scope static variables in modular headers using modular codegen.

Part of the sentence is missing?

Huh, right you are…

What I was going to say (but lost track through all the caveats/criteria): Richard & I decided that is invalid (it’s a legit ODR violation) & doesn’t need to be supported.

(though there may still be issues with the non-ODR violation - such as the use of internal linkage namespace-scope reference variables… :confused: )

(specifically the case where the namespace-scope static variable is referenced from an inline function - thus making a true ODR violation (in classic C++ this’d be an ODR violation if more than one TU used that header - so it seems OK to not support this in modular codegen))

Calling is not an odr use, right? And I thought referencing alone doesn’t lead to an odr violation?

This is what the ASTMatcher code boils down to:

struct foo {
foo();

Ah, here’s the detail I missed: it actually boils down to
struct foo {
ExpressionTemplateMatcherType operator()(…);
}
The only use of foo is by calling the operator on it.

Ping?

Richard & I decided that supporting this particular case of namespace-scope static variables in modular headers using modular codegen.

Part of the sentence is missing?

Huh, right you are…

What I was going to say (but lost track through all the caveats/criteria): Richard & I decided that is invalid (it’s a legit ODR violation) & doesn’t need to be supported.

(though there may still be issues with the non-ODR violation - such as the use of internal linkage namespace-scope reference variables… :confused: )

(specifically the case where the namespace-scope static variable is referenced from an inline function - thus making a true ODR violation (in classic C++ this’d be an ODR violation if more than one TU used that header - so it seems OK to not support this in modular codegen))

Calling is not an odr use, right? And I thought referencing alone doesn’t lead to an odr violation?

This is what the ASTMatcher code boils down to:

struct foo {
foo();

Ah, here’s the detail I missed: it actually boils down to
struct foo {
ExpressionTemplateMatcherType operator()(…);
}
The only use of foo is by calling the operator on it.

Right, yep yep (sorry I skipped over a few too many steps in my reduction/example). Calling a non-static member function is an ODR use, roughly equivalent to passing the value by reference to another function (much the same as returning by reference).

  • Dave

Ping?

Richard & I decided that supporting this particular case of namespace-scope static variables in modular headers using modular codegen.

Part of the sentence is missing?

Huh, right you are…

What I was going to say (but lost track through all the caveats/criteria): Richard & I decided that is invalid (it’s a legit ODR violation) & doesn’t need to be supported.

(though there may still be issues with the non-ODR violation - such as the use of internal linkage namespace-scope reference variables… :confused: )

(specifically the case where the namespace-scope static variable is referenced from an inline function - thus making a true ODR violation (in classic C++ this’d be an ODR violation if more than one TU used that header - so it seems OK to not support this in modular codegen))

Calling is not an odr use, right? And I thought referencing alone doesn’t lead to an odr violation?

This is what the ASTMatcher code boils down to:

struct foo {
foo();

Ah, here’s the detail I missed: it actually boils down to
struct foo {
ExpressionTemplateMatcherType operator()(…);
}
The only use of foo is by calling the operator on it.

Right, yep yep (sorry I skipped over a few too many steps in my reduction/example). Calling a non-static member function is an ODR use, roughly equivalent to passing the value by reference to another function (much the same as returning by reference).

Even if the function never touches the reference? Bummer.

Ping?

Richard & I decided that supporting this particular case of namespace-scope static variables in modular headers using modular codegen.

Part of the sentence is missing?

Huh, right you are…

What I was going to say (but lost track through all the caveats/criteria): Richard & I decided that is invalid (it’s a legit ODR violation) & doesn’t need to be supported.

(though there may still be issues with the non-ODR violation - such as the use of internal linkage namespace-scope reference variables… :confused: )

(specifically the case where the namespace-scope static variable is referenced from an inline function - thus making a true ODR violation (in classic C++ this’d be an ODR violation if more than one TU used that header - so it seems OK to not support this in modular codegen))

Calling is not an odr use, right? And I thought referencing alone doesn’t lead to an odr violation?

This is what the ASTMatcher code boils down to:

struct foo {
foo();

Ah, here’s the detail I missed: it actually boils down to
struct foo {
ExpressionTemplateMatcherType operator()(…);
}
The only use of foo is by calling the operator on it.

Right, yep yep (sorry I skipped over a few too many steps in my reduction/example). Calling a non-static member function is an ODR use, roughly equivalent to passing the value by reference to another function (much the same as returning by reference).

Even if the function never touches the reference? Bummer.

nod

Committed r318304 (and tested that this is the final change needed to successfully link a modular code generation enabled build of the clang binary - which is awesome :slight_smile: ). Happy to iterate on exactly how that’s fixed, refactor things into macros, etc. (I did introduce an alias template to help simplify some things - but happy if a more (consistent with existing code) macro solution would be preferred, etc)

Thanks!

  • Dave