[RFC] C++17 hardware constructive / destructive interference size

jyknight · May 25, 2018, 9:23pm

My own employer doesn’t make ABI stability promises for that code, and thus is fine with changing the value anytime it feels like. That’s not a generically viable strategy for a value provided by the standard library.

Additionally, before I sent that email, I looked at a number of the uses, and it appeared as though a great many could be easily modified to use a runtime-determined alignment.

JF_Bastien2 · May 25, 2018, 9:30pm

My own employer doesn’t make ABI stability promises for that code, and thus is fine with changing the value anytime it feels like. That’s not a generically viable strategy for a value provided by the standard library.

Additionally, before I sent that email, I looked at a number of the uses, and it appeared as though a great many could be easily modified to use a runtime-determined alignment.

That would be useful feedback on the paper… prior to it getting into C++17. The committee’s POV voting the paper in was that having a constexpr value was something we wanted, and so that’s what we have. At this point in time I’d like to focus on implementing C++17 as it is, and / or filing DRs as required.

jyknight · May 25, 2018, 10:12pm

Sure. I’m not on the committee. Even if I was, I certainly don’t know that I would have identified the problem…

But now that it has been identified, there’s a choice of what to do. And not implementing the function (and presumably filing a DR saying so) is seeming like a pretty reasonable option.

JF_Bastien2 · May 29, 2018, 8:39pm

The committee discussed ABI issues (Jacksonville 2016) and decided that they’d rather have them than have a proliferation of #define SOMETHING 64. That discussion occurred with Google folks in the room, it might be higher bandwidth to consult with them? The notes are unfortunately quite sparse for that discussion.

The libc++ community shouldn’t decline to implement a feature without bringing concrete feedback to the committee. Without such feedback, I’d like to move forward with an implementation plan, because we should offer full C++17 support regardless of our distaste for specific features. I’ve received good feedback on the thread so far, I’m happy to leave the discussion open for a bit, talk to committee people next week in Rapperswil, and unless feedback goes the committee’s way I’d like to pursue an implementation. Does this sound fair?

John_McCall1 · May 29, 2018, 9:27pm

My own employer doesn’t make ABI stability promises for that code, and thus is fine with changing the value anytime it feels like. That’s not a generically viable strategy for a value provided by the standard library.

Additionally, before I sent that email, I looked at a number of the uses, and it appeared as though a great many could be easily modified to use a runtime-determined alignment.

That would be useful feedback on the paper… prior to it getting into C++17. The committee’s POV voting the paper in was that having a constexpr value was something we wanted, and so that’s what we have. At this point in time I’d like to focus on implementing C++17 as it is, and / or filing DRs as required.

Sure. I’m not on the committee. Even if I was, I certainly don’t know that I would have identified the problem…

But now that it has been identified, there’s a choice of what to do. And not implementing the function (and presumably filing a DR saying so) is seeming like a pretty reasonable option.

The committee discussed ABI issues (Jacksonville 2016) and decided that they’d rather have them than have a proliferation of #define SOMETHING 64. That discussion occurred with Google folks in the room, it might be higher bandwidth to consult with them? The notes are unfortunately quite sparse for that discussion.

If the committee is okay with the possibility of otherwise-ABI-compatible target differences causing ODR violations, that seems like something they should be very explicit about.

The libc++ community shouldn’t decline to implement a feature without bringing concrete feedback to the committee. Without such feedback, I’d like to move forward with an implementation plan, because we should offer full C++17 support regardless of our distaste for specific features.

As a general goal I agree, but I think we have to acknowledge that sometimes changes will move through the committee process and only later be recognized as problematic. I don’t think this rises to the level of “export template”-style intransigence, but still, if we’re not sure how to implement what the committee has standardized, we should get guidance from the committee instead of implementing something just to tick a checkbox.

Separate question that I don’t think has been covered here: who actually owns the values of these constants? Is it really up to the C++ standard library, or is this really part of the platform ABI that’s just being exposed in a special way? That is, do different C++ standard libraries on the same target have a responsibility to ensure that they use the same value?

John.

JF_Bastien2 · May 30, 2018, 6:10pm

My own employer doesn’t make ABI stability promises for that code, and thus is fine with changing the value anytime it feels like. That’s not a generically viable strategy for a value provided by the standard library.

Additionally, before I sent that email, I looked at a number of the uses, and it appeared as though a great many could be easily modified to use a runtime-determined alignment.

That would be useful feedback on the paper… prior to it getting into C++17. The committee’s POV voting the paper in was that having a constexpr value was something we wanted, and so that’s what we have. At this point in time I’d like to focus on implementing C++17 as it is, and / or filing DRs as required.

Sure. I’m not on the committee. Even if I was, I certainly don’t know that I would have identified the problem…

But now that it has been identified, there’s a choice of what to do. And not implementing the function (and presumably filing a DR saying so) is seeming like a pretty reasonable option.

The committee discussed ABI issues (Jacksonville 2016) and decided that they’d rather have them than have a proliferation of #define SOMETHING 64. That discussion occurred with Google folks in the room, it might be higher bandwidth to consult with them? The notes are unfortunately quite sparse for that discussion.

If the committee is okay with the possibility of otherwise-ABI-compatible target differences causing ODR violations, that seems like something they should be very explicit about.

That’s fair. It was discussed but even the notes don’t reflect that outcome. Since the meeting is next week I’ll bring it up with SG1 and other groups, and circle back here.

My recollection is that the committee wants to do away with the constants everyone has and have the compiler provide them. There’s understanding that an ABI issue exists, but the value is otherwise usable in so many non-ABI places that the usefulness outweighs the one pain point. It’s far from the only thing that can affect ABI!

jyknight · May 30, 2018, 7:38pm

There’s been 3 options discussed so far – I’m not sure which (#1 or #2) you’re now proposing to implement.

Return an subtarget-dependent value, depending on the exact CPU model selected at compile-time.
Good: Allows for better memory-usage/performance.
Bad: Potential risk of ODR violations/ABI issues, due to dependency on cpu tuning flags.
Bad: Potential risk of same across versions of the compiler, if the default generic cpu tuning is changed.
Choose a single “good enough” constant value for each platform.
Good: eliminate ABI/ODR issues.
Bad: value might be too conservative for users’ desires.
e.g. returning 128 for hardware_destructive_interference_size when 64 would’ve been sufficient will waste memory.

Bad: Future CPU changes might invalidate the constant generic value, requiring either that it be changed (introducing an ABI issue again), or remain incorrect forever.
e.g. most ARM chips have had 64-byte cache-lines for a while now, so that would’ve seemed the only reasonable number to choose on ARM up until recently. But, now, some of the newest CPUs have apparently switched to 128-byte cache-lines; should we change to 128?

(Or, 2b: YOLO, 64 bytes should be good enough for all platforms!)

Decline to implement at all.
Good: avoid these issues.
Bad: users who need it must do something themselves, e.g. choose some arbitrary value e.g. 64.

JF_Bastien2 · May 30, 2018, 7:48pm

My own employer doesn’t make ABI stability promises for that code, and thus is fine with changing the value anytime it feels like. That’s not a generically viable strategy for a value provided by the standard library.

Additionally, before I sent that email, I looked at a number of the uses, and it appeared as though a great many could be easily modified to use a runtime-determined alignment.

That would be useful feedback on the paper… prior to it getting into C++17. The committee’s POV voting the paper in was that having a constexpr value was something we wanted, and so that’s what we have. At this point in time I’d like to focus on implementing C++17 as it is, and / or filing DRs as required.

Sure. I’m not on the committee. Even if I was, I certainly don’t know that I would have identified the problem…

But now that it has been identified, there’s a choice of what to do. And not implementing the function (and presumably filing a DR saying so) is seeming like a pretty reasonable option.

The committee discussed ABI issues (Jacksonville 2016) and decided that they’d rather have them than have a proliferation of #define SOMETHING 64. That discussion occurred with Google folks in the room, it might be higher bandwidth to consult with them? The notes are unfortunately quite sparse for that discussion.

The libc++ community shouldn’t decline to implement a feature without bringing concrete feedback to the committee. Without such feedback, I’d like to move forward with an implementation plan, because we should offer full C++17 support regardless of our distaste for specific features. I’ve received good feedback on the thread so far, I’m happy to leave the discussion open for a bit, talk to committee people next week in Rapperswil, and unless feedback goes the committee’s way I’d like to pursue an implementation. Does this sound fair?

There’s been 3 options discussed so far – I’m not sure which (#1 or #2) you’re now proposing to implement.

It’ll depend on what people think of these options next week.

Finkel_Hal_J · May 30, 2018, 8:58pm

I’ll add to option 3: Cause many downstream organizations to carry out-of-tree patches to implement the feature. -Hal

dblaikie · May 31, 2018, 5:43pm

[+echristo because he’s been thinking about some of these things (especially those highlighted in (1)) since implementing the target attribute support & looking at how to build code optimized for specific subtargets]

androm3da · October 20, 2018, 9:08pm

Resurrecting this thread after the lightning talk on the matter. IIRC the TL;DR might be something like “the proposal is no worse than the status quo, despite its drawbacks.”

Was there further discussion at llvm-dev? Any closer to consensus?

JF_Bastien2 · October 23, 2018, 8:09pm

Resurrecting this thread after the lightning talk on the matter. IIRC the TL;DR might be something like “the proposal is no worse than the status quo, despite its drawbacks.”

Was there further discussion at llvm-dev? Any closer to consensus?

I haven’t heard other feedback from the dev meeting. Seems folks are happy with the 9 point plan?

zygoloid · October 24, 2018, 12:34am

Resurrecting this thread after the lightning talk on the matter. IIRC the TL;DR might be something like “the proposal is no worse than the status quo, despite its drawbacks.”

Was there further discussion at llvm-dev? Any closer to consensus?

I haven’t heard other feedback from the dev meeting. Seems folks are happy with the 9 point plan?

Sure; given the constraints and the intent of WG21 it seems like it’s the best we can do.

It seems novel to me to use a builtin function rather than a predefined macro, but I’m not fundamentally opposed. Have you thought about whether the value should be exposed to preprocessor conditionals? (That seems like the big difference between using a macro and using a builtin for this.) Related: have you talked to WG14 about such a feature? (I’d guess – but I don’t know – that they’d want to expose the value as a macro, and allow its use in preprocessor constant expressions.)

echristo · October 24, 2018, 12:56am

Resurrecting this thread after the lightning talk on the matter. IIRC the TL;DR might be something like “the proposal is no worse than the status quo, despite its drawbacks.”

Was there further discussion at llvm-dev? Any closer to consensus?

I haven’t heard other feedback from the dev meeting. Seems folks are happy with the 9 point plan?

Curious how you’re planning on dealing with things like subtarget support, inlining, and code generation.

-eric

JF_Bastien2 · October 24, 2018, 3:16am

Resurrecting this thread after the lightning talk on the matter. IIRC the TL;DR might be something like “the proposal is no worse than the status quo, despite its drawbacks.”

Was there further discussion at llvm-dev? Any closer to consensus?

I haven’t heard other feedback from the dev meeting. Seems folks are happy with the 9 point plan?

Curious how you’re planning on dealing with things like subtarget support, inlining, and code generation.

On ARM there’s already some sub target knowledge of this today, and we’d need to honor it. I take it you mean per-function sub-target when you say inlining?
Code generation is unaffected since it’s constexpr… but that’s not what you’re asking

JF_Bastien2 · October 24, 2018, 3:17am

Resurrecting this thread after the lightning talk on the matter. IIRC the TL;DR might be something like “the proposal is no worse than the status quo, despite its drawbacks.”

Was there further discussion at llvm-dev? Any closer to consensus?

I haven’t heard other feedback from the dev meeting. Seems folks are happy with the 9 point plan?

Sure; given the constraints and the intent of WG21 it seems like it’s the best we can do.

It seems novel to me to use a builtin function rather than a predefined macro, but I’m not fundamentally opposed. Have you thought about whether the value should be exposed to preprocessor conditionals? (That seems like the big difference between using a macro and using a builtin for this.) Related: have you talked to WG14 about such a feature? (I’d guess – but I don’t know – that they’d want to expose the value as a macro, and allow its use in preprocessor constant expressions.)

I have not, I can add it to the list of things I need to write WG14 papers for. My experience with WG14’s mailing list hasn’t been great when pointing out simple things WG21 did related to atomics, so I don’t think a mere email will do.

ldionne · March 24, 2022, 11:02pm

I wanted to ping this old topic to say that we will be implementing this feature in libc++ on GCC, since they provide the necessary macros to do so (__GCC_DESTRUCTIVE_SIZE and __GCC_CONSTRUCTIVE_SIZE).

Is this feature perfect? No, but it does have a purpose, it was standardized and it was implemented by the other major standard libraries, so we’ll implement it as well. Not doing so just hurts our users by making their code less portable, and it makes us drag behind on our implementation of the Standard.

Also, C++ has many other things that make it much easier to break ABI than having a constexpr value that depends on the target. Concretely, if someone’s using hardware_FOO_interference_size, they probably know about ABI, and they probably understand that this value can change depending on the target (that’s the whole point!). If they use that property in a type that’s ABI facing, it’s unfortunate but it’s kind of their fault.

It would be awesome if someone fluent with Clang could go ahead and implement a builtin or macro for us to use, otherwise we’ll have to ship the feature on GCC only, which isn’t great. Concretely, Clang should probably match what GCC’s doing to keep us ABI-compatible on equivalent targets.

P.S.: If this has been implemented already in Clang and I missed it, please let me know and ignore all of the above!

jyknight · March 25, 2022, 2:43pm

It’s not changing depending on the target which is the big issue, but that it will change between compiler releases (e.g. when new CPUs are released).

In GCC’s implementation, they make it even more unstable, changing it based on the -mtune CPU selected. (If it’s gonna be ABI-unstable anyhow, might as well also be ABI-unstable w.r.t. tuning!)

GCC did implement some features to help people avoid the situations that will cause the ABI breakages due to the unstable nature of these constants:

Default-on -Winterference-size, which triggers if you use std::hardware_destructive_interference_size in a constant-evaluated context in a header or C++20 module export.
Options to override the default values (--param destructive-interference-size and --param constructive-interference-size), in case you want to explicitly stabilize them for your build.

See the commit that added it https://github.com/gcc-mirror/gcc/commit/76b75018b3d053a890ebe155e47814de14b3c9fb

And the review threads:
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575392.html

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579181.html

That seems like a reasonable compromise, and I won’t object if clang follows that precedent – though I do still wish this wasn’t standardized in the first place.

(On a sidenote, I see that even despite this problem being discussed multiple times, the maintainers being fully aware of this problem, and the default-on warning, https://github.com/gcc-mirror/gcc/commit/b52aef3a8cbcc817c18c474806a29ad7f3453f6d still managed to get into libstdc++ adding a use of hardware_destructive_interference_size to the public ABI. This was then noticed and fixed only a few months later in https://github.com/gcc-mirror/gcc/commit/0e90799071ee78f712f3b58fca7000bc0a258ade after jwakely happened to notice the warning output on stdout for a test-case which happened to be run with -Wsystem-headers enabled.)

pogo59 · March 25, 2022, 3:56pm

Drive-by comment:
I can’t help but notice this, from base/port.h in the Appendix of the paper:

// Cache line sizes for ARM: These values are not strictly correct since
// cache line sizes depend on implementations, not architectures.  There
// are even implementations with cache line sizes configurable at boot
// time.

Suggesting of course that a compile-time constant isn’t general enough. Oh well.

h-vetinari · June 23, 2022, 7:53am

For those following this thread, this has now landed as GCC-only in [libc++] Implement P0154R1 (Hardware inference size) · llvm/llvm-project@56a33ba · GitHub

Topic		Replies	Views
[RFC] Stop supporting extern instantiations with GCC C++	21	1532	July 3, 2023
Clang builtins for C++20 STL features Clang Frontend	19	438	May 2, 2019
libc++ is not using always_inline anymore! C++	15	229	November 13, 2018
Proposing a new 'alloca' parameter attribute to implement the Microsoft C++ ABI LLVM Dev List Archives	31	253	August 3, 2013
libc++ std::cout alignment trouble (was: Re: [llvm] r240144 - [SLP] Vectorize for all-constant entries.) Clang Frontend	18	234	January 13, 2016

[RFC] C++17 hardware constructive / destructive interference size

Related topics