In my last mail, and in some F2F discussions, I volunteered to write a first patch to use a Freestanding configuration. My motivation is to decouple non-lock-free atomics from platform libraries such as libatomic.a in Freestanding implementations.
I’ve gone ahead and prototyped this a few different ways. Here is the approach I would like to submit a patch for.
= Background =
To the best of my ability to tell:
I think the most robust solution is for your compiler to implement C11 _Atomic, and for that type to be sufficiently large to hold your value and your spin lock. I think that most of the libcxx implementation will “just work” at that point, although there are possibly some hidden bugs where something mistakenly assumes that sizeof(T) == sizeof(_Atomic(T)).
Making the compiler generate _Atomic correctly also gives you a lot of C compatibility.
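To make the "value plus spin lock" layout concrete, here is a minimal library-level sketch of the same idea. It is only an illustration: the names embedded_lock_atomic, wide, and exchange_demo are made up for this example, and neither a compiler's _Atomic nor libcxx is claimed to be implemented this way.

```cpp
#include <atomic>

// Hypothetical sketch: an atomic wrapper whose storage holds both the value
// and a spin lock, so no out-of-line lock table (e.g. libatomic) is needed.
// This is not libc++'s implementation.
template <class T>
class embedded_lock_atomic {
    mutable std::atomic_flag lock_;  // the embedded spin lock
    T value_;                        // the payload it protects

    void acquire() const noexcept {
        while (lock_.test_and_set(std::memory_order_acquire)) {
            // spin until the current holder clears the flag
        }
    }
    void release() const noexcept { lock_.clear(std::memory_order_release); }

public:
    explicit embedded_lock_atomic(T v) noexcept : value_(v) {
        lock_.clear(std::memory_order_relaxed);  // start in the unlocked state
    }

    T load() const noexcept {
        acquire();
        T copy = value_;
        release();
        return copy;
    }
    void store(T v) noexcept {
        acquire();
        value_ = v;
        release();
    }
    T exchange(T v) noexcept {
        acquire();
        T old = value_;
        value_ = v;
        release();
        return old;
    }
};

// A payload too wide for lock-free hardware atomics on common targets.
struct wide { long a, b, c; };

// The lock lives inside the object, so the wrapper is strictly larger than T.
static_assert(sizeof(embedded_lock_atomic<wide>) > sizeof(wide),
              "the embedded lock adds storage");

// Small helper that exercises the sketch.
inline long exchange_demo() {
    embedded_lock_atomic<wide> x(wide{1, 2, 3});
    wide old = x.exchange(wide{4, 5, 6});
    return old.a + x.load().c;  // old.a is 1, the new c is 6
}
```

The static_assert is the point: any code that assumes the atomic object has the same size as T breaks once the lock is embedded.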
If C11 compatibility isn’t a big concern, and/or getting that feature into the compilers that you need isn’t an option, then your approach sounds fine to me. I don’t make the decisions around here, though.
For the binary compatibility concerns… I don’t have great answers. On the one hand, I want to believe that the choice of freestanding or hosted is a platform decision, and that you shouldn’t ever mix freestanding object files with hosted object files. On the other hand, an awful lot of people who aren’t doing embedded, drivers, or GPU programming seem interested in freestanding, and I’m afraid that they may very well want to build one chunk of code with a freestanding flag as a style checker / optimizer, and another part of their code with hosted. I’m not sure if it makes sense to try to support that second group.
Another binary compatibility note… my search skills are failing me, but I recall seeing some blog post or documentation that basically states that, when in doubt, you should generate a library call. You can migrate from a library call to inline code over time, but you can’t migrate the other direction without breaking compatibility. I don’t think that affects the specific design questions you had though.
The best reason I can see to layer libcxx atomics on the C11 atomics is that there’s probably no other way to ensure interoperability between C1* and C++1* for toolchains that want to do that. I agree with your assessment that a toolchain using embedded locks for C1* _Atomic(T) will just work with libcxx; I have no doubt about that.
There are at least two issues RE: _Atomic(T) where my situation is concerned:
Are you hoping to enable this just on the device side, or also on the host side? If it is just on the device side, then the ABI is yours to choose and/or break. If it’s on the host side, that gets trickier. I’m also wondering if you are trying to make the layout of std::atomic match between host and device to enable cross-domain synchronization.
If your work extends to the host side of things, then it seems like you are falling into the camp that wants to mix freestanding and hosted in the same application, and that feels like an ODR trap waiting to spring.
Summarizing my earlier mail about our intentions:
a. Device-only isn’t interesting. It’s got to be host+device or it doesn’t do enough to help programmers write the programs they want to write.
b. We will create a Freestanding C++ library that conforms in every way except the namespace will be decorated. It won’t be std::, it might be cuda::std::, so you can still use your Hosted library as you used to. We’ll verify that ODR issues are addressed by this and other engineering we’ll do.
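A small sketch of why the decorated namespace sidesteps ODR conflicts. The atomic defined here is a deliberately simplified, hypothetical stand-in (a value plus a lock word), not the planned library, and which is an illustrative helper.

```cpp
#include <atomic>
#include <type_traits>

namespace cuda {
namespace std {
// Hypothetical stand-in for a Freestanding atomic whose layout differs from
// the Hosted one (here: the value plus a lock word).
template <class T>
struct atomic {
    T value;
    int lock_word;
};
}  // namespace std
}  // namespace cuda

// The two templates are unrelated types ...
static_assert(!::std::is_same< ::std::atomic<int>, cuda::std::atomic<int> >::value,
              "decorated and Hosted atomics are distinct types");

// ... so overloads on them are distinct functions with distinct symbols.
inline int which(const ::std::atomic<int>&) { return 0; }      // Hosted
inline int which(const cuda::std::atomic<int>&) { return 1; }  // decorated
```

Because the names differ, a translation unit built against the Hosted library and one built against the decorated library never define the same entity in two different ways. The cost is that the two types cannot be passed to each other's interfaces without an explicit conversion.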
c. Someday, in the far future when this has worked out, we can talk about enabling our paths for Hosted and/or in std::, but IMHO that requires the nuclear option of creating new ABI triples like "x86-64-cuda-linux" that are only compatible with "x86-64-linux" at the C boundary. It’s not for 2019 to say the least, except maybe as a toy.
Finally, I’m volunteering (myself and others) to contribute maintenance and enhancements, but our delivery will have some ugly bits we don’t upstream. I’m trying to be as useful to upstream as possible, though.
Happy New Year,
Makes sense to me. Different namespace means ODR issues generally go away.
For atomic specifically, it may make sense, long term, to standardize something along the lines of a shared_memory_atomic. On some platforms, that may just be an alias of atomic, but that would allow coexistence without as severe an ABI break.
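A rough sketch of how such a name could be provided. Nothing here is standardized: shared_memory_atomic is only the name floated above, and the sketch namespace and the SKETCH_NEEDS_CROSS_DOMAIN_LAYOUT macro are invented for illustration.

```cpp
#include <atomic>
#include <type_traits>

namespace sketch {
#if defined(SKETCH_NEEDS_CROSS_DOMAIN_LAYOUT)  // hypothetical platform macro
// A platform whose ordinary atomic cannot be shared across domains would
// supply a distinct type with a layout both sides agree on.
template <class T>
class shared_memory_atomic;  // declared only; a definition is out of scope here
#else
// Elsewhere the name can simply alias the ordinary atomic: no new ABI.
template <class T>
using shared_memory_atomic = std::atomic<T>;
#endif
}  // namespace sketch
```

Code that needs cross-domain sharing spells the new name, and std::atomic itself keeps its existing layout on every platform.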
I just saw this, hence the late reply. At a high level, I'm fine with the proposed approach, but I'd like to discuss it over a review to get things more concrete. Note that I'll only be back on Jan 21, though, so don't expect a lot of feedback from me until then.