Hello,
I’m working on implementing builtins in GCC to directly specify the
behaviour of the C++20 std::atomic<float>::fetch_add (and similar).
(Point being floating point types.)
Summary of what I’m looking for:
- I would like to find agreement on what resolved __atomic_fetch_add
  function names to use when implementing the operations with libcalls.
- I’m hoping to find agreement on how libstdc++ should check that these
  new resolved versions of __atomic_fetch_add (and similar) are
  available.
The motivation is that some GPUs can perform these operations directly,
but the current set of GNU atomic builtins cannot express the operation
directly. This means that compilers using the current libstdc++ headers
while targeting these GPUs (e.g. with something like the -stdpar
option) cannot produce optimal code.
With these builtins defined, libstdc++ can implement the
std::atomic<float>::fetch_add (and similar) functions using
__atomic_fetch_add in the same way as is done for other types, and hence
compilers can easily lower this to optimal code.
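For context, here is a minimal sketch of the kind of compare-exchange loop a library has to fall back on when the compiler offers no floating point fetch_add builtin. The helper names fetch_add_cas and fetch_add_cas_example are hypothetical and only illustrate the shape of the loop; this is not the libstdc++ source.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: emulate a floating point fetch_add by retrying a
// compare-exchange until no other thread has modified the value in between.
inline float fetch_add_cas(std::atomic<float>& obj, float arg,
                           std::memory_order order = std::memory_order_seq_cst)
{
    float old = obj.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak stores the current value into `old`,
    // so `old + arg` is recomputed on every iteration.
    while (!obj.compare_exchange_weak(old, old + arg, order,
                                      std::memory_order_relaxed)) {
    }
    return old;  // value held before the addition
}

// Hypothetical self-check: returns the value left in the atomic.
inline float fetch_add_cas_example()
{
    std::atomic<float> x{1.0f};
    float previous = fetch_add_cas(x, 2.5f);
    assert(previous == 1.0f);
    return x.load();
}
```

A target with a native floating point atomic add has to recognise this whole loop to emit the single instruction, which is what a direct builtin would avoid.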
As part of that work we need to define an ABI to call libatomic or any
other library implementation of the operation (following the example of
all other GNU atomic builtins), and we also need to decide how libstdc++
can check for the availability of these builtins. Since both of these
choices are visible outside the GNU project I would like to double-check
with the LLVM community if the current proposals work for you too.
I’m proposing names following the existing convention in GCC of defining
resolved versions of the builtins to match a libatomic ABI entry point.
The existing examples have the format: __atomic_add_fetch_<suffix>.
Currently the suffixes used are for different sizes of datatypes
{1,2,4,8,16}. This cannot work for floating point types since
different types can have the same size (e.g. bf16 and f16). Hence I’m
proposing the following suffixes:
Type              GCC internal name   Suffix
float             N/A                 f
double            N/A                 <no suffix>
long double       N/A                 l
std::bfloat16_t   __bf16              f16b
std::float16_t    _Float16            f16
std::float32_t    _Float32            f32
std::float64_t    _Float64            f64
std::float128_t   _Float128           f128
N/A               _Float32x           f32x
N/A               _Float64x           f64x
N/A               _Float128x          N/A <not implementing>
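As an illustration of how the suffixes in the table combine into resolved names, the sketch below builds the name for the three standard types. It assumes the `__atomic_fetch_add_fp<suffix>` spelling used later in this post; fp_suffix and resolved_fetch_add_name are hypothetical helpers, not part of GCC or libstdc++.

```cpp
#include <cassert>
#include <string>

// Hypothetical trait mapping a floating point type to its proposed suffix.
template <typename T> struct fp_suffix;
template <> struct fp_suffix<float>       { static const char* value() { return "f"; } };
template <> struct fp_suffix<double>      { static const char* value() { return "";  } };
template <> struct fp_suffix<long double> { static const char* value() { return "l"; } };

// Assumed naming scheme: "__atomic_fetch_add_fp" followed by the suffix.
template <typename T>
std::string resolved_fetch_add_name()
{
    return std::string("__atomic_fetch_add_fp") + fp_suffix<T>::value();
}
```

Under that assumption, float maps to `__atomic_fetch_add_fpf`, double to `__atomic_fetch_add_fp`, and long double to `__atomic_fetch_add_fpl`.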
These suffixes follow the existing convention in GCC builtins taking
floating point types like __builtin_acosh. From scanning through the
source code it seems that this convention is matched in LLVM, though
LLVM does not implement all floating point types for such functions
(with double, float, long double, and float128 implemented).
Again from scanning through the source code it seems that LLVM also uses
the existing {1,2,4,8,16} suffixes for library calls to implement atomic
operations (as one would expect since this is a platform ABI).
As far as I can see clang doesn’t seem to accept these as builtins
specified in the source – is that correct?
While libstdc++ will only use the overloaded version of each operation
(i.e. __atomic_fetch_add), it needs some way to determine whether the
current compiler can use that overloaded builtin on the relevant
floating point type.
So far it seems that the cleanest way for libstdc++ to determine this is
to check for the existence of the resolved builtins with the above
suffixes.
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663377.html
That implies that compilers will need to agree on these names for the
resolved versions of the atomic builtins both at the ABI level for
libatomic, and at the API level for libstdc++.
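A minimal sketch of what such a check could look like in a header follows. It assumes a GCC or Clang style compiler that provides the generic __atomic builtins, and it assumes `__atomic_fetch_add_fpf` as the resolved name for float (derived from the proposed suffixes; no released compiler is assumed to provide it). EXAMPLE_HAS_FP_FETCH_ADD and example_fetch_add are hypothetical names.

```cpp
#include <cassert>

// __has_builtin is itself an extension; treat its absence as "not available".
#ifndef __has_builtin
#  define __has_builtin(x) 0
#endif

// ASSUMPTION: __atomic_fetch_add_fpf is the proposed resolved name for float.
#if __has_builtin(__atomic_fetch_add_fpf)
#  define EXAMPLE_HAS_FP_FETCH_ADD 1
#else
#  define EXAMPLE_HAS_FP_FETCH_ADD 0
#endif

// Hypothetical wrapper: use the overloaded builtin when advertised,
// otherwise fall back to a compare-exchange loop.
inline float example_fetch_add(float* ptr, float arg)
{
#if EXAMPLE_HAS_FP_FETCH_ADD
    // Only the overloaded builtin is called; the resolved name is used
    // purely as the availability check.
    return __atomic_fetch_add(ptr, arg, __ATOMIC_SEQ_CST);
#else
    float old;
    __atomic_load(ptr, &old, __ATOMIC_RELAXED);
    float desired = old + arg;
    // On failure `old` is refreshed with the current value, so recompute.
    while (!__atomic_compare_exchange(ptr, &old, &desired, /*weak=*/true,
                                      __ATOMIC_SEQ_CST, __ATOMIC_RELAXED)) {
        desired = old + arg;
    }
    return old;
#endif
}
```

With this shape, a compiler that does not advertise the resolved builtin simply keeps taking the fallback path, so the check degrades safely.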
Does anything in the names proposed above sound like it might be
problematic?
Do the LLVM community dislike the proposed exposure of the resolved
floating point atomic builtins to the user (i.e. requiring
__has_builtin(__atomic_fetch_add_fp<suffix>) to work)?
IIUC LLVM does not currently expose any other resolved atomic builtins
to the user.
While a compiler would not need this exposure to remain functional, the
current suggestion would require it of any compiler that wants to use
the libstdc++ codepath implementing std::atomic<float>::fetch_add with
__atomic_fetch_add.
If some other advertisement mechanism would be preferable to the LLVM
project please do mention it.
Regards,
Matthew