[RFC] Changes in libcxx for 16 bit architecture

Sebastian.Perta · October 18, 2024, 1:44pm

Hello all,

This is related to the RL78 backend which I want to upstream ([RFC] Upstreaming Renesas RL78 MCU backend)
As suggested there I will discuss all the controversial changes.

RL78 is a 16 bit architecture, because of this changes are needed in libcxx to be able to compile it for RL78.

The first change which I would like to talk about here is in libcxx/include/__bit/endian.h:

#ifdef __RL78__
enum class endian  : long {
#else
enum class endian {
#endif

Without this change I will get the following error for both little and bit:
enumerator value evaluates to 57005 (64206 for big), which cannot be narrowed to type ‘int’ [-Wc++11-narrowing]

You can see this changes in my current sources as well (line 22):

github.com

SebastianPerta/llvm-project/blob/current_sources_llvm_17/libcxx/include/__bit/endian.h

//===----------------------------------------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef _LIBCPP___BIT_ENDIAN_H
#define _LIBCPP___BIT_ENDIAN_H

#include <__config>

#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
#  pragma GCC system_header
#endif

#if _LIBCPP_STD_VER >= 20

_LIBCPP_BEGIN_NAMESPACE_STD

This file has been truncated. show original

My questions are:
Are such changes acceptable in libcxx?
If no, what could I do to make acceptable?

if yes what will be most acceptable way:
-like I did above: guard the change for RL78 specifically?
-guard it using a more generic define (__SIZEOF_INT__)? which will make OK for other targets as well.
-preferably no guard at all?

I intend to list the rest of changes related to RL78 being 16 here as well as I think the same discussion/conclusion will apply to all of them. Please let me know if this is not OK.

For other types for changes I will create new RFCs.

Thank you!

Best Regards,
Sebastian

philnik · October 20, 2024, 5:20pm

In general such changes are OK. Though the exact changes would be better discussed in an actual PR. However, this would definitely require a CI job since libc++ currently doesn’t support any 16 bit platform (See Adding New CI Jobs — libc++ documentation, or using GitHub actions works too - looks like we need to update our docs). Also, what kinds of guarantees do you need (e.g. ABI stability)?

P.S. you don’t need to create a new RFC for every change. A general “I want to support this platform in libc++” RFC is enough.

Sebastian.Perta · October 22, 2024, 10:46am

In general such changes are OK.

Thank you!

However, this would definitely require a CI
Yes, I intend to do this for the whole LLVM port not just libcxx, I intend to provide build machines to work as build workers.

(See Adding New CI Jobs — libc++ documentation

Thank you!

Also, what kinds of guarantees do you need (e.g. ABI stability)?

Regarding ABI no I have no special needs for now, whatever libcxx does is OK for me as well.
More generally, I can say libcxx is a special case for me (compared to the rest: llvm, clang, compiler-rt and lld): RL78 is quite small and even a simple test case which simply declares a std::vector and adds a few elements can quickly run out of memory on RL78 so this has limited usability for us which so it will a little bit unfair to require to many guarantees; I don’t want to hinder libcxx development in any way just for the sake of maintaining support for RL78. One such example that comes to mind is when I updated from llvm(and libcxx in this case) 10.0.0 to 17.0.1 I had quite a few issues with the ryu folder (which was not present in ver. 10.0.0): the newly added tables in the ryu folder were simply to big for RL78. In such cases I think I can decide if the new feature can/should be supported for RL78. Already supported stuff will be guaranteed by the CI job, as you suggested.

P.S. you don’t need to create a new RFC for every change.

After looking further into my libcxx changes, I don’t think I need to create more RFCs for now since they are similar to this one (and you said “such changes are OK”):
-most common change is int to long (in case of compiler-rt I made the change from int to si_int (and ui_int for unsigned); this could also be be a discussion point if I should declare si_int, ui_int in libcxx similar to compiler-rt.
-in a few cases I needed to add UL or L suffix.
-I also needed to write template specializations for 16 bit, for example: __murmur2_or_cityhash<_Size, 16> in hash.h
-a few extra casts
-split some tables or change their address space (all those changes are guarded with #ifdef __RL78__) due to memory constraints.
-changes in tests as some are quite big to fit on RL78 devices.

ldionne · October 22, 2024, 5:53pm

As a general guideline, we try to add general-purpose #ifdefs instead of platform specific ones as much as possible. So for example, the barrier to adding a #ifdef __RL78__ is a lot higher than the barrier for adding something more general like a check for a 16 bit target. That’s because we support libc++ on a large number of platforms and our code would be a complete mess if we accepted platform-specific branches everywhere.

There are a few exceptions to this, like in <locale>, where we have a few large platform-specific ifdefs, but those are the exception, not the rule.

What kind of C Standard Library support do you have on that platform?

brunodf · October 22, 2024, 9:48pm

Sebastian.Perta:

The first change which I would like to talk about here is in libcxx/include/__bit/endian.h:
#ifdef __RL78__
enum class endian  : long {
#else
enum class endian {
#endif
Without this change I will get the following error for both little and bit:
enumerator value evaluates to 57005 (64206 for big), which cannot be narrowed to type ‘int’ [-Wc++11-narrowing]

I’m curious if there is a reason why libc++ is using these particular constants. This goes back to commit Implement P0463R1: 'Endian just Endian'. Reviewed as https://reviews.… · llvm/llvm-project@a277d02 · GitHub

In clang, similar macros uses different values that would fit in 16-bit integers:

github.com

llvm/llvm-project/blob/c77d8edf80570f450122a7687100553836149652/clang/lib/Frontend/InitPreprocessor.cpp#L1079


      
          if (LangOpts.FastMath)
            Builder.defineMacro("__FAST_MATH__");
          
          // Initialize target-specific preprocessor defines.
          
          // __BYTE_ORDER__ was added in GCC 4.6. It's analogous
          // to the macro __BYTE_ORDER (no trailing underscores)
          // from glibc's <endian.h> header.
          // We don't support the PDP-11 as a target, but include
          // the define so it can still be compared against.
          Builder.defineMacro("__ORDER_LITTLE_ENDIAN__", "1234");
          Builder.defineMacro("__ORDER_BIG_ENDIAN__",    "4321");
          Builder.defineMacro("__ORDER_PDP_ENDIAN__",    "3412");
          if (TI.isBigEndian()) {
            Builder.defineMacro("__BYTE_ORDER__", "__ORDER_BIG_ENDIAN__");
            Builder.defineMacro("__BIG_ENDIAN__");
          } else {
            Builder.defineMacro("__BYTE_ORDER__", "__ORDER_LITTLE_ENDIAN__");
            Builder.defineMacro("__LITTLE_ENDIAN__");
          }

philnik · October 22, 2024, 9:52pm

I don’t think so, but it’s now part of the ABI.

Sebastian.Perta · October 23, 2024, 11:07am

we try to add general-purpose #ifdef s instead of platform specific ones as much as possible

Thank you! I will replace __RL78__ with __SIZEOF_INT__ where possible.

What kind of C Standard Library support do you have on that platform?

Currently using newlib (and newlib-nano), however, I intend to switch to llvm-libc at some point.

jwakely · October 30, 2024, 10:23pm

The maximum value is 0xFACE which fits in 16 bits, just not in a signed 16-bit int. So couldn’t you use unsigned instead of long? That way the enum would still have the same size as it would have had with an underlying type of int.

Alternatively you could use int32_t which would probably be int on 32-bit targets and probably be long on 16-bit targets. But just using unsigned seems better to me.

enum endian
#if __INT_WIDTH__ == 16
: unsigned
#endif
{
  ...
}

Topic		Replies	Views
ABI break in libc++ for a 17.x - guidance requested LLVM Project policy , release	22	2095	January 22, 2024
Proposal: Managing ABI changes in libc++ Clang Frontend	14	154	August 3, 2015
[RFC] C++17 hardware constructive / destructive interference size Clang Frontend	45	2259	September 17, 2024
[libcxx] RFC: C++14 and beyond Clang Frontend	16	227	May 7, 2013
[libc++] Porting libc++ to embedded targets/compilers Clang Frontend	10	257	September 13, 2017

[RFC] Changes in libcxx for 16 bit architecture

Related topics