2022-07-21 LLVM libc, libc++
Participants
- Alex Brachet
- David Finkelstein
- Guillot Tony
- Johannes Doerfert
- Michael Jones
- Mikhail Maltsev
- Nigel Perks
- Petr Hosek
- Prabhu Rajasekaran
- Simon Butcher
- Simon Wallis
- Siva Chandra
- Stephen Hines
- Tue Ly
- Peter Smith
- Volodymyr Turanskyy
Agenda
- Libc
- Libc for GPU
- Libc++
Discussion
(Peter Smith) LLVM libc HAL (hardware abstraction layer) investigation
- Investigation done.
- Peter will write up an overview of the different approaches on Discourse.
- The newlib/picolibc and Arm library approaches were analyzed.
- The HAL in both libraries is split into:
  - Boot-up code (stack, heap init) - may not be included in the library and instead be provided by the user. Very hardware dependent; may need assembly code.
  - IO - the two libraries take different approaches: newlib has syscalls similar to the POSIX ones for retargeting (roughly 20 functions to reimplement), while the Arm libraries have a set of lower-level routines to implement that back the matching higher-level ones (see the retargeting sketch after this list).
  - malloc - the linker script needs to set aside a memory region for malloc to use (the _sbrk sketch after this list shows one arrangement).
- Embedded systems can implement semihosting via a debug interface, a serial port, or the like (see the semihosting sketch after this list).
- The next investigation step is to map the above onto the LLVM libc design.
- Siva: a threading abstraction layer should be considered part of the HAL too. Agreed.
- LLVM libc already has a level of abstraction for users to implement when retargeting, including platform-specific hooks for IO.
- LLVM libc malloc: the approach is not to do anything special for it; a platform can reimplement malloc.
- Questions: for boot code and device access code, is there a standard or similar that LLVM libc could adopt? If there is no standard, how useful is it to come up with your own HAL? Answer: newlib can be considered a de-facto standard (libgloss is its HAL implementation). It exists mostly for convenience, to keep the retargeting code separate.
- Init for arrays and constructors is missing in libc, but is expected to be committed very soon (a sketch of the usual init_array walk follows this list).
- It may be easiest to try building LLVM libc for bare metal to see how it all comes together. Starting with a semihosting implementation would be easiest for debugging/testing.
- It may be useful to have some demo code in addition to the Discourse discussions.
- The integration tests in the libc project use libc's own startup code; init is still missing there as well.
- Petr: inits and finis exist in compiler-rt, so they could potentially be reused.
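For context, a minimal sketch of what the newlib-style retargeting and malloc arrangement can look like, assuming a hypothetical memory-mapped UART register and hypothetical linker-script symbols __heap_start/__heap_end (none of these names come from the discussion):

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical memory-mapped UART transmit register; a real address
 * would come from the target's reference manual. */
#define UART0_TXDATA (*(volatile unsigned int *)0x40001000u)

/* Symbols assumed to be defined by the linker script to bound the
 * memory region reserved for malloc. */
extern char __heap_start[];
extern char __heap_end[];

/* One of the ~20 newlib syscalls to reimplement when retargeting:
 * route stdout/stderr to the UART and reject other descriptors. */
ssize_t _write(int fd, const void *buf, size_t count) {
  if (fd != 1 && fd != 2) {
    errno = EBADF;
    return -1;
  }
  const char *p = buf;
  for (size_t i = 0; i < count; ++i)
    UART0_TXDATA = (unsigned char)p[i];
  return (ssize_t)count;
}

/* Heap support behind malloc: hand out memory from the reserved
 * region by bumping a static pointer. */
void *_sbrk(ptrdiff_t incr) {
  static char *brk = __heap_start;
  if (brk + incr > __heap_end) {
    errno = ENOMEM;
    return (void *)-1;
  }
  char *prev = brk;
  brk += incr;
  return prev;
}
```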
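The semihosting route mentioned above could look roughly like this on an Arm Cortex-M target, where semihosting requests are made via BKPT 0xAB (operation number 0x04 is the standard Arm SYS_WRITE0; the function names are illustrative):

```c
/* Issue an Arm semihosting request: op in r0, argument in r1.
 * M-profile cores use BKPT 0xAB; A-profile uses SVC/HLT encodings. */
static int semihost_call(int op, const void *arg) {
  register int r0 __asm__("r0") = op;
  register const void *r1 __asm__("r1") = arg;
  __asm__ volatile("bkpt 0xAB" : "+r"(r0) : "r"(r1) : "memory");
  return r0;
}

/* Write a NUL-terminated string to the debugger console. */
void debug_puts(const char *s) {
  semihost_call(0x04 /* SYS_WRITE0 */, s);
}
```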
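The missing init support roughly amounts to the startup code walking the ELF .init_array section before calling main; the symbols below are the conventional linker-provided bounds, and this is only a sketch:

```c
/* Conventional linker-provided bounds of the .init_array section. */
extern void (*__init_array_start[])(void);
extern void (*__init_array_end[])(void);

/* Call every registered constructor, in order, before main(). */
static void run_init_array(void) {
  for (void (**fn)(void) = __init_array_start; fn != __init_array_end; ++fn)
    (*fn)();
}
```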
(Johannes) LLVM libc and libc++ for GPUs
- GPUs will need these libraries in the future.
- They will probably not be standards compliant, but will face the same kinds of issues as embedded libraries do.
- How does the above HAL map to GPUs?
- No need for startup code; it is already handled or not needed.
- IO support varies (e.g. printf may be available); malloc may or may not be available, etc.
- The library code will need to be compiled to LLVM IR and then LTO'd with the user code.
- A math library already exists, but it is not upstream since it is not clear where to put it. One option would be to also build libc for GPUs. The risk is that this may tie the math library to the LLVM project and its libraries, instead of staying compatible with LLVM plus the multiple other options users may be using now.
- Function declarations in the header files must match the implementations on the device, so there may be issues if the host header files differ.
- If headers contained only declarations without definitions, that would be easier. E.g. the host may have a one-assembly-instruction implementation of some function that would not work on the device.
- An example issue is a mismatch of object sizes between host and device. GPUs try to match the data layout of the host to allow easy data transfer between host and device.
- Is this an ABI question?
- OSes still allow different definitions of, say, long, even on the same host hardware.
- Special headers may be used as overlays over the host headers to resolve conflicting definitions (see the overlay sketch after this list).
- In the original Arm ABI there was the question of whether it is possible to link objects compiled with different compilers and their own headers; there was a solution with a set of portability macros. Portability is at odds with performance, and in reality this approach was never properly implemented. In practice most things work, except for cases like longjmp and other more complex constructs. Peter will try to find and share a link to the relevant ABI document.
- Users always expect that everything that works on the host will also work on the device, which is a difficult expectation to meet. Hence the desire to reuse the host headers as much as possible.
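A sketch of how such an overlay header could work, using include-path ordering plus the GCC/Clang #include_next extension (the guard macro, target check, and patched declaration are all illustrative, not an agreed design):

```c
/* overlay/stdlib.h - placed on an include path searched before the
 * host headers, e.g. via -isystem overlay. */
#ifndef GPU_OVERLAY_STDLIB_H
#define GPU_OVERLAY_STDLIB_H

/* Pull in the host stdlib.h for everything that is not overridden. */
#include_next <stdlib.h>

#ifdef __NVPTX__
/* Re-declare a function whose host version (e.g. a single-instruction
 * inline implementation) would not work on the device; the device-side
 * definition is linked in from the IR library during LTO. */
int rand(void);
#endif

#endif /* GPU_OVERLAY_STDLIB_H */
```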
libc++
Topics we want to discuss in future calls:
- How to configure libc++ builds to exclude unneeded functionality (i.e. strip unused code to minimize code size)?
- It would be good to have a size-optimized version (vs. the performance-optimized one today), e.g. a separate config with a different trade-off for things like string/int conversions.
- Both libc and libc++: how should they be distributed? libc is built from source while libc++ is shipped as a binary - would source builds of libc++ be beneficial as well, especially for embedded? They would allow fine-tuning of build options. There are examples to learn from: RISC-V toolchains (e.g. from Embecosm) are said to be able to compile libraries on demand. Should this question be part of the multilib discussion?
- Build systems: CMake has a set of predefined configs; multi-config generators can be used to build libc++ in multiple configurations.
- The Fuchsia toolchain ships a number of library variants; the multilib logic is hardcoded inside the driver. It could be moved out into a base class for reuse, or even into an external configuration file.