In the last LLVM Embedded Toolchains Working Group sync up I mentioned that I’d write up a bit about initialization code for embedded systems as it exists in some C libraries, and how it might be applied to llvm-libc at some future point. Apologies for leaving it till the day of the next meeting to write it. I’ve restricted this post to just initialization, which tends to be provided as objects and not as a library.
For Linux, LLVM libc has startup code (llvm-project/libc/loader/linux at main · llvm/llvm-project · GitHub) that initializes the environment for the library, performing actions like processing the .init_array and setting up TLS.
In an embedded context where the program may be running immediately after reset, with no OS, the startup code has to do some additional work, typically something like:
- Setting up a stack pointer.
- Copying data from flash to RAM.
- Zero initialization.
- Reserving a block of memory for the heap.
In amongst this there can be additional hardware initialization such as enabling floating point units, caches and memory protection units. This is where the library initialization blends with bootcode, though. The complexity of bootcode varies considerably: it can be fairly simple for a microcontroller, or very complicated for a multicore CPU with caches and an MMU.
For embedded systems it is not possible to provide universal initialization code for all cases. However, there is a common structure that can work out of the box for simple systems, and can be adapted for more complex examples.
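As a rough illustration of that common structure, the data copy and zero initialization steps might look something like the sketch below. This is not taken from any particular library: the function names are made up for illustration, and in a real crt0 the region bounds would come from symbols defined by the linker script rather than being passed in as arguments. Setting up the stack pointer is left out, since that has to happen in architecture-specific code before any C can run.

```c
#include <stddef.h>

/* Hypothetical helper: copy the initial values of .data from where they
   are stored (flash) to where the program expects them (RAM). */
void crt_copy_data(char *ram, const char *flash, size_t size)
{
    for (size_t i = 0; i < size; i++)
        ram[i] = flash[i];
}

/* Hypothetical helper: zero the half-open range [start, end), as would
   be done for .bss. */
void crt_zero_bss(char *start, char *end)
{
    for (char *p = start; p < end; p++)
        *p = 0;
}
```

Plain loops are used here only to keep the sketch free of dependencies; a real crt0 may well call memcpy and memset instead.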
As an example, picolibc has a very simple crt0.o that works with the default linker script that comes with the library. A user would need to modify it for more complex examples, but just having the structure helps get people started. The default case is also useful for running tests on models like qemu, usually with semihosting for IO.
The control routine is called __start() (picolibc/crt0.h at main · picolibc/picolibc · GitHub). It performs the following actions:
- Copy data from flash to RAM, using symbols from the linker script.
- Zero initialize .bss.
- Initialize TLS (local exec) if compiled to use it.
- Call routines from .init_array.
- If semihosting (environment from debugger), set up argc and argv.
- Call main().
- Call exit() if supported.
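The tail end of that sequence (walking .init_array, then calling main()) can be sketched in C as below. This is my own illustration, not picolibc's code: the names crt_start, run_init_array and the demo_ functions are hypothetical, and the table bounds are passed as arguments where a real crt0 would take them from linker-provided symbols. The return value stands in for the point where exit() would be called.

```c
#include <stddef.h>

typedef void (*init_fn)(void);
typedef int (*main_fn)(int, char **);

/* Call each constructor in the half-open range [start, end), in order. */
void run_init_array(init_fn *start, init_fn *end)
{
    for (init_fn *f = start; f < end; f++)
        (*f)();
}

/* Hypothetical control routine: run constructors, then the entry point.
   A real crt0 would pass the result to exit() if that is supported. */
int crt_start(init_fn *init_start, init_fn *init_end,
              main_fn entry, int argc, char **argv)
{
    run_init_array(init_start, init_end);
    return entry(argc, argv);
}

/* Demonstration only: two constructors and a stand-in for main() that
   record the order in which they were called. */
int demo_log[3];
int demo_count;

void demo_ctor_a(void) { demo_log[demo_count++] = 1; }
void demo_ctor_b(void) { demo_log[demo_count++] = 2; }

int demo_main(int argc, char **argv)
{
    (void)argv;
    demo_log[demo_count++] = 3;
    return argc;
}
```

Calling crt_start() with a table holding demo_ctor_a and demo_ctor_b, and demo_main as the entry point, runs the two constructors in table order before demo_main.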
The _start routine is provided per architecture. For example, the AArch64 version (picolibc/crt0.c at main · picolibc/picolibc · GitHub) sets up the stack and FPU before calling __start.
I think for LLVM libc there could be something similar for embedded systems: essentially a simple crt0 that does the bare minimum to get a system working, with a simple linker script that can be provided as a sample. That would enable someone to build a simple hello world program and run it on qemu. It could also be used as the basis of a buildbot that runs tests.
An alternative is to not provide a crt0 but instead provide enough documentation about how to write one.