We had a round table discussion at EuroLLVM last week. This post is my recollection of the discussion. If anyone else who was there would like to add their thoughts, please add them to the thread.
Before I start, I would like to advertise the LLVM Embedded Toolchains Working Group sync up. This virtual round table discussion occurs once every 4 weeks, and all are welcome. Details can be found in "LLVM Embedded Toolchains Working Group sync up" and in "Getting Involved" (LLVM 17.0.0git documentation).
Reviews and Requests for features
A lot of ideas came up at the round table. Many of the developers aren’t working directly with projects that need specific features, and this detachment makes it harder to know how to prioritize what to work on. If you have an opinion, even if it is just what you would like to see, please let the developers know on Discourse or in the LLVM Embedded Toolchains Working Group sync up.
We are also interested in feedback on work-in-progress patches. As an example, there is an RFC out for a data-driven multilib, "[RFC] Multilib". We are looking for people to try this out and leave feedback, even if it is just "we tried it and it worked!"
There is a similar set of patches for MC/DC code coverage in D138849, "MC/DC in LLVM Source-Based Code Coverage: clang".
Advertising LLVM Toolchains for embedded systems
A number of presentations about LLVM in embedded systems have been made this year (FOSDEM and Embo++), but they still reach only a small audience. A community blog post with an example of porting an open-source project that currently builds with a GCC toolchain to an LLVM-based toolchain would be a useful starting point.
Collaboration with GNU
Many embedded projects that use open-source tools need to support LLVM and GNU toolchains. Wherever possible we should work with the binutils community to get new features adopted by both communities.
Upstream Testing
While there are a number of teams doing downstream testing of LLVM for embedded targets, there are no upstream build-bots for many embedded targets. Testing the compiler-rt builtins upstream should be straightforward. There are also opportunities for libc++, libc++abi and libunwind, with the proviso that features the targets can’t support will need to be disabled.
Downstream testing could be helped by some sample embedded configurations that downstream toolchains can adapt for their use case.
Documentation
Cross-compiling the runtimes such as compiler-rt and libc++ can be quite difficult to work out. Our existing documentation is often out of date, particularly with the introduction of the runtimes build. It would be useful to retire or update the documentation.
The LLD documentation at https://lld.llvm.org/ is largely developer focused. In particular, it would be helpful to update and expand the linker script differences in "Linker Script implementation notes and policy" (lld 17.0.0git documentation). Documenting other known differences between GNU ld and LLD would also help users adopting LLD. It would be realistic to do this incrementally.
Use of LLVM libc in embedded systems
Some projects are already using LLVM libc. Not all C functions are supported, but many projects only need a small number of functions and these can be implemented on demand. The LLVM libc developers are interested in which functions are needed first.
Some LLVM libc functions, such as printf, have been optimized for size and can compile down to very little code.
Linker Overlays
LLD supports overlays in the same way that GNU ld does. This requires manual assignment of sections to overlays and an overlay manager to switch between the overlays. There are ways to make these easier to use: Arm’s proprietary toolchain has an automatic overlay feature that inserts code to switch overlays automatically (see the Arm Developer documentation). Something like this could be a useful feature in LLD for the projects that need it.
It was mentioned that there is an effort called ComRV to standardize overlays. The links that I was able to find:
- GitHub - riscv-software-src/riscv-overlay: The Software Overlay TG will specify the requirements for the software overlay feature, both from the FW manager engine and from toolchain aspects, all which will be based on the current RISC-V ISA and extensions.
- https://riscv.org/wp-content/uploads/2020/04/RISC-V_ComRV_v3.2.pdf
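For readers who haven't used overlays, here is a toy Python model of what the overlay manager mentioned above has to do: several overlays share one run-time region, and the manager copies the wanted overlay in before it is used. This is only a sketch of the idea. The class, the overlay names and the byte contents are all hypothetical, and a real manager would live in the firmware and copy from each overlay's load address.

```python
class OverlayManager:
    """Toy model: several overlays share one run-time region of memory."""

    def __init__(self, region_size, overlays):
        # `overlays` maps a (hypothetical) overlay name to the bytes that
        # would be stored at that overlay's load address.
        for name, image in overlays.items():
            if len(image) > region_size:
                raise ValueError(f"overlay {name!r} does not fit in the region")
        self.region = bytearray(region_size)  # the shared run-time region
        self.overlays = overlays
        self.resident = None  # name of the overlay currently in the region
        self.loads = 0        # number of copies performed so far

    def ensure_loaded(self, name):
        """Copy the overlay into the shared region unless it is already there."""
        if self.resident != name:
            image = self.overlays[name]
            self.region[:len(image)] = image
            self.resident = name
            self.loads += 1
        return self.region


manager = OverlayManager(8, {"ovl_a": b"AAAA", "ovl_b": b"BBBBBB"})
manager.ensure_loaded("ovl_a")
manager.ensure_loaded("ovl_a")  # already resident, so no copy happens
manager.ensure_loaded("ovl_b")  # evicts ovl_a from the shared region
```

With manual overlays the programmer has to place the `ensure_loaded` style calls by hand; an automatic overlay feature would have the toolchain insert the equivalent switch code.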
LLD trace options
LLD has a small number of tracing options such as --verbose, --why-extract=, --warn-backrefs and --trace-symbol=. These are very helpful in tracking down problems in their specific areas. In other areas there is nothing beyond looking at the map file, assuming the link got that far. Some more tracing, particularly in the area of linker scripts, could help users and developers alike. Something like LLVM's --print-after would be very useful. The challenge would be designing the output in a structured way, as the inputs to the linker can be very large. Making sure additional tracing is integrated without being spread over the code-base also requires thought.
LLD map file output
LLD and GNU ld have different map file output formats. While one is not necessarily better than the other, having the option of a similar output format would help projects migrating from GNU ld to check for differences. Going a step further, a machine-readable map file in something like JSON would make it easier for other tools to consume and analyze.
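To give a feel for what a machine-readable map could look like, here is a rough Python sketch that converts map text into JSON records. It assumes the column layout as I understand LLD prints it today (hex VMA, LMA and Size, a decimal Align, then a name indented by 0, 8 or 16 spaces for output section, input section and symbol). The sample rows, the field names and the JSON shape are made up for illustration and are not a proposed format.

```python
import json
import re

# Hypothetical excerpt in the assumed layout; the addresses are invented.
SAMPLE_MAP = "\n".join([
    "     VMA      LMA     Size Align Out     In      Symbol",
    "    1000     1000       20     4 .text",
    "    1000     1000       20     4 " + " " * 8 + "main.o:(.text)",
    "    1000     1000       10     1 " + " " * 16 + "main",
    "    1010     1010       10     1 " + " " * 16 + "helper",
])

# Three hex columns, a decimal alignment, one space, then the indented name.
ROW = re.compile(
    r"^\s*([0-9a-f]+)\s+([0-9a-f]+)\s+([0-9a-f]+)\s+(\d+) (\s*)(\S.*)$"
)
# Assumed meaning of the name indentation.
KINDS = {0: "output_section", 8: "input_section", 16: "symbol"}


def parse_map(text):
    """Return one dict per map row; the header and unknown lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue
        vma, lma, size, align, indent, name = match.groups()
        rows.append({
            "kind": KINDS.get(len(indent), "unknown"),
            "name": name,
            "vma": int(vma, 16),
            "lma": int(lma, 16),
            "size": int(size, 16),
            "align": int(align),
        })
    return rows


print(json.dumps(parse_map(SAMPLE_MAP), indent=2))
```

A native JSON option in the linker would avoid this kind of scraping, which breaks whenever the text layout changes.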
llvm-objcopy
Support for Motorola srec format (GNU objcopy -O srec) would be useful for some projects wanting to transition from GNU.
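For anyone unfamiliar with the format, an S-record file is plain text with one record per line. The Python sketch below encodes a single S1 data record (16-bit address) to show the layout: record type, byte count, address, data, then a checksum that is the one's complement of the low byte of the sum of the preceding bytes. The function name is hypothetical, and this only illustrates the record layout; it says nothing about how llvm-objcopy would implement it.

```python
def srec_s1(address, data):
    """Encode one S1 data record (16-bit address) as a line of text."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("S1 records carry a 16-bit address")
    count = len(data) + 3  # 2 address bytes + data bytes + 1 checksum byte
    if count > 0xFF:
        raise ValueError("too much data for a single record")
    body = bytes([count]) + address.to_bytes(2, "big") + bytes(data)
    checksum = ~sum(body) & 0xFF  # one's complement of the low byte of the sum
    return "S1" + body.hex().upper() + f"{checksum:02X}"


# 16 data bytes placed at address 0x7AF0.
record = srec_s1(0x7AF0, bytes([0x0A, 0x0A, 0x0D]) + bytes(13))
print(record)
```

A complete file would also need a header record (S0) and a termination record (S9 for 16-bit addresses), which this sketch leaves out.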