LLVM Embedded Toolchains Working Group sync up

Introduction

Agendas and minutes for the LLVM Embedded Toolchains Working Group sync-up calls are tracked here as replies below. Earlier minutes were migrated from Google Docs.

Meeting Details

On Thursdays at 9am PST/5pm BST/6pm CET every 4 weeks starting on Mar 3rd, 2022.

Join Zoom Meeting

Meeting ID: 932 9643 6903

Passcode: 918316

One tap mobile

+442034815240,93296436903#,*918316# United Kingdom

Dial by your location

+44 203 481 5240 United Kingdom

+1 346 248 7799 US (Houston)

+1 408 638 0968 US (San Jose)

+1 646 518 9805 US (New York)

Find your local number: https://armltd.zoom.us/u/adPiWtNQFa

Calendar Invites

Links to the calendar invite: ICS file and direct link.

Link to shared Google Calendar (gcal).


Past meeting minutes migrated from Google Docs:

2022-03-03 Multilib support

Agenda

  1. Requirements and design options for multilib support in clang.
    Multilib toolchains contain multiple sysroots, each holding a build of the target libraries for a different architecture/ABI variant; the right one is selected based on the target and other command-line options.
  2. Profiling support.
  3. AOB

Participants

  1. Volodymyr Turanskyy
  2. Peter Smith
  3. Alex Brachet
  4. David Spickett
  5. Hafiz Abid Qadeer
  6. Marcel Achim
  7. Nigel Perks
  8. Petr Hosek
  9. Stephen Hines
  10. Pirama
  11. Son Tuan Vu
  12. Daniel Thornburgh
  13. Shivam Gupta

Follow up from previous meeting

None

Discussion (minutes)

  1. Multilib support (presentation)

  • Want to avoid having to add new multilibs by hand each time.

  • A TableGen file for generating multilibs?

  • The GCC implementation isn’t great; there is room for a cleaner solution.

  • Implementation examples (a sketch of this driver API follows the list):
    – https://github.com/llvm/llvm-project/blob/bd1917c88a32c0930864d04f4e71155dcc3fa592/clang/lib/Driver/ToolChains/Gnu.cpp#L1512
    – https://github.com/llvm/llvm-project/blob/bd1917c88a32c0930864d04f4e71155dcc3fa592/clang/lib/Driver/ToolChains/Fuchsia.cpp#L204

  • Fuchsia uses multilib for the LLVM runtimes (exceptions/no-exceptions variants etc.).

  • A CMake build to generate all the multilibs would only work for the LLVM build system.

  • Auto-searching will only work for GNU sysroots (libc++ and libstdc++).

  • A reasonable direction to pursue, but the design needs to be right: multilib logic is used in several drivers, each with its own existing constraints. Expect a few iterations trying different designs, keeping early versions alpha/experimental.

  • There is a history of downstream embedded forks; get those people to test it out and feed future iterations. Reaching the right audience matters.

  • Provide downstream users an opportunity to migrate.

  • Start an RFC on Discourse.
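
For readers unfamiliar with the driver code linked above, here is a rough sketch of the pre-2023 Multilib API in clang's driver, in the chained style Gnu.cpp/Fuchsia.cpp used at the time. The float-ABI variants are an invented example, and method names may differ in current trees:

```cpp
#include "clang/Driver/Multilib.h"
using namespace clang::driver;

// Invented example: two float-ABI variants of an Arm library tree,
// declared in the chained style used by Gnu.cpp/Fuchsia.cpp.
static MultilibSet makeArmMultilibs() {
  Multilib SoftFP = Multilib().gccSuffix("/softfp").flag("+mfloat-abi=softfp");
  Multilib Hard = Multilib().gccSuffix("/hard").flag("+mfloat-abi=hard");
  return MultilibSet().Either(SoftFP, Hard);
}

// Selection: the driver turns command-line state into "+flag"/"-flag"
// strings and asks the set for the best match.
static bool pickMultilib(bool IsHardFloat, const MultilibSet &Libs,
                         Multilib &Selected) {
  Multilib::flags_list Flags;
  Flags.push_back(IsHardFloat ? "+mfloat-abi=hard" : "-mfloat-abi=hard");
  Flags.push_back(IsHardFloat ? "-mfloat-abi=softfp" : "+mfloat-abi=softfp");
  return Libs.select(Flags, Selected);
}
```

Each toolchain that wants multilib today re-implements a variant of this pattern, which is the duplication the proposed design aims to remove.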

  2. Profiling

  • Fuchsia does not use the libclang_rt profile runtime; the kernel has its own runtime that shares an include file (structs) with it. Linux is thought to do something similar.

  • The problem is that the profile runtime relies on libc (it calls into a few functions: files, mmap), so instrumenting libc itself becomes a recursive case. sanitizer_common has alternative implementations, so rewriting the profile runtime on top of sanitizer_common was proposed. (See the buffer-API sketch after this list.)

  • Breaking the profile runtime up into smaller parts.

  • Fuchsia has a custom runtime for low-level use cases (bootloader and kernel): https://cs.opensource.google/fuchsia/fuchsia/+/main:src/lib/llvm-profdata

  • There is currently little reuse between that runtime and the compiler-rt runtime, but there is interest in refactoring libclang_rt.profile so more parts can be reused, reducing the duplication.

  • It is not clear whether a universal runtime is possible. Can the basic pieces of common logic be shared, extracted into headers in the clang installation? Downstream integration could then be done for embedded systems.

  • Symbolization: producing a stack trace and then invoking llvm-symbolizer usually cannot be done on an embedded system. Fuchsia uses offline symbolization through symbolizer markup: the markup goes to the serial output, and no runtime support for symbolization is needed.

  • This is already supported in sanitizer_common: https://github.com/llvm/llvm-project/blob/b223e5f8468cbed5cffe0d872de8feac2a73b030/compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_markup.cpp

  • RTEMS adopted the same format (since removed from compiler-rt, likely bitrotted).

  • The Fuchsia team is interested in implementing support for the markup directly in llvm-symbolizer. Is there community interest?
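
As context for the libc-dependency point above: compiler-rt's profile runtime already exposes a buffer-based interface that avoids file I/O, which is the usual starting point on bare-metal. A minimal sketch, assuming a hypothetical board_write() output routine; the extern declarations are normally provided by compiler-rt's InstrProfiling.h, which is not an installed public header, so they are copied by hand:

```cpp
#include <cstdint>

// Provided by the compiler-rt profile runtime when the program is
// built with -fprofile-instr-generate; declared by hand here.
extern "C" uint64_t __llvm_profile_get_size_for_buffer(void);
extern "C" int __llvm_profile_write_buffer(char *Buffer);

// Hypothetical board-specific output channel (assumption).
extern "C" void board_write(const void *Data, unsigned Size);

// Serialize the raw profile into a static buffer and push it out,
// with no files or mmap involved.
extern "C" void dump_profile(void) {
  static char Buf[16 * 1024]; // sized for a small image (assumption)
  uint64_t Size = __llvm_profile_get_size_for_buffer();
  if (Size <= sizeof(Buf) && __llvm_profile_write_buffer(Buf) == 0)
    board_write(Buf, static_cast<unsigned>(Size));
}
```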

2022-03-31 Build bots, symbolizer

Agenda

  1. Build and test bots for embedded toolchains.
    • Background:
      – https://lab.llvm.org/buildbot/#/console
      – https://llvm.org/docs/HowToAddABuilder.html
    • Intention: provide pre-commit tests for bare-metal configurations to flag any breaking changes early, preferably even before they land.
    • Using the GNU toolchain driver via the bare-metal driver; see the discussion “Switching to GCC C runtime linkage for the baremetal driver”.

  2. Symbolizer markup
    • Google sent out an RFC to implement symbolizer markup support in llvm-symbolizer: RFC: Log Symbolizer

Participants

  1. Volodymyr Turanskyy
  2. Mikhail Maltsev
  3. Peter Smith
  4. Petr Hosek
  5. Daniel Thornburgh
  6. Paul Kirth
  7. Pirama
  8. Roland McGrath
  9. Stephen Hines

Follow up from previous meeting

  1. Multilib: the Arm team wants to come up with an RFC but is busy with a release right now.
  2. Profiling - needs further discussion?

Discussion

  1. Build bots

  • There was a question about the xcore build bot regarding a missing REQUIRES in tests, and what it means to use the default target for cross-compilation targets. (A lit example follows this topic.)

  • There are toolchain builders (easy, as they build cross-compilation toolchains) and runtime builders (more difficult, since they depend on the target architecture).

  • Are there cross-compilation bots for compiler-rt? Apparently not.
    – The Fuchsia team is investigating this now but cannot make it work fully yet. Each test execution needs to be prefixed with a tool such as QEMU or scp/ssh. Is a way to batch tests together needed (build all, copy all, run all)? Running them one-by-one is very slow.
    – How are test failures handled and reported accurately when batched?
    – It may be possible to run tests on a different model, e.g. Arm M-profile test code on a more capable Arm A-profile target.
    – libc++ and libc++abi have a different test runner/wrapper around the lit test runner, which is easier to tailor.
    – The difficult ones would be the sanitizer tests and coverage tests.

  • How do we make people in the community respond to failures in these bots?
    – How do we provide people with a reproducer and a way to debug their failures? Use emulators?
    – Required to fix vs best practice?
    – At a minimum we must have really good docs: step-by-step instructions to reproduce a failure.
    – Provide a VM to reproduce and debug, if possible.

  • The Arm Embedded Toolchain for now only runs a smoke test; we want to provide a public bot with many more tests/regular LLVM tests.
    – Internally, Arm Compiler has a patch to the libc++ test runner to invoke the model for execution. It may be possible to upstream it.
    – A big part of the lit tests can run without POSIX, so in bare-metal. Would it make sense to define this subset upstream to make it easier to run/maintain? Sounds like a good idea.
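
For reference, this is the kind of guard discussed above: a clang lit test that emits an object must declare the backend it needs, so hosts built without that target skip it instead of failing. The test body is an invented example; arm-registered-target is a real lit feature:

```cpp
// REQUIRES: arm-registered-target
// RUN: %clang --target=armv6m-none-eabi -c %s -o %t.o
// Without the REQUIRES line, this test would fail on an LLVM build
// configured without the Arm backend, since object emission needs it.
int answer(void) { return 42; }
```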
  2. Using the GNU toolchain driver

  • Is this a common enough use case to add into the LLVM driver itself?

  • Escaping linker options, to avoid the need to double-escape, may be useful.

  • The Pigweed library uses the LLVM bare-metal driver.
    – The LLVM bare-metal driver does not seem very well thought out/designed. There are opportunities to simplify and improve it, removing duplication by lifting code into the parent class.
    – History of the driver: it was started by CodeSourcery and does not seem to have an active owner now.
    – Anyone interested in reviewing such refactorings? The Arm team is happy to collaborate (Peter Smith may be a contact, Mikhail Maltsev as well). Hafiz Abid Qadeer was adding RISC-V support to the driver, so may be interested too. Petr Hosek will post something for review.
  3. Symbolizer

  • Petr and Daniel: the RFC was sent (link in the agenda):
    – Defer symbolization until a later time, maybe on a different machine with more debug info.
    – This approach has been used in Fuchsia since 2017, so it is proven.
    – Designed by Roland McGrath.
    – In Fuchsia the info goes out via serial for later processing on a host machine.
    – It is supported in the sanitizer runtime for Fuchsia; it makes sense to extend that support.
    – Whoever finds this interesting, please review the RFC; the implementation will follow, as a separate library in the LLVM project.
    – The target binary emits markup elements that describe where segments are mapped and the PCs from the stack, i.e. enough context. Another machine can later take the markup offline and generate the traditional stack trace. The markup should be standardised. (An illustrative emitter follows below.)
    – The Arm team is interested in this and will review the RFC.
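
To make the markup idea concrete, here is an illustrative target-side emitter. The element shapes ({{{reset}}}, {{{module}}}, {{{mmap}}}, {{{bt}}}) follow the Fuchsia symbolizer markup format in spirit, but the field layout here is simplified; the RFC and the Fuchsia specification are the authoritative grammar:

```cpp
#include <cstdint>
#include <cstdio>

// Emit enough context for an offline tool to symbolize later:
// which module is loaded where, then raw PCs from the stack.
void emit_markup_context(const char *name, const char *build_id,
                         uintptr_t load_addr, size_t load_size) {
  printf("{{{reset}}}\n");
  printf("{{{module:0:%s:elf:%s}}}\n", name, build_id);
  printf("{{{mmap:%#zx:%#zx:load:0:rx:0}}}\n", (size_t)load_addr, load_size);
}

void emit_markup_frame(unsigned frame, uintptr_t pc) {
  printf("{{{bt:%u:%#zx}}}\n", frame, (size_t)pc);
}
```

On a bare-metal target, printf here would itself be retargeted to the serial port, so no symbolization support is needed at runtime.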

2022-04-28 LLVM libc

Agenda

  1. Choice of C library for embedded use cases. LLVM libc as an option.
  2. Roundtable at EuroLLVM?

Participants

  1. Peter Smith
  2. Siva Chandra
  3. Alex Brachet
  4. Mikhail Maltsev
  5. Nigel Perks
  6. Petr Hosek
  7. Pirama
  8. Simon Wallis
  9. Volodymyr Turanskyy

Follow up from previous meeting

  1. Build bots - below.
  2. GNU driver - N/A
  3. Symbolizer - N/A

Discussion

  1. Roundtable at EuroLLVM

  • Petr and Nigel are going.

  • So far not many people are on the call.

  • Should we try to advertise on LLVM Discourse to see if more people are interested?

  2. Libc options

  • How suitable is LLVM libc for embedded toolchains?

  • Newlib is OK for OSS projects, but not for commercial toolchains.

  • The Arm team is looking into LLVM libc now; apparently other people are interested as well, suggesting some patches upstream.

  • Siva works on LLVM libc:
    – Just started looking into potential embedded and bare-metal uses.
    – Happy to learn from embedded experiences.
    – Peter posted a comment on the Arm32 build of LLVM libc, i.e. how to build it with the LLVM Embedded Toolchain for Arm: https://discourse.llvm.org/t/building-llvm-libc-for-arm32/62092

  • There is a large design space for an embedded libc in terms of code size vs performance. Peter will try to summarise it in a Discourse post.

  • Zephyr as an example uses a very minimalistic libc. LLVM libc is currently tuned for server performance; people may be most interested in something in between. So it is important that the library allows tailoring to the specific use case.

  • Arm Compiler as an example offers two different libraries: a standard-compliant one (but bigger) and a “micro” one that has very limited functionality but is very small.

  • LLVM libc strives to ideally cover both kinds of use cases: those where standard compliance is important and those where very minimal code size is important.

  • The more tuning points there are, the more library variants need to be built, and all of them need to be tested. So there is a practical limit to configurability.
    – Pragmatically, there will be more-used configurations vs less-used ones to direct the testing effort.
  • LLVM libc: how is startup addressed (for bare-metal, e.g. stack setup)? So far the project is Linux focused and startup has not been addressed yet. More details on requirements/approaches would be useful to share. Usually, hooks are provided for users to define/customise the stack and such.

  • Many customers now ask for POSIX support in embedded. LLVM libc seems to provide some POSIX functions; how easy is it to extend the coverage of POSIX for high-end embedded systems?
    – POSIX functions that do not require an OS: should be easy to add.
    – POSIX functions that rely on an OS: Zephyr and FreeRTOS have a level of support, and newlib has some support too. The libc may need something like a porting layer to allow adaptation to a particular OS.

  • Sanitizer runtimes: is there an intersection with LLVM libc? The aim is to instrument everything, including libc itself. Fuchsia has its own libc, a fork of musl, that includes interaction with sanitizers. The API for sanitizers is very low-level so that libc itself can be instrumented. There is a discussion about other libc projects adopting the same API, or rather coming up with one shared API. The long-term plan is to have this API implemented in LLVM libc too.
    – https://cs.opensource.google/fuchsia/fuchsia/+/main:zircon/third_party/ulib/musl/include/zircon/sanitizer.h is the API that Fuchsia uses.

  • How complete is LLVM libc now with regard to the C standards?
    – String functions are mostly complete; no locale support.
    – Maths work is underway, to be complete by ~Sep 2022.
    – The aim is to eventually build a fully compliant library; the team is happy to prioritise what is important for specific use cases, given a good rationale.

  • Modular structure: a static libc vs missing pieces taken from the shared system libc. Some subsets may be all-or-nothing, e.g. the IO subsystem. There is a “full build” mode that assumes LLVM libc is the only C library.

  • Build system: native or cross-compilation? The focus now is only on native builds and tests, but cross-compilation is possible.

  • Let us raise any topics of interest on Discourse.

  3. Build bots

  • Embedded target vs host for running the tests. A guard for object emission was missing; there is the (now restored) REQUIRES directive to guard object emission in tests.

2022-05-26 EuroLLVM, LLVM libc

Agenda

  1. Admin: move to Discourse for agendas and meeting minutes; see, for example, LLVM Pointer Authentication sync-ups - #4 by rjmccall?

  2. Follow-up from the EuroLLVM 2022 round tables.

  3. Upstreaming of build scripts for embedded toolchains; potential issues/acceptance criteria.

  4. What embedded use cases can be tested within llvm-project itself without external dependencies, e.g. building compiler-rt for embedded targets? Such tests should be easy to establish within existing LLVM build bots.

  5. 64-bit source code locations to handle AUTOSAR-generated code (⚙ D97204 [RFC] Clang 64-bit source locations): lack of interest/reviewers.

  6. llvm-libc update and embedded systems (we won’t get time to get through all of these):
    • A summary of a comparison versus an existing embedded toolchain’s C library.
    • An update on where we are with providing requirements for embedded systems.
    • The parts of llvm-libc implemented so far are independent of each other. Things like printf and locale will need to communicate. Some size-optimized embedded libraries do not include locale; have there been any thoughts about implementations that may omit some parts of the library?

  7. libc++ on embedded systems

Participants

  1. Peter Smith
  2. Volodymyr Turanskyy
  3. Simon Cook
  4. Alex Brachet
  5. David Finkelstein
  6. Ed Jones
  7. Lewis Revill
  8. Michael RJ
  9. Nigel Perks
  10. Petr Hosek
  11. Saleem
  12. Simon Wallis
  13. Yvan Roux

Follow up from previous meeting

  1. LLVM libc below.

Discussion

  1. Admin: agreed to move; Volodymyr will take care of it and update.

  2. EuroLLVM 2022 round table follow-ups:

  • Libraries discussion; multilib support like in GCC.
    – Simon is working on new multilib configurations in the runtimes. Would be great to have an RFC.
    – compiler-rt with multilib: how to build it? Manually?
    – Petr: Fuchsia builds ~50 different multilibs and can share experience.
    – Link from Saleem: https://github.com/apple/swift/blob/main/cmake/caches/Windows-x86_64.cmake#L30-L45

  • Extension of config files to allow more configurability? An RFC would be useful to start that discussion too.

  3. Build scripts for embedded

  • The LLVM Embedded Toolchain for Arm lives in a separate GitHub repo that only has build scripts (and a few patches). Would it be useful to try to upstream such scripts into LLVM, to keep everything in one place and make it easier to create build bots?
    – It downloads an external (non-LLVM-repo) library.
    – It builds multiple versions of this library in a loop.

  • Make use of CMake?

  • Needs a concrete example to start the discussion?

  • Need to consider Windows: the split between runtime and SDK components.

  • → RFC on Discourse.

  4. What can we test within the current LLVM repo in embedded configs?

  • compiler-rt? Build only? Run on a model?

  • libc++ plus its tests, run on a model.

  • The LLVM test suite: factor out tests that depend on POSIX features missing in embedded configs.

  • Would the community be willing to keep such buildbots green?

  • Tests may be difficult to run on real hardware, so models are used.

  • Update the lit tool to batch-run the tests? Fuchsia did some prototyping for this; it still needs work.

  5. 64-bit source code locations

  • Lack of reviewers; anyone interested, please ping on the review.

  6. LLVM libc

  • Had a closer look and compared it to the Arm library.

  • Next: post on Discourse.

  • How should printf and locale be tied together? They may need to be optimized together.

  • The Arm library has a modular printf: it can send output via a serial port and can skip a lot of features like FP support.
    – For example, LLVM libc is designed to let one specify which conversions they want: https://github.com/llvm/llvm-project/blob/main/libc/src/stdio/printf_core/converter.cpp (a hypothetical illustration of the idea follows below).

  • Clang can optimize more when building specifically for LLVM libc, by knowing a bit more about it.
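
A hypothetical illustration of that design point (the macro and function names here are invented; LLVM libc's real implementation is the converter.cpp linked above): when conversions are dispatched as separable pieces, a build can compile out the ones it never uses, e.g. all floating-point formatting:

```cpp
#include <cstdarg>

// Stubs standing in for real formatting code.
static void emit_int(long v) { (void)v; }
#ifndef MINI_PRINTF_NO_FLOAT
static void emit_float(double v) { (void)v; } // compiled out if unwanted
#endif

// Invented mini-dispatcher: each conversion is a separate piece, so
// omitting one removes its code (and, for %f, all FP support below it).
void dispatch_conversion(char conv, va_list &ap) {
  switch (conv) {
  case 'd':
    emit_int(va_arg(ap, int));
    break;
#ifndef MINI_PRINTF_NO_FLOAT
  case 'f':
    emit_float(va_arg(ap, double));
    break;
#endif
  default:
    break; // unknown conversions ignored in this sketch
  }
}
```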

  7. libc++ on embedded systems

  • Not including features that are not needed.

  • The AUTOSAR Adaptive Platform is built on C++14 including the libraries, so there is a lot of uptake in the automotive domain.

  • Invite the libc++ maintainer to start a discussion about scaling down to embedded systems.

  • Peter had a discussion with Louis Dionne recently. Ideas: subsets can be very useful, e.g. one that does not allocate memory; can these be separated out, with only the relevant tests run? Overall, he seems interested in building such support.

  • Agreed to reach out to Louis and invite him. Volodymyr to organize.


Hi,

Apparently, the event in this post shows the time 1 hour off - I cannot edit it now, will try to fix for next time.

The correct time is in the calendar: “On Thursdays at 9am PST/5pm BST/6pm CET”

2022-06-23 LLVM libc, MC/DC coverage

Participants

  1. Nigel Perks
  2. Rouxy
  3. Alan Phipps
  4. Simon Wallis
  5. Daniel Thornburgh
  6. Simon Butcher
  7. Peter Smith
  8. Siva Chandra
  9. Petr Hosek
  10. Shivam Gupta
  11. Volodymyr Turanskyy
  12. Mikhail Maltsev
  13. Claudio Bantaloukas

Discussion

  1. The Discourse event time is shown incorrectly on the main page; some people were confused, sorry about that. Volodymyr will see how to fix this.

  2. libc++ for embedded
  • Louis was not able to join this time; Volodymyr to check if he will be available to attend next time.

  3. Peter: follow-up on the past LLVM libc discussion.
  • After the investigation and internal discussion, the Arm team confirmed that LLVM libc looks very promising, so it wants to contribute.

  • Some of the topics we want to touch on/discuss upstream are below.

  • Different size variants of some functions, like a smaller memcpy, are needed for MCUs.

  • Customization: how to manage interrelated pieces of functionality?

  • HW abstraction layer: if there is no file system, but there is a serial port or a debug interface with semihosting, how to redirect IO?

  • Process/app startup when there is no OS:

    – Peter: is this something that libc would include, or would it just document how to add the startup code?
    – Siva: if there is a standard/convention for startup code, then happy to add it to the library. Question: how to test it? We would need to add buildbots; then how could people debug failures in such a buildbot?

  • Some aspects of startup (setting up the stack, heap, etc.) are architecture specific but required. Other libs can be used as examples, e.g. newlib.

  4. Alan: upstream support for MC/DC code coverage, which is useful for embedded functional-safety applications. (A small example follows these minutes.)
  • MC/DC is Modified Condition/Decision Coverage (see Modified condition/decision coverage - Wikipedia); it is required for functional-safety code at the higher ASIL levels.

  • An implementation will be upstreamed in the next few months.

  • Who is interested? It would be great to get help with code review.

  • Peter: Arm is interested in helping with the review. A team at Qualcomm can probably be interested too.

  • Will it work for bare-metal? The compiler-rt implementation is not suitable for bare-metal now; adding that support should be the next step.

  • Petr: happy to help with code coverage too; the Fuchsia team has some experience. Fuchsia uses coverage in the kernel.

    – There is now a lot of similar code across the sanitizers, code coverage, and the profiler; a refactoring was suggested last year on Discourse (Using C++ and sanitizer_common in profile runtime): the profiler runtime differs from the rest of compiler-rt, so it was suggested to rebuild it in C++ (from C) on top of the sanitizers’ common code. sanitizer_common is already good at abstracting the underlying OS/target, so migrating to it and improving it would improve all use cases together.
    – Fuchsia may work on this in the coming months. Pure refactoring would be useful too, e.g. removing multiple ifdefs and structuring the code better.
    – A team at Meta is doing interesting work on coverage now too, intended for mobile phones, e.g. adding a boolean coverage mode (instead of counters). The Fuchsia team is talking to them to coordinate.

  • Peter: an important point is how to define the requirements to fit all use cases, from MCUs to more high-end systems. E.g. if memory is small, have an interface to write the coverage/profile info out through the debugger interface.

  • What are the key changes in the MC/DC coverage patch? An additional level of analysis for boolean expressions, plus an additional counter object to keep track of.
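
To make the criterion concrete (a textbook-style example, not taken from the patch): MC/DC requires each individual condition to be shown to independently change the decision's outcome, which is stronger than plain branch coverage of the whole decision.

```cpp
// For the decision a && (b || c), one minimal MC/DC test set is four
// vectors (n + 1 for n conditions); each pair below differs in a
// single condition while flipping the outcome:
//   (1,1,0) -> true  vs (0,1,0) -> false   : a shown independent
//   (1,1,0) -> true  vs (1,0,0) -> false   : b shown independent
//   (1,0,1) -> true  vs (1,0,0) -> false   : c shown independent
// Branch coverage only needs the decision to be true once and false
// once; MC/DC needs this per-condition evidence, hence the extra
// per-condition tracking in the instrumentation.
bool decide(bool a, bool b, bool c) { return a && (b || c); }
```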

Update on the incorrect event time: Turned out there is a related known issue, so until it is fixed, the event summary panel is removed.

Please use the calendar invite to get the correct date and time.

Sorry about the confusion yesterday and thank you for joining anyway!

2022-07-21 LLVM libc, libc++

Participants

  1. Alex Brachet
  2. David Finkelstein
  3. Guillot Tony
  4. Johannes Doerfert
  5. Michael Jones
  6. Mikhail Maltsev
  7. Nigel Perks
  8. Petr Hosek
  9. Prabhu Rajasekaran
  10. Simon Butcher
  11. Simon Wallis
  12. Siva Chandra
  13. Stephen Hines
  14. Tue Ly
  15. Peter Smith
  16. Volodymyr Turanskyy

Agenda

  • Libc
  • Libc for GPU
  • Libc++

Discussion

(Peter Smith) LLVM libc HAL (hardware abstraction layer) investigation

  • Investigation done.

  • Peter will write up the overview of different approaches in Discourse.

  • The newlib/picolibc and Arm libs approaches were analyzed.

  • The HAL in both libraries is split into:

    • Boot-up code (stack, heap init) - may not be included in the lib and may instead be provided by the user. Very HW dependent; may need assembly code.

    • IO - the libraries take different approaches: newlib has syscalls similar to the POSIX ones for retargeting (~20 functions to reimplement), while the Arm libs have just a set of lower-level routines to implement that back the higher-level ones.

    • malloc - the linker script needs to allocate some memory region for malloc to use.

  • Embedded systems can implement semihosting via a debug interface, a serial port, or similar.

  • The next investigation step is to map the above onto the LLVM libc design.

  • Siva: a threading abstraction layer should be considered too, as part of the HAL. Agreed.

  • LLVM libc already has a level of abstraction for users to implement when retargeting, including platform-specific hooks for IO.

  • LLVM libc malloc: the approach is not to do anything special for it; the platform can reimplement malloc.

  • Questions: boot code and device-access code - is there a standard or similar that we can adopt in LLVM libc? If there is no standard, how useful is it to come up with your own HAL? Answer: newlib can be considered a de-facto standard (libgloss is the implementation of its HAL). It exists largely for convenience, to separate out the retargeting code.

  • init for arrays and constructors is missing in libc, but is expected to be committed very soon.

  • It may be easier to just try to build LLVM libc for bare-metal to see how it all comes together. Starting with a semihosting implementation would be easiest for debugging/testing. (A retargeting sketch follows below.)

  • It may be useful to have some demo code in addition to the Discourse discussions.

  • tests/integration_tests in the libc project use libc’s own startup code; init is still missing though.

  • Petr: inits and finis exist in compiler-rt, so they could potentially be reused.
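
As a concrete illustration of the IO-retargeting point, here is a newlib-style retarget of _write() to Arm semihosting; this is not LLVM libc's actual porting interface, which was still being designed at the time. SYS_WRITE0 (0x04) and the bkpt 0xAB trap are the standard Arm M-profile semihosting convention:

```cpp
#include <cstddef>

// Arm M-profile semihosting call: operation number in r0, argument in
// r1, trap with "bkpt 0xAB". SYS_WRITE0 (0x04) prints a NUL-terminated
// string via the debugger or model.
static int semihost_call(int op, const void *arg) {
  register int r0 asm("r0") = op;
  register const void *r1 asm("r1") = arg;
  asm volatile("bkpt 0xAB" : "+r"(r0) : "r"(r1) : "memory");
  return r0;
}

// newlib-style retargeting hook: everything written to stdout/stderr
// ends up on the semihosting console.
extern "C" int _write(int /*fd*/, const char *buf, size_t len) {
  char tmp[64];
  for (size_t done = 0; done < len;) {
    size_t n = len - done;
    if (n > sizeof(tmp) - 1)
      n = sizeof(tmp) - 1;
    for (size_t i = 0; i < n; ++i)
      tmp[i] = buf[done + i];
    tmp[n] = '\0';            // SYS_WRITE0 wants a C string
    semihost_call(0x04, tmp); // SYS_WRITE0
    done += n;
  }
  return (int)len;
}
```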

(Johannes) LLVM libc and libc++ for GPUs

  • GPUs will need libraries in the future.

  • These will probably not be standard compliant, but they will face the same kinds of issues as embedded libs do.

  • How does the above HAL map to GPUs?

  • No need for startup code; it is already handled or not needed.

  • IO support varies (e.g. printf may be available); malloc may or may not be available, etc.

  • The library code will need to be compiled to LLVM IR and then LTO’d with the user code.

  • A math library exists, but it is not upstream since it is not clear where to put it. An option would be to also build libc for GPUs. The risk is that this binds the math library to the LLVM project and its libs, instead of staying compatible with LLVM and the multiple other options users may be using now.

  • Function definitions in the header files must match the implementation on the device; thus one may run into issues if the host header files are different.

  • If headers carried only declarations without definitions, that would be easier. E.g. the host may have a one-assembly-instruction implementation of some function that would not work on the device.

  • An example issue is a mismatch of object sizes on the host and device. GPUs try to match the data layout of the host to allow easy data transfer between host and device.

  • Is this an ABI question?

  • OSes still allow different definitions of, say, long, even on the same actual host hardware.

  • Special headers may be used as overlays over the host headers to resolve conflicting definitions.

  • In the original Arm ABI there was the question of whether it is possible to link objects compiled with different compilers and their own headers; the solution was a set of portability macros. Portability is at odds with performance, and in reality this approach was never properly implemented. In practice most things work, except for cases like longjmp and other more complex constructs. Peter will try to find and share a link to the relevant ABI document.

  • Users always expect that everything that works on the host will also work on the device; this is a difficult expectation to meet, hence the desire to reuse host headers as much as possible.

libc++

Topics that we want to discuss in the future calls:

  • How to configure libc++ builds to exclude functionality that is not needed (i.e. unused code, to minimize code size)?

  • It would be good to have a size-optimized version (vs the performance-optimized one now), e.g. a separate config with a different trade-off for things like string/int conversions.

  • Both libc and libc++: how to distribute them? libc is built from source while libc++ is used as a binary; would libc++ source builds be beneficial as well, especially for embedded? They allow fine-tuning of build options. There are examples to learn from, e.g. RISC-V toolchains (e.g. from Embecosm) are said to be able to compile libraries on demand. Should this question be part of the multilib discussion?

  • Build systems: CMake has a set of predefined configs; multi-build generators can be used to build libc++ in multiple configurations.

  • The Fuchsia toolchain ships a number of library variants; the multilib logic is hardcoded inside the driver. It could be moved out into a base class for reuse, or even into an external configuration file.

Is there any progress on “MC/DC” in LLVM?


Is there any progress on “MC/DC” in LLVM?

Yes, I am starting to get things together for the review; I apologize that I have been delayed by other things :wink: It will be soon.

-Alan

2022-08-18 Multilib support, LLVM libc, libc++, 16-bit pointers

Participants

  1. Stephen Hines

  2. Stefan Granitz

  3. Simon Wallis

  4. Javid

  5. Michael Jones

  6. Peter Smith

  7. Siva Chandra

  8. Petr Hosek

  9. Nigel Perks

  10. Guillot Tony

  11. Prabhu Rajasekaran

  12. Mikhail Maltsev

  13. Volodymyr Turanskyy

Agenda

  • Multilib support
  • Libc
  • Libc++
  • 16-bit pointers

Discussion

Multilib support (Mikhail)

  • Toolchain class can be subclassed to provide required features.

  • There is a prototype of a plugin - each vendor can implement their own as needed.

  • The idea is to provide a way to load plugins that can match target triples and then implement required logic for multilib support for the target.

  • Next step - implement multilib selection.

  • Support for plugins requires only a small amount of code. Some files may need to be moved out of the implementation folder to make them part of public API.

  • An alternative would be to create a DSL like GCC spec files to transform command line options into library selection options. May need rather tricky logic, thus difficult to design and implement reasonable DSL.

  • Q (Petr): multilib class exists in LLVM - can it be extended? A: This is enough, but each vendor may need to implement their own specific logic. There is no multilib implementation for baremetal Arm now. It may be beneficial to keep this vendor specific code downstream.

  • Q (Petr): what about performance and security of plugin implementations? A: could be an issue, indeed.

  • Peter: plugin DLLs on Windows may be tricky to maintain too.

  • Q (Peter): Would a data driven DSL/config file cover the need, rather than a complex executable DSL? Different targets may need different sets of command line options to take into account.

  • Q (Petr): Should we investigate support for actual (or subset of) GCC spec files? Minimal implementation may be small and may be easier for people to migrate from GCC. Was it not supported on purpose, based on LLVM design philosophy? Yes, there was such a discussion years ago, does it make sense to reopen the discussion again? Config files are becoming more configurable now too, so this may support the argument to look again into spec files support.

  • Mikhail will have a look into spec files option as well as a simple DSL to describe multilibs. Would be good to start a thread on Discourse - we may start with a high level options overview to decide on the direction (without spending much time for multiple prototypes upfront).

  • Q Mikhail: Is it possible to build runtimes with cmake by getting the list of multilibs from clang (like GCC does)? Petr: Not now. Cmake and clang info may not match now - no way to enforce. Fuchsia team is starting to look into this.
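
Purely to illustrate the plugin idea discussed above (the prototype itself was not shown, and every name here is hypothetical): a vendor plugin would claim a set of target triples and map the driver's option state to a library directory.

```cpp
#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/Triple.h" // moved to llvm/TargetParser/ in later trees
#include <string>

// Hypothetical interface only; not the actual prototype's API.
struct MultilibSelectorPlugin {
  virtual ~MultilibSelectorPlugin() = default;
  // Does this vendor plugin handle the given target triple?
  virtual bool handlesTriple(const llvm::Triple &T) const = 0;
  // Map relevant command-line options to a library subdirectory,
  // e.g. {"-mfloat-abi=hard"} -> "thumb/v7e-m/fpv4-sp/hard".
  virtual std::string
  selectLibDir(const llvm::Triple &T,
               llvm::ArrayRef<llvm::StringRef> Args) const = 0;
};
```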

Libc

  • Peter: the startup-code write-up on Discourse is still to do - an informative analysis of the approaches.

  • Siva: not much news for now.

  • Stefan: Q: wants to use libc for embedded JIT. Can it be compiled as PIC (position-independent code)? Siva: yes, PIC should be possible, as there should not be anything preventing it; otherwise please report an issue.

  • Stefan: the idea is to compile only the required functions with the JIT and load them onto the embedded target: compiled and linked on the host, then transferred to the device and executed there.

  • Q: is libc designed with a static build in mind, or are incremental (per-function) builds possible? A: if a given function does not depend on global data, that should be possible.

  • Note that there are different PIC models, e.g. PIE vs actual PIC. PIE requires data to be at a specific place; there are also models with a register holding the base address for all data.

  • libc does not use virtual functions, and thus does not have the vtable issue that would need addressing in a PIC model.

  • Libc build instructions are provided on the LLVM site: Building libc using the runtimes build setup — The LLVM C Library

Libc++

  • Three main points from last time: building without particular subsets of features, code-size-optimized versions of some features, and source vs binary distributions.

  • It would be interesting to have an overview of embedded C++ libraries: what special features do they provide, and which of them may be relevant to libc++?

16-bit pointers?

  • Q (Petr): the Armv6-M M0+ has only 256k of address space but requires full 32-bit pointers, which use a lot of space; is it possible to use 16-bit pointers?

  • Peter: no; such an approach could only work for the M0+, since even an M3 can use megabytes of RAM. One piece of advice is to put all global data into a static struct so that the compiler uses relative addresses. (See the sketch below.)

  • LTO may be an option; however, LTO does not play well with section placement in embedded.

  • Literal pool merging can save a bit of size as well.

  • LTO could be a good future topic; it would be nice to invite someone who works on LTO and ThinLTO. Qualcomm did a relevant presentation a few years ago.
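
A small sketch of the "globals in one struct" advice from the list above (the variables are invented): the compiler can then address every global as base-plus-offset from a single literal-pool entry, instead of materializing one 32-bit address constant per variable.

```cpp
#include <cstdint>

// One struct, one base address: the accesses below compile to a single
// address materialization plus small immediate offsets.
struct Globals {
  uint32_t tick_count;
  uint16_t rx_head;
  uint8_t rx_buf[128];
};
static Globals g;

uint32_t ticks(void) { return g.tick_count; }
void push_rx(uint8_t b) {
  g.rx_buf[g.rx_head++ % sizeof(g.rx_buf)] = b;
}
```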

Hi Alan, thanks for the feedback!
Is there a way to support?

-Michael

2022-09-15: Dev Meeting round table, LTO, libc initialization

Participants

  1. Teresa Johnson

  2. Todd Snider

  3. Prabhu Rajasekaran

  4. rouxy

  5. Simon Wallis

  6. Michael Jones

  7. Siva Chandra

  8. Daniel Thornburgh

  9. Stephen Hines

  10. Guillot Tony

  11. Petr Hosek

  12. Tue Ly

  13. Pirama

  14. Peter Smith

  15. Simon Butcher

  16. Zoltan Lipp

  17. Volodymyr Turanskyy

Agenda

Discussion

Multilib support

  • No updates.

Round table in LLVM Dev Meeting

  • The LLVM Compiler Infrastructure Project Nov 8-9, 2022

  • Who is going?

    • Siva, Prabhu, Petr, others - would be nice to have a discussion in person.

    • Arm: Volodymyr will check if anyone is going.

  • Action item for all: highlight this to colleagues and suggest they join.

  • Volodymyr to cancel the call in the same week - done; removed from the Google calendar.

LTO for embedded

  • Background - link to slides in the agenda.

  • Recent discussion Linker generated attributes for LTO - does lld already do this?

  • Issues to address:

    • Cannot inline across output sections => add information about output sections to inform the linker.

    • How to place sections at particular addresses => extend IR with information about named sections to pass to LTO and take into account.

    • So changes are likely required in both IR and LLD.

  • There are known downstream implementations, but nothing upstream yet.

  • Teresa (co-author of ThinLTO):

    • Many of the changes are not LTO specific; they are required in other generic passes.

    • The patches from the 2017 presentation were not published, so do we need to start from scratch? Or check whether anyone from Qualcomm can share the patches.

    • The changes are expected to be accepted upstream without major concerns.

  • Todd (author of the Discourse thread above): work was done at TI; the IR is extended to implement the features mentioned.

  • Teresa: it would be easier to go without IR modification in the linker, keeping the existing interface.

  • Todd: would it make sense to have a special “embedded” type of LTO?

  • Teresa/Petr: a recent relevant discussion about fat-lto-objects: [RFC] -ffat-lto-objects support + WIP patch ⚙ D131618 [WIP][Do NOT review] LLD related changes for -ffat-lto-objects support

  • The Arm linker wraps IR into ELF files to make library selection easier, but this seems too specific a solution, tied to the Arm Compiler toolchain.

LLVM libc initialization (Peter)

  • Discourse post in the agenda.

  • The picolibc approach presented there can be reused for simple configurations and may be a good starting point for LLVM libc enablement. (A minimal sketch follows below.)

  • Eventually it would be good to have a sort of template for different types of embedded targets.

  • Siva: looks good. Now we need to provide the examples.

  • Question: what is the testing strategy for such code? We could use QEMU.
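
A minimal sketch of the conventional startup flow referenced above, using picolibc-style linker-script symbol names (the symbol names are an assumption about the linker script; stack-pointer setup is assumed to happen earlier, in the reset vector):

```cpp
#include <cstdint>

// Symbols expected to be defined by the linker script (assumed,
// picolibc-style names).
extern "C" uint32_t __data_source[], __data_start[], __data_end[];
extern "C" uint32_t __bss_start[], __bss_end[];
extern "C" void (*__init_array_start[])(void);
extern "C" void (*__init_array_end[])(void);
extern "C" int main(void);

// Copy initialized data from ROM to RAM, zero .bss, run constructors,
// then enter main; with no OS there is nowhere to return to.
// (A real crt0 is plain C, where calling main is well-defined.)
extern "C" void _cstart(void) {
  const uint32_t *src = __data_source;
  for (uint32_t *dst = __data_start; dst < __data_end; ++dst, ++src)
    *dst = *src;
  for (uint32_t *dst = __bss_start; dst < __bss_end; ++dst)
    *dst = 0;
  for (void (**ctor)(void) = __init_array_start; ctor != __init_array_end;
       ++ctor)
    (*ctor)();
  main();
  for (;;)
    ;
}
```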

2022-10-13: libc++, multilib followup

Participants

  1. Siva Chandra

  2. Simon Wallis

  3. Tue Ly

  4. Michael Jones

  5. Nigel Perks

  6. David Spickett

  7. Pirama

  8. Petr Hosek

  9. Peter Smith

  10. Volodymyr Turanskyy

Agenda

  1. Any preparation for the LLVM Dev Meeting (The LLVM Compiler Infrastructure Project, Nov 8-9, 2022) round table?
    Reminder: there will be no sync-up call in November because the LLVM Dev Meeting happens at the same time.

  2. LLD features for embedded development.

  3. Next steps for the libc++ discussion related to embedded use cases.

  4. Any follow up on the previous topics on libc.

Discussion

LLVM Dev Meeting round table

  • David and Kristof from Arm will attend the LLVM Dev Meeting.

  • No specific preparation for the round table discussed.

LLD embedded features (Peter)

  • The Arm team started working on some features in LLD for embedded use cases:

    • Big endian was already supported for AArch64; we are adding support for Arm for completeness.

    • CMSE (Cortex-M Security Extensions, TrustZone-M) support for linking code for the secure and non-secure worlds.

  • Patches will be upstreamed soon; reviews are welcome.

libc++ for embedded (Volodymyr)

Looking at existing libraries and guidance targeting embedded use cases, the following are some of the most usual configuration options/needs:

  1. Cross compilation.
  2. Cross testing.
  3. No exceptions.
  4. No RTTI.
  5. No dynamic memory allocation, new/delete must not be used, placement new is allowed.
  6. No IO, including indirect dependencies, e.g. force the demangler not to use any IO.
  7. No locale support (mostly used by IO).
  8. No floating point support.
  9. No other big features like file system.
  10. Options to simplify porting threading to RTOS.
  11. Options to simplify porting clocks to RTOS.
  12. Options to simplify porting to different C libraries.

Other topics:

  1. Code size vs performance hand-optimized implementations.

    • Algorithms may be a good example here.
  2. Multiple small switches vs one embedded configuration that configures everything at once.

    • Individual options seem to be more practical since they allow creating custom configurations.

Discussion

  • Fuchsia already uses multiple versions of libc++ with tweaks like above.

  • Actions? All: try to implement and upstream such configuration options as people work on toolchains with specific libc++ configurations.

  • Pre-submit builders are needed to check that the options keep working. Is this a prerequisite from the maintainers anyway?

Follow up on previous topics

Multilib support status

  • The plugin prototype presented by Mikhail was finished, so we now know what information is needed to make the whole approach work. The next question is the best way to define this information.

  • Still need to evaluate DSL options for capturing the same information.

  • One idea is to have build attributes with a partial order defined on them, so that the multilib logic can search for and pick the best match. Fuchsia uses a similar idea of “priority” to pick the best variant.

  • DSL options:

I’ll ask this question here since I’ve mentioned MC/DC in the past and there is interest for the embedded community.

I am ready to put up a review for MC/DC (at long last!), but I’d like to create three individual reviews to make the process more manageable. There are 107 modified files, though the changes themselves aren’t so bad, and 50% of the touched files relate to testing.

However, the patches would need to be pushed together since they really are not independent of each other, and I want to know if there is precedent for that. I know phabricator has a concept of parent/child review, but the implication seems to be that each review patch can be pushed independently once it’s approved.

Also, I’m looking for reviewers who can assist. If you’re interested, let me know!

Thanks!

I recommend posting a separate thread to the list, just in case it gets buried at the end of this thread.

The most recent threads I can find on code coverage are:

These may be a good source of some authoritative reviewers.

I’m willing to help with general comments, but I personally have no prior experience in the code-coverage parts of LLVM, so I fall into the category of someone who can help but won’t be able to approve.

We’re going to have a round table for LLVM Embedded Toolchains Working Group at 2022 LLVM Developers’ Meeting on November 8th at 5:00-5:30pm.

At San Martin - Lower Level ?

Yes, I believe that all roundtables will be in that room.

2022-12-08: Multilibs, LLVM Dev Meeting

Participants

  1. Peter Smith
  2. Michael Platings
  3. Michael
  4. Piotr Przybyla
  5. Rich Fuhler
  6. Simon Butcher
  7. Siva Chandra
  8. Yvan Roux
  9. Prabhu Rajasekaran
  10. Todd Snider
  11. Petr Hosek
  12. Volodymyr Turanskyy

Agenda

  1. Multilibs update.
  2. LLVM Dev Meeting follow up.

Discussion

Multilibs update by Peter Smith

LLVM Dev Meeting follow up

  • Summary by Prabhu:

    • Google, Qualcomm, Nintendo, TI, … participated in the round tables related to embedded toolchains.

    • Downstream linkers → interested in sharing experience. Maybe a topic to follow up on in the WG.

    • Security patches from TI may be shared upstream soon.

Related:

  • It would be nice to discuss embedded-specific linker features and convince the upstream maintainers they are useful, e.g.:

    • Built-in compression, e.g. for RW data (copied from ROM to RAM and expanded).

    • Placing a variable at a specific address, e.g. over a system register or IO ports. (See the sketch below.)

    • When multiple banks of RAM are available, the linker needs a way to distribute segments across them.

  • Linker script support in LLD (vs GNU), plus support for embedded LTO.

  • The debuggability of linker scripts is not good; more errors/warnings/traces would be useful for understanding the choices the linker made.
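
As background for the fixed-address feature above, the two common ways of doing it today (register layout and address invented for illustration): casting a literal address needs no linker support, while placing a real object requires the linker script to pin its section, which is exactly where convenient linker support matters.

```cpp
#include <cstdint>

// Invented register block for illustration.
struct UartRegs {
  volatile uint32_t DATA;
  volatile uint32_t STATUS;
};

// 1) Cast a literal address: no linker involvement at all.
inline UartRegs &uart() {
  return *reinterpret_cast<UartRegs *>(0x40001000u);
}

// 2) Define a real object in a named section; the linker script must
//    place ".uart_regs" at the device address (typically as NOLOAD),
//    which is the kind of placement being discussed.
__attribute__((section(".uart_regs"))) UartRegs uart_regs;
```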

LLD related reviews:

Next topics for the WG: it would be useful to discuss and agree on a set of important linker features for embedded, to start promoting them upstream.

Multilib RFC is here: [RFC] Multilib

2023-01-05: Multilibs, LLD

Participants

  1. Michael Platings

  2. Michael Jones

  3. Nigel Perks

  4. Petr Hosek

  5. Siva Chandra

  6. Volodymyr Turanskyy

Agenda

  1. Multilibs update.

  2. LLD key embedded features.

Discussion

Multilibs by Michael Platings

  • The prototype (⚙ D140959 RFC: Multilib prototype) and the RFC ([RFC] Multilib).

  • Layering of libraries - a new use case.

  • Petr Hosek on use cases:

    • Fuchsia: the existing LLVM multilib implementation is used; there is no need for multiple incompatible variants of the libraries. It is mostly used for optimization, like with/without exceptions, or for different ABIs (e.g. with sanitizers, where instrumented libraries can be layered on top of non-instrumented ones as a fallback). The multilib logic is currently hardcoded.

    • Pigweed: traditional embedded; the use case is similar to the LLVM Embedded Toolchain for Arm.

    • Can we come up with a way to unify these two use cases, even if some migration is needed to converge?

  • One vs multiple include directories: do we need to rely on sysroots or not?

    • Fuchsia only needs one include directory: the libraries share the same API and differ only in ABI.

    • No other issues were suggested.

    • Can we have layered header-file includes, similar to the layered libraries described above? More specific first, then generic; it is already used like that for multiarch support in libc++.

LLD key embedded features

  • For example, the picolibc build system needed to be patched recently because LLD has limitations in placing segments in memory, so we are already running into practical issues.

  • There is a list of embedded linker features in the previous meeting’s minutes.

  • Volodymyr to reach out to the LLD maintainer to arrange a discussion in one of the following sync-ups.

  • The Fuchsia team is comparing GNU ld vs LLD; there are some known issues - a list can be started on Discourse.

  • There was a discussion at the last LLVM Dev Meeting about LLD as well: diagnostics were mentioned as a major issue.

  • Google Summer of Code is coming soon; LLD usability improvements could be a good fit.

  • GitHub: we can label relevant issues there to make them easy to find.
