LLVM Qualification WG sync-ups meeting minutes

Hello everyone,

This thread is dedicated to sharing the meeting minutes of the LLVM Qualification Working Group. We will use this space to publish summaries and action items from our monthly sync-ups, in order to keep the broader community informed.

:spiral_calendar: Meeting notes are initially drafted collaboratively in a shared FramaPad and then archived here after each session for long-term reference and discussion.

:link: Notes (FramaPad): MyPads

:compass: The LLVM Qualification WG was formed following community interest in exploring how LLVM components could be qualified for use in safety-critical domains (e.g. automotive, oil & gas, medical). We welcome contributions from all perspectives: compiler developers, toolchain integrators, users from regulated industries, and others interested in software tool confidence, safety assurance, and systematic quality evidence.

If you’re interested in participating or following along, feel free to join the discussions here or connect via the LLVM Community Discord in the #fusa-qual-wg channel.

Warm regards,
Wendi
(on behalf of the LLVM Qualification WG)


Notes - July 2025

Participants

EU/Asia-friendly (Tuesday 2025/07/01, 5:30PM JST, 1h)

  • Carlos Ramirez (WbyT)
  • Ferdinand Lemaire (WbyT)
  • Jessica Paquette (WbyT)
  • Jorge Pinto Sousa (Critical TechWorks)
  • Mikhail Maltsev (Arm)
  • Oliver Pajonk (Elektrobit)
  • Petar Jovanovic (HTEC)
  • Petter Berntsson (Arm)
  • Sameera Deshpande (Quadric)
  • Sandeep (Arm)
  • Shivam Gupta (Raincode Labs)
  • YoungJun (NSHC)
  • Wendi Urribarri (WbyT)

Americas-friendly (Wednesday 2025/07/02, 6:00AM JST, 1h)

  • Alan Phipps (Texas Instruments)
  • Allen Miller (Texas Instruments)
  • Chris Apple (self/RTSan owner)
  • Florian Gilcher (Ferrous Systems)
  • John Regehr (University of Utah/Alive2)
  • Lucile Nihlen (Google)
  • Nigel Drago (Quadric)
  • Oscar Slotosch (Validas)
  • Pete LeVasseur (WbyT)
  • Petr Hosek (Google)
  • Todd Snider (Texas Instruments)
  • Wendi Urribarri (WbyT)

Agenda

  • Welcome & Intros
  • Code of Conduct
  • Collaboration format
  • Early topics & activities
  • Focus discussion: Requirements & Traceability
  • Next steps?

Links

Highlights

EU/Asia

  • Approaches to Linking Tests and Requirements: Jessica discussed different methods for associating tests with requirements, such as adding text to existing tests or creating a directory to reference them. She suggested that adding text to the tests might be the most practical initial step. Wendi noted this down as a potential solution.
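
As an illustrative sketch of Jessica's first option (adding text to the tests themselves): a lightweight comment tag in each test could later be harvested into a requirement-to-test map. The `REQ:` tag format, the requirement IDs, and the `.ll` file layout below are all hypothetical, not an existing LLVM convention:

```python
import re
import tempfile
from pathlib import Path

# Hypothetical convention: each test carries a comment line such as
#   ; REQ: CLANG-FE-0042
# linking it to a requirement ID. This scanner builds the reverse map
# (requirement -> tests), one half of bidirectional traceability.
REQ_TAG = re.compile(r"(?:;|//)\s*REQ:\s*([A-Z0-9-]+)")

def build_traceability_map(test_root: Path) -> dict[str, list[str]]:
    """Map each requirement ID to the tests that claim to cover it."""
    mapping: dict[str, list[str]] = {}
    for test in sorted(test_root.rglob("*.ll")):
        for req_id in REQ_TAG.findall(test.read_text()):
            mapping.setdefault(req_id, []).append(test.name)
    return mapping

# Demo on a throwaway directory with two annotated tests.
with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    (root / "add.ll").write_text("; REQ: CLANG-FE-0001\n; RUN: opt -S %s\n")
    (root / "sub.ll").write_text("; REQ: CLANG-FE-0001\n; REQ: CLANG-FE-0002\n")
    result = build_traceability_map(root)
    print(result)
    # {'CLANG-FE-0001': ['add.ll', 'sub.ll'], 'CLANG-FE-0002': ['sub.ll']}
```

Keeping the tag inside the test file makes the tests self-describing, so the traceability data cannot silently drift away from the tests it refers to.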

  • Leveraging Existing Specifications and Tests: Wendi inquired about existing specifications from the C/C++ working group. Jessica mentioned that implicit requirements might already exist in the test directory where clang’s behavior is checked for specific code. Jorge suggested utilizing golden samples as tests and mapping them to requirements using LLVM’s existing testing infrastructure.

  • Command Line Option Testing: Mikhail proposed checking which command line options are used by which tests during test suite runs to ensure all specified options are tested. Wendi asked about typical requirements management practices, noting the need for unique IDs and clear, verifiable descriptions.
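
Mikhail's idea could be prototyped statically by scanning RUN lines. Everything below (the option list, the RUN lines, the flag syntax) is illustrative rather than drawn from the actual LLVM test suite:

```python
import re

# Sketch: collect which command-line options the tests actually exercise
# and diff them against the set of options we claim to support.
RUN_LINE = re.compile(r"RUN:\s*(.*)")
FLAG = re.compile(r"(?<!\S)(-{1,2}[A-Za-z][\w-]*(?:=\S*)?)")

def options_exercised(run_lines: list[str]) -> set[str]:
    """Return the set of option names (without =value) seen in RUN lines."""
    seen = set()
    for line in run_lines:
        m = RUN_LINE.search(line)
        if not m:
            continue
        for flag in FLAG.findall(m.group(1)):
            seen.add(flag.split("=", 1)[0])
    return seen

# Hypothetical "specified" options and test RUN lines.
specified = {"-O1", "-O2", "-fno-exceptions", "-ffreestanding"}
tests = [
    "; RUN: clang -O2 -fno-exceptions %s",
    "; RUN: clang -O1 %s",
]
covered = options_exercised(tests) & specified
print(sorted(specified - covered))  # options no test exercises
# ['-ffreestanding']
```

A real implementation would more likely hook into the test runner at execution time, as Mikhail proposed, rather than parse RUN lines textually; the static scan is only the cheapest first approximation.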

  • Requirements Management Tools: Wendi shared links to free and open-source requirements management tools, mentioning Basil. Oliver described a similar tool used for tracing links between requirements, tests, and other elements, capable of generating coverage reports. Jorge offered to investigate what the Eclipse SDV/S-Core group uses for requirements.

  • Automating Traceability: The discussion addressed automating the mapping between specifications and tests, either within the tests or using a requirements management tool. Oliver described how traceability tools use commands with IDs to link requirements to various artifacts and check for coverage.

  • Scope and Maintenance of Specifications: Wendi raised the question of what should be specified, by whom, and how it can be maintained, initially suggesting a focus on clang and C++. Oliver cautioned that this task should not be underestimated due to the potential workload. Jessica suggested that while specifying every compiler transformation might be difficult, existing tests could catalog behavior, and tools like Alive2 could verify semantic equivalence for certain optimizations. Petar emphasized the potential enormity of the effort required for detailed specification and maintenance.

  • Black Box Testing and Trust: Petar shared experience from qualifying GCC by treating it as a black box and using clang for result comparison. They suggested that a similar approach of extensive testing might be necessary for LLVM, focusing on building trust rather than deep internal analysis. Oliver seconded this, noting the difficulty of qualifying the Linux kernel through code analysis and suggesting the possibility of safety monitors or limiting the scope of usable compiler options. Jorge mentioned that qualified commercial compilers often come with safety manuals and usage guidelines.

  • Qualification of Standard/Runtime Libraries and Linker: Petar inquired about the qualification of the standard library. Jessica suggested working on libraries after addressing clang. Mikhail expressed interest in qualifying open-source runtimes. Wendi noted this point for future discussion.

  • Next Steps and Continued Discussion: Wendi suggested continuing the discussion on the Discord channel or Discourse and encouraged participants to add their thoughts to the notes.

US/Canada

  • Proposed Work Breakdown for Qualification: Wendi presented a suggestion from JR Simoes to split the qualification work into three parts: front-end, middle-end, and back-end, with an initial focus on the C/C++ front-end due to its broad use in safety-critical applications. She noted that other languages, such as Rust, and relevant tools could be included later. Key discussion points for confidence in the use of the compiler include specifications, testing, formal verification, sanitizers, runtime diagnostics, quality of compiler inputs, known-issues analysis, and documentation (user manuals, safety manuals, release notes). The qualification of standard and runtime libraries was also added as a topic, based on a suggestion during the EU/Asia session.

  • Challenges in Defining and Tracing Specifications: Wendi highlighted critical questions regarding specifications: whether existing specifications for front-ends like Clang can be reused, how to define partial specifications if none exist, and how to trace specifications to existing open-source verification means, such as the 25,000 tests, to achieve bidirectional traceability. She also posed questions about the ownership and maintainability of these specifications. Opinions from the EU/Asia session suggested annotating tests with unique IDs linked to requirements, or grouping tests into directories associated with specific requirements.

  • Recommendations for Specification and Test Organization: Florian shared his experience with Rust, suggesting that specifications should reuse as much as possible from existing project documentation and be built in a format conducive to linking, such as HTML. Oscar pointed out that C/C++ benefits from existing ISO standardization documents, so the focus should be on LLVM-specific features rather than creating new language specifications. Both Florian and Oscar agreed that structuring tests in directories that mirror the C standard’s chapters and sub-chapters is a practical and accepted approach for C/C++ compiler qualification, making maintenance easier for both safety-critical and non-safety-critical maintainers.

  • Completeness Argument and Requirements Management Tools: Oscar emphasized the need for a “completeness argument” in qualifying open-source software, explaining that beyond code coverage, it is essential to demonstrate why test cases are sufficient, often by using equivalence classes and programming constructs to define comprehensive test strategies. Wendi inquired about the use of free or open-source requirements management tools. Oscar indicated that he uses a proprietary model for linking functional specifications to test cases.

  • Experiences with Requirements Management Tools: Florian shared his experience, stating that while tools like Sphinx-Needs are useful for general software and libraries, programming-language specifications are too dense for typical requirements-writing tools. His team opted for a custom Sphinx extension for specification-to-test tracing, finding it more suitable than trying to adapt existing tools not designed for this specific task. Pete supported this, noting that Sphinx and Sphinx-Needs were adopted for the Rust coding guidelines within the Safety Critical Rust Consortium, finding them useful for building verifications and ensuring traceability.

  • Feature-Based Qualification Structure: Oscar suggested structuring the qualification based on logical features (e.g., language compliance, optimization rules, target-specific features) rather than technical components like front-end, middle-end, and back-end, as this logical structure is more relevant for certifiers who may not understand internal compiler architecture, and that tool qualification is typically a black-box activity. John clarified that the split front-end/middle-end/back-end is driven by the capabilities of existing formal verification tools like Alive2 and their translation validation work, which currently can validate optimizations on the middle-end and back-end but not the front-end. Oscar and Wendi agreed that tools used for tool qualification do not need to be qualified themselves, simplifying the process. Florian expressed interest in separately qualifying linkers, but Oscar argued that qualifying a tool always involves documenting its environment and configuration, making it an integrated process.

Actions

Wendi:

Jorge:

  • Check the Eclipse Safety Core (S-core) for requirements management (definition, traceability) and report back

All:

  • Continue the discussion on next steps in the Discord channel & Discourse

[LLVM Qual WG] arm-tv demo with @regehr

2025/07/31 8:30AM JST

Recording: link

Chat transcript: link

Notes by Gemini :down_arrow:

Summary

@regehr introduced Alive2, a software tool for refinement checking of LLVM optimizations, and the arm-tv tool, developed by his group, for translation validation of ARM 64-bit assembly code, explaining their methodologies and demonstrating their application in bug detection. While arm-tv has found 46 bugs, primarily silent miscompiles, scalability challenges, particularly with memory access, were acknowledged. Questions were raised about limitations during lifting, the tool’s trustworthiness, and adding new architectures.

Details

  • Introduction to Alive2 @regehr introduced Alive2, a software tool for refinement checking of LLVM optimizations. He explained that LLVM’s middle-end rewrites Intermediate Representation (IR) to improve code, often making it faster or smaller. These transformations are considered “refinements,” meaning the new code’s set of meanings is a subset of the old code’s. Alive2 uses symbolic execution of the code before and after optimization and generates queries for the Z3 theorem prover to verify that the optimized code refines the unoptimized code.

  • Alive2 Compiler Explorer @regehr encouraged attendees to try Alive2 via its compiler explorer instance at alive2.llvm.org, noting its ease of use and providing an example problem to explore. He also mentioned that papers have been written about Alive2, but hands-on use is likely more engaging.

  • arm-tv overview @regehr presented the arm-tv tool, developed by his group, which performs translation validation for ARM 64-bit assembly code. He demonstrated an LLVM function that uses `memcmp` and showed how the ARM backend optimizes it, including inline substitution of `memcmp` and replacing control flow with a conditional select. The arm-tv tool aims to prove that the assembly code is a faithful translation of the LLVM IR.

  • Translation Validation methodology @regehr explained that translation validation involves assigning a mathematical meaning to the code before and after transformation. Alive2 is used to formally represent the meaning of LLVM functions. For ARM code, arm-tv assigns meaning either by using hand-written instruction semantics derived from the manual or through a mechanically derived version from ARM’s formal description of instructions. The tool then translates the ARM code back into LLVM IR and invokes Alive2 for a refinement check.

  • arm-tv in action @regehr demonstrated arm-tv, which is called backend-tv and also supports RISC-V. The tool parses assembly into LLVM MCInst objects, lifts the ARM assembly code by building a small execution environment that resembles an ARM processor with registers initialized with “freeze poison” (an indeterminate bit pattern), and then processes the lifted instructions. This process results in a clumsy but optimizable function that Alive2 can then efficiently check against the original code.

  • Bug detection with arm-tv @regehr shared that arm-tv has found 46 bugs, primarily silent miscompiles, most of which are in the machine-independent parts of the LLVM backend. He noted that while arm-tv recently started supporting RISC-V, fewer bugs have been found compared to ARM, attributing this to the multi-backend impact of the existing bugs. @regehr mentioned that most bugs were found with the help of fuzzers and an automated testing workflow.

  • Origin and scalability challenges @regehr revealed that the impetus for arm-tv came from a conversation with JF Bastien years ago about trusting LLVM’s top-of-tree for automotive applications. @YoungJunLee inquired about handling large functions more efficiently, to which @regehr acknowledged scalability as a significant weakness of the tool, particularly with memory access, indicating that improvements to Alive2’s memory encoding are needed.

  • Limitations and trustworthiness @uwendi asked about limitations or loss of information during lifting. @regehr explained that while ARM assembly semantics are cleaner, challenges arose in lifting code with powerful pointers to LLVM’s weaker object-offset model, necessitating changes to Alive2’s memory model to support “physical pointers”. He addressed concerns about trusting arm-tv, suggesting documenting the tool’s scope and limitations, with a separate group of people needed to verify its implementation for certification purposes.

  • Tool Usage and Bug Reporting @regehr stated that currently, only his team uses arm-tv. When a bug is reported by the tool, he verifies it on an actual ARM machine to confirm the misbehavior before reporting it to the LLVM developers, ensuring the tool’s output is vetted. He also mentioned the existence of false alarms due to the complexity of the LLVM memory model.

  • Impact on LLVM specification and Future Work @regehr shared an anecdote where arm-tv uncovered an ambiguity in the interaction between the LLVM LangRef and the AArch64 ABI document, which led to a resolution and a fix in LLVM. Regarding future work, he expressed interest in supporting translation validation of inline assembly and concurrency-related aspects of LLVM IR, such as volatile accesses and interrupt handlers in embedded systems.

  • Adding new architectures Luc Forget inquired about the modularity of arm-tv for adding new ISA semantics. @regehr explained that while not “super modular,” refactoring had made it easier to add RISC-V support, and adding a third architecture would likely not be difficult, though Alive2’s lack of multiple address space support remains a limitation for GPU backends. He also highlighted that supporting a new architecture primarily requires a description of its instruction set. @regehr mentioned that for ARM, they can automatically generate the instruction semantics from ARM’s Architecture Specification Language (ASL), but for RISC-V, it was done by hand. He hopes to derive x86-64 semantics automatically in the future, as manual implementation is too extensive.


LLVM Qualification Group’s August Sync-Up Agenda

Hi all,

The main topics for the next sync-up are as follows:

Internal process update: proposed changes to membership criteria

(Thanks to @petarj and @etomzak for their inputs)

  • Discussion: Proposed changes to membership criteria to address the current internal process’s inherent challenges with active collaboration and contribution.
  • Action: If possible, please complete the Participant Introduction and Membership Criteria Form before the sync-up.

Clang C/C++ WG insights on conformance to ISO

(Follow-up on the previous discussion regarding specifications and traceability to tests for Clang)

  • Invitees: @Endill and @AaronBallman (to the EU/Asia or Americas-friendly timeslots, depending on their availabilities)
  • Related RFC: https://discourse.llvm.org/t/rfc-c-conformance-test-suite/69821
  • Current Status: Overview of Clang’s test suite (clang/test/cxx) and conformance challenges.
  • Discussion: How these insights impact LLVM Qualification Group’s goals, explore possible steps on creating better traceability and conformance for Clang.

Open Floor

  • Any additional topics, questions, comments, or suggestions from group members.
  • Review action items and assignees.

Notes - August 2025

Participants

EU/Asia-friendly (Tuesday 2025/08/05, 5:30PM JST, 1h)

  • Davide Cunial: Interest in Clang-Tidy Qualification

  • Oscar Slotosch: Contribute and Learn about TQ, AI Tool Classification

  • Carlos Ramirez: Tokyo, SW Quality, Safety & Security, PhD in human-error-centric quality

  • Erik Tomusk: Interest in qualifying GPU accelerators that use LLVM

  • Petar Jovanovic: Compiler Engineer, Open-Source Enthusiast, Static Analysis Tool

  • YoungJun Lee: Korea, Obfuscation Compiler, Alive2, SAST

  • Wendi Urribarri: Tokyo, Functional Safety Engineer, Formal Methods

Americas-friendly (Wednesday 2025/08/06, 6:00AM JST, 1h)

  • Peter LeVasseur: WbyT

  • Wendi Urribarri: WbyT

  • Vlad Serebrennikov: invitee from the Clang C/C++ Working Group

Agenda

Links

Highlights

EU/Asia

  • Meeting Kick-off and Participant Introductions: As a preliminary step before discussing membership criteria, each attendee introduced themselves, sharing their background and interest in the LLVM qualification group.

  • AI in Software Development and Qualification: Carlos and Oscar discussed the role of AI in software development, particularly in the context of ISO 26262 compliance. Oscar mentioned a study where AI tools were classified as TCL1 due to uncertainties in their qualification, unlike other tools often classified as TCL3, emphasizing the human ability to detect errors. Carlos expressed skepticism about AI-generated code making it into production for critical software within the next decade due to liability issues and AI’s current limitations in understanding broad code context, which was supported by an experiment showing AI’s failure to recognize dependencies.

  • Accelerators and Safety Critical Spaces: Erik focuses on high-performance computing and runtimes, and on bringing this technology to safety-critical spaces like automotive. They clarified that their work involves certified runtime components that depend on LLVM for qualification, positioning themselves as a runtime specialist rather than a compiler expert in this context.

  • Open Source Static Analysis Tools and Legal Challenges: Petar shared their experience in trying to open-source a static analysis tool for automotive standards like MISRA and AUTOSAR. They explained that legal issues, particularly concerning the exact wording of error reports and the reuse of standard parts, prevented the tool’s public release, despite having presented it five years prior. Davide affirmed a similar experience, noting that MISRA and AUTOSAR checks cannot currently be open-sourced, highlighting the legal complexities involved.

  • Discrepancies in Open Source Standards Access: Oscar expressed surprise regarding the difficulties with open-sourcing AUTOSAR-related implementations, as AUTOSAR specifications are freely available, unlike MISRA documents which require payment. Petar clarified that while AUTOSAR standards are free to download, reusing parts of them requires written permission from the consortium, which has been difficult to obtain. This discussion underscored the legal and logistical hurdles in leveraging open-source initiatives for automotive industry standards.

  • LLVM Component Qualification by Validas: Oscar detailed Validas’s experience in qualifying LLVM components, including LLVM-based compilers and clang-tidy. They highlighted the usefulness of clang’s feature to log optimization rules for qualification purposes and also mentioned their qualification kit for clang-tidy, which requires qualifying each rule individually. Additionally, Oscar noted their ongoing work on qualifying the C++ standard template library (STL), having identified and contributed fixes for issues in its implementation.

  • Compiler Optimizations and Safety Concerns: Petar raised a question about “wrong” optimizations in compilers, stating that as a compiler developer, they see nothing inherently wrong with optimizations and that issues are typically bugs, not inherent flaws in optimization. Oscar provided examples of optimizations that can lead to incorrect or unexpected behavior, such as integer overflow issues or deviations in floating-point calculations due to differences in host versus target accuracy. The discussion emphasized the need for careful configuration and understanding of compiler behavior in safety-critical contexts to ensure deterministic output.

  • Managing Known Bugs in Open Source Tools: Oscar discussed the importance of managing known bugs in open-source tools for qualification purposes, noting that the existence of bugs is acceptable as long as workarounds are available. They suggested that improving the classification and mapping of known bugs to specific features would significantly aid in filtering and scanning for relevant issues, making the analysis process faster and easier for developers.

  • Internal Process Changes and Membership Criteria: Wendi briefly introduced a proposal for changes in the group’s internal process, including membership criteria and participation expectations. They shared a link to the detailed description, emphasizing the need for clear expectations regarding contributions and acknowledging the limited time and bandwidth of participants.

  • Valuing Small Contributions: Wendi emphasized the significance of small contributions, stating that even a few minutes or one hour per month dedicated to the group would be meaningful and important. They encouraged attendees to review and comment on the shared document, noting that it was a lightweight version of the Security Response Group’s definition of group composition.

  • RFC Summary and Offline Review: Wendi shared a link to a summary of the main points from an RFC written in April 2023, which is related to Clang conformance. They requested that participants review it offline and share their opinions on Discord rather than the Discourse forum.

US/Canada

  • Proposed Internal Process for the Group: Wendi presented a proposal for a new lightweight internal process to address concerns about group efficiency and the need for a more structured approach. They highlighted the importance of recognizing and respecting members’ limited bandwidth and valuing small contributions, as some members might have mistakenly believed that only full-time commitment was expected.

  • C++ Conformance Testing Challenges: Wendi shared insights from their contact with the Clang C/C++ working group regarding Clang conformance specifications and testability. An RFC from April 2023 indicated that developing a C++ conformance test suite faced resource limitations, preventing any current action despite a good description of how it could be done. A significant hurdle was the licensing issues with test vendors, as they only allowed reporting pass/fail results but not opening tests to analyze failures, making error analysis impossible for open-source use.

  • Clang CXX Directory and Defect Reports: Vlad elaborated on the `clang/test/CXX` directory, noting its two main parts: `DRs` (defect reports) and everything else. They maintained the `DRs` section, which contained about 700 tests for defect reports, far exceeding other implementations. Vlad mentioned that much of the work in this directory, particularly the first 600 defect reports, was done around 2014 by Richard Smith, but progress stopped after that.

  • Challenges with External Conformance Test Suites: Vlad explained that efforts to use external test suites like those from Perennial and Plum Hall in Clang were unsuccessful due to restrictive licensing, which would essentially require these companies to forfeit their business. They also mentioned that some of these test suites were not ideal and could even contain bugs. Wendi confirmed similar issues with SolidSands, stating that it was difficult to use such suites in an open-source context.

  • C++ Standard and Compiler Conformance: Vlad discussed the historical decision not to include many C++ examples in the standard, which created long-term issues for language evolution and caused increasing disagreement among implementations, especially for newer features. They emphasized the RFC’s primary goal: to find a way to write and maintain a test suite that avoids decay, proposing tracking the git repository of the draft to reflect updates to the standard in updated tests. Vlad explained that compilers often do not conform to published standards due to subsequent defect reports and accepted papers, citing the “relaxed template template parameter” debacle as an example.

  • Private Compiler Qualification and Test Suite Quality: Wendi inquired how companies privately qualify their LLVM-based compilers given the lack of proof of conformance to standards. Vlad expressed skepticism about the quality of such private test suites, stating that Clang itself does not claim full conformance due to known unresolved issues, such as the incomplete implementation of the 2019 name lookup paper. Vlad also detailed challenges with “complete class context” rules, where compilers are expected to handle dependencies correctly but often do not due to performance concerns, making it difficult for external parties to fix.

  • Current Status of RFC: Wendi summarized the RFC’s status, confirming with Vlad that nothing had changed regarding the feasibility of in-house effort or external conformance test suites due to resource and licensing issues. Vlad stressed the need for ongoing communication with the core working group to correctly interpret the standard and identify parts considered “garbage” that require rewriting.

  • Safety Critical Rust Consortium: Pete discussed their work with the Safety Critical Rust Consortium, which aims to identify and address gaps in the Rust ecosystem and language for safety-critical applications. They explained that the consortium seeks to enable Rust’s use in more safety-critical industries and at higher safety criticality levels. Vlad raised concerns about the completeness of Rust’s specifications, noting that the specification for name lookup was unclear. Pete acknowledged that Rust, as a less mature language, had gaps in its documentation, but efforts were underway to improve the reference and FLS documents.

Actions

  • All participants: review description of the proposal of internal process update, share thoughts on the Discord channel, and add comments or modify directly in the FramaPad for improvement.

  • All participants: review the Clang conformance summary and send feedback and questions on the Discord channel. The Clang C/C++ WG members are open to answering our questions and concerns.

  • Wendi: try to arrange a conversation with Robert C. Seacord about Plum Hall’s test suite and update Vlad.

  • Wendi: plan contacting Plum Hall.


LLVM Qualification Group’s September Sync-Up Agenda

Hi all, hope you’re having a great summer!

For those who filled in the Participant Introduction & Membership form and indicated interest in being active contributors (Q3): our next sync-up is planned for next week (@petarj @CarlosAndresRamirez @evodius96 @petbernt @slotosch @YoungJunLee @ZakyHermawan).

Ahead of the call, I’d like to invite you to drop a quick message on Discord about the offline reviews we talked about last time (see also the minutes :memo:):

Additional quick topics for the agenda :thought_balloon: :

  • Introduction of @ZakyHermawan

  • Concerns or viewpoints about meeting transcriptions / AI summaries (Gemini)

  • @slotosch’s proposal for the LLVM Conference in Santa Clara

  • Eclipse SDV’s interest in the LLVM open qualification initiative + invitation to their community meetup in Japan

  • Poster at Innovations in Compiler Technology

  • Insights from a conversation with an ELISA project member on resources & funding

Given that time is short, I may also create separate Discord threads to keep these discussions moving more efficiently.

Thanks again to everyone who answered the form. @etomzak You’re warmly welcome in our calls, even if your availability is limited.

:pushpin: Small note: at the moment, @evodius96 is officially the only member from US/Canada time zones. @PLeVasseur is interested and expected to join sync-ups, so just letting you know for context.

Hi Wendi, due to travel I won’t be able to attend this upcoming call. If there is a better time for everyone, please don’t hesitate to make a change. Thank you!


Hi @evodius96, thanks for letting me know! Since you won’t be able to attend, I’ll cancel the upcoming call. We’ll keep the EU/Asia sync-up as the main source of updates this time, so I’d kindly ask you to have a look at the minutes afterward to stay in the loop. Looking forward to catching up with you in a future call once you’re back from your travels. Safe travels!

Handling non-technical topics asynchronously

Hi all,

@petarj @CarlosAndresRamirez @petbernt @slotosch @YoungJunLee @capitan-davide

cc: @evodius96 @PLeVasseur @etomzak @ZakyHermawan

For our upcoming sync-up (tomorrow), we have more items on the agenda than we can realistically cover in one hour. Here’s a draft of the presentation (the final version will be uploaded to GitHub after the sync-up):

To make sure we use our meeting time efficiently, and to give everyone a fair chance to contribute, I’d like to suggest that we handle some of the non-technical topics asynchronously on our Discord channel.

Topics for discussion in Discord:

By shifting these items to Discord, we’ll free up the sync-up call to focus on technical discussions (e.g. directions for a grey-box approach, tool usage confidence, evaluation of development processes).

Outcomes from Discord discussions will also be summarized here in our meeting minutes on Discourse so nothing is lost. Looking forward to your thoughts and contributions on Discord! :folded_hands:

Notes - September 2025

Participants

EU/Asia-friendly (Tuesday 2025/09/02, 5:30PM JST, 1h)

  • Carlos Ramirez (host)
  • Davide Cunial
  • Erik Tomusk
  • Florian Gilcher
  • Jorge Sousa
  • José Rui Simoes
  • Oscar Slotosch
  • Petter Berntsson
  • Vlad Serebrennikov
  • Wendi Urribarri (co-host - note taking & check time)
  • YoungJun Lee
  • Zaky Hermawan

Americas-friendly (Wednesday 2025/09/03, 6:00AM JST, 1h)

Cancelled - See https://discourse.llvm.org/t/llvm-qualification-wg-sync-ups-meeting-minutes/87148/8

Agenda

Refer to https://discourse.llvm.org/t/llvm-qualification-wg-sync-ups-meeting-minutes/87148/6

Links

Highlights

Non-technical topics

About note taking

  • No shared concerns from “core members”
  • One shared concern about AI writing down every word (from a non-member)
  • Gemini not enabled today

New self-nomination through the Google Form - Zaky’s presentation

  • EE student from Indonesia
  • Coming as an individual
  • Working with ISO/SAE 21434 (cybersecurity)

Oscar’s idea for the US LLVM 2025 conference (end of October)

  • Proposal to have a corner about compiler qualification at the sponsor exhibition
  • Discuss with people and attract interest in it
  • No conclusion on this point; to be taken to Discord for discussion

Technical topics

Wrap-up about direction and focus of the discussions since July

Reference functional safety standard

  • Members come from several industries (automotive, trains, robots, etc.), so different functional safety standards apply
  • The general framework for functional safety of E/E/PE systems is IEC 61508, so it makes sense to use it as first guidance
  • Since IEC 61508 is the parent of the other functional safety standards, their expectations around tool confidence are very similar

Need to provide evidence of tool usage

  • Three questions from the safety standards (see slides)
  • If the answers are Yes - Yes - No, then evidence must be provided
  • Comments about question 3:
    • Most safety standards are written for users, so it depends on how much they examine the “relevant outputs”
    • In the case of a compiler, relevant output → final executable
    • Thoroughly verifying the final executable is getting more and more difficult (complexity)
    • Many of the tools traditionally used by vendors are closed source; some open tools exist that can be used to check the relevant outputs
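
The Yes - Yes - No rule above can be sketched as a small decision helper. This is only an illustration of the rule as recorded in the minutes; the question wording in the parameter names is paraphrased (an assumption), not quoted from the slides.

```python
# Sketch of the "Yes - Yes - No" rule: tool-confidence evidence is needed
# when the tool is part of the safety lifecycle, a malfunction could
# introduce or mask errors, and the relevant outputs are not independently
# verified. Parameter names paraphrase the three questions (an assumption).

def evidence_needed(q1_used_in_safety_lifecycle: bool,
                    q2_malfunction_can_introduce_or_mask_errors: bool,
                    q3_relevant_outputs_verified: bool) -> bool:
    """Return True when evidence of tool confidence must be provided."""
    return (q1_used_in_safety_lifecycle
            and q2_malfunction_can_introduce_or_mask_errors
            and not q3_relevant_outputs_verified)

# Yes - Yes - No => evidence needed
print(evidence_needed(True, True, False))  # True
```

If the outputs *are* thoroughly examined (question 3 answered Yes), the rule yields False, which matches the comment above that everything hinges on how much users examine the relevant outputs.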

First target: Clang compiler

  • As the tool provider of Clang, we don’t know how it will be used
  • As a tool user, you can restrict your usage (for example, using it only for debugging, not for mass production)
  • How much users need to rely on the compiler depends on that usage
  • All the C++ parsing and semantic analysis is done by the Clang frontend
  • Language + Standard => Version changes are fast
  • Which flavor of C or C++?
    • C++ spec improves significantly
    • C spec is more rigorous
  • Suggestion of small scope:
    • Limit to the lexer? Spec-wise, it is simpler
    • Opinion 1: the usefulness of restricting to the lexer is limited; from a safety point of view, you might trust the lexer, but what about the rest? Requirements need to be associated with concrete use cases
    • Opinion 2: agreed; a valid use case for the lexer is needed

About effort for a conformance test suite:

  • Opinion 1: the amount of effort would be huge even for one version
  • Opinion 2: testing is laborious but not very hard
  • Opinion 3: for a good conformance test, the bottleneck is interpretation of the standard; testing against a specification is not the same for C/C++ as it is for Rust
  • Comment: commercial test suites are expensive, 40-45K Euro to qualify only one version of a compiler
  • What is generated is version dependent
  • About usage of Alive2:
    • Replace the Clang front-end with Alive2 front-end and generate Alive2 IR from source code?
    • Clarification: alivecc doesn’t replace Clang itself; it simply adds a pass plugin for verification at the IR transformation stage

Grey-box approach

  • Qualification is typically a black-box activity
  • Disadvantage: it has to be repeated for every combination (optimization options, etc.)
  • Grey-box approach could be useful, but one limitation is lack of specification of intermediate I/O
  • Example: specification of the IR
  • Identification of regressions in IR could be useful

Possibility of LTS?

  • Per this RFC, this will not happen - https://discourse.llvm.org/t/rfc-llvm-lts/84049
  • The labor involved can be massive
  • Interpreting what an LTS even means is very difficult
  • In the Rust community an LTS is 2 years
  • Better to do qualification work incrementally on top of the main version
  • Which C++ flavor to support is a big question

Example of funding

  • Fleet of students
  • Reasonable budget
  • Example: a university in Romania
  • “Top leadership” needed to guide the students

Selection of qualification methods

  • ISO 26262 proposes four qualification methods
  • Evaluation of the tool dev process is highly recommended only for ASIL A and ASIL B
  • It should not be used alone; Validation is needed to also cover ASIL C and ASIL D
  • Clarification:
    • Proposal is not about using Validation or Evaluation of dev process alone
    • Have a mix of both to cover all safety integrity levels with at least one highly recommended method
    • Many tool vendors already use a mix of these two methods for “certification”: validation by the vendor + audit by a certifying body
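
The proposed mix can be illustrated with a small sketch. The ASIL-to-method mapping below is a simplified assumption distilled from the discussion above, not a verbatim encoding of the ISO 26262-8 recommendation tables.

```python
# Simplified sketch (an assumption, not a verbatim encoding of ISO 26262-8):
# which qualification method is highly recommended at each ASIL, per the
# discussion above.
HIGHLY_RECOMMENDED = {
    "ASIL A": {"evaluation of the development process"},
    "ASIL B": {"evaluation of the development process"},
    "ASIL C": {"validation"},
    "ASIL D": {"validation"},
}

def minimal_method_mix() -> set:
    """Union of methods such that every ASIL is covered by at least one
    highly recommended method, i.e. the mix the clarification proposes."""
    mix = set()
    for methods in HIGHLY_RECOMMENDED.values():
        mix |= methods
    return mix

print(sorted(minimal_method_mix()))
```

The union comes out to exactly the two methods named above (evaluation of the development process + validation), which is why using either one alone would leave some safety integrity levels without a highly recommended method.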

Actions

Wendi:

  • share summary of topics with the group and the community
  • point out the possible ways to proceed
  • create threads for each subject on Discord? (easier to communicate)

All: participate in the open discussions (preferably on Discord, but Discourse is also fine)


Just a quick update: I’ve submitted a PR to update the documentation and add links to the August and September 2025 sync-up slide decks, which helped guide our recent discussions:

:link: https://github.com/llvm/llvm-project/pull/156897

The slides are currently hosted in llvm-project/docs/qual-wg/slides, but following feedback, I plan to migrate them to a more appropriate location (likely llvm-www) once confirmed with the community. Please feel free to check the PR for details, and let me know if you have any feedback!

LLVM Qualification Group’s October Sync-Up Agenda

:spiral_calendar: Calendar: Getting Involved — LLVM 22.0.0git documentation

Non-technical Topics

  • Docs updates (September) – summary of recent GitHub changes for LLVM Qualification Group Docs: #156897, #157804, #156184, #158842, #160021, #161113

  • New members (September) – welcome to @sousajo-cc @jr-simoes @ZakyHermawan :tada:

  • Decision-making in the WG (requested by @slotosch) – discussion on how we define consensus, use votes, and set time limits for open topics

Technical Topics

  • Upstream efforts & action plan (small deliverables) – build an initial roadmap based on the “confidence in the use of software tools” workflow

  • Tutorial / Introduction (proposed by @YoungJunLee) – outline for newcomer materials

  • Qualification focus areas (proposed by @petbernt) – first candidate areas and lightweight templates

Looking forward to seeing everyone at the October sync-up and continuing to shape our next steps together.