2026 EuroLLVM Developers' Meeting - Agenda

Hello all,

We are pleased to announce the agenda for the 2026 EuroLLVM Developers’ Meeting in Dublin, Ireland, 13-15 April. You can register for the 2026 EuroLLVM Developers’ Meeting HERE (early bird pricing is available until 1 March).

The program committee, led by @mshockwave, evaluated 100+ submissions; we thank them for their diligent work reviewing each talk submission and determining the program (the full committee is credited at the bottom of this post).

Keynotes

Capabilities Great and Small: CHERI, CHERIoT, and LLVM
Speaker(s): Owen Anderson

Support for the CHERI capability architecture and its embedded derivative, CHERIoT, is being upstreamed to LLVM. This talk explores what that means for LLVM developers, which parts of the toolchain are affected, and how these changes may interact with existing frontend, optimizer, backend, and linker work. We’ll take a look at work to date, the status of upstreaming, and open problems where community involvement could help.


The Testing Funnel: Validating LLVM at Scale
Speaker(s): Reid Kleckner

LLVM is foundational software, so the stakes for making changes to it are very high, and most downstream vendors of LLVM have elaborate testing pipelines to validate all of the properties they care about. However, because these private pipelines are often slow and disconnected from the community, detecting defects is expensive and leads to labor-intensive negotiations over misaligned technical requirements. This talk shares lessons from 15 years of LLVM continuous integration experience and advocates for shared upstream CI infrastructure that “shifts left” on testing, creating a better, more harmonious experience for contributors and consumers alike.


Technical Talks

Adding Nullability Checking and Annotations to Many Millions of Lines of Code
Speaker(s): Jan Voung

At Google, our team has been working on reducing null pointer dereference crashes in a huge and diverse C++ codebase by: (a) adopting the Clang nullability annotations and (b) implementing a flow-sensitive intra-procedural dataflow analysis in ClangTidy that verifies that code adheres to the contracts of the annotations. However, to get the most coverage, we need to introduce the annotations to millions of lines of code. To assist with that, we’ve developed an inter-TU annotation inference tool, and added inferred annotations through a “large-scale change”. This talk introduces how the ClangTidy verification and inference tools work, but also discusses the practical experience and the challenges we faced attempting to infer the annotations in “legacy” C++.
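
For readers unfamiliar with the annotations, here is a minimal sketch of the kind of contract involved, using Clang's _Nullable/_Nonnull nullability qualifiers; the verification and inference tooling described in the talk is not shown, and the functions find/print are invented for illustration:

```cpp
#include <cstdio>

int *_Nullable find(int key);    // contract: may return null
void print(int *_Nonnull p) {    // contract: must not receive null
  std::printf("%d\n", *p);
}

void use(int key) {
  int *p = find(key);
  // print(p);     // a flow-sensitive check would flag this: p may be null
  if (p != nullptr)
    print(p);      // OK: p is proven non-null on this path
}
```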


Bounds Checking with the Clang Static Analyzer: Improvements and Insights
Speaker(s): Donát Nagy

This talk presents my improvements to the checker security.ArrayBound, which became ready for production use in April 2025. In addition to the results, I will showcase an instructive issue that had plagued the older, prototype-quality bounds checker, and briefly discuss the planned generalizations that would extend my improvements to other related checkers.


CppInterOp: Interactive C++ as a Service and Advanced Language Interoperability
Speaker(s): Aaron Jomy

CppInterOp is a production-grade C++ interoperability library based on LLVM and Clang that provides compiler-as-a-service capabilities for seamless cross-language integration, leveraging the Clang-REPL interpreter for incremental compilation. Building on years of research in interactive C++ and automatic bindings generation, CppInterOp formalizes a stable, backward-compatible API that enables dynamic languages to harness the full power of modern C++ without sacrificing expressiveness or performance.


Effective Clang Tidy
Speaker(s): Tom James

This talk explores strategies that have worked well when writing custom clang-tidy checks for Simcenter STAR-CCM+, a complex real-world codebase. It is intended to be intermediate level, so some familiarity with clang-tidy is assumed. In particular, participants should be familiar with writing Clang AST matchers.
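
For orientation, here is a minimal sketch of the kind of AST matcher such checks are built from; the matcher and binding name are illustrative, not from the talk:

```cpp
// Matches calls to ::strcpy and binds the call expression so a check's
// diagnostic callback can report at its location.
#include "clang/ASTMatchers/ASTMatchers.h"
using namespace clang::ast_matchers;

auto strcpyCallMatcher =
    callExpr(callee(functionDecl(hasName("::strcpy")))).bind("call");
```

In a clang-tidy check, a matcher like this would be registered in registerMatchers and handled in the check's callback.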


Fast-math flags: a bag of issues and a handful of solutions
Speaker(s): Mikołaj Piróg

LLVM’s fast-math flags allow users to specify which floating-point transformations they want to permit. Unfortunately, their semantics are both underspecified and not entirely respected by LLVM. This talk covers a number of fast-math issues, gives an overview of efforts to address them, and outlines a future, more systematic approach to fast-math flags.
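
As a small, hedged illustration of the semantics in question (not an example from the talk): source-level controls such as -ffast-math, or Clang's floating-point pragma, translate into per-instruction fast-math flags in LLVM IR that license transformations like reassociation:

```cpp
// Compiling with -ffast-math, or enabling the pragma below, permits LLVM
// to reassociate the floating-point additions (e.g. into (a+b)+(c+d)),
// giving up bit-exact IEEE evaluation order in exchange for optimization.
float sum4(float a, float b, float c, float d) {
#pragma clang fp reassociate(on)
  return a + b + c + d;
}
```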


Finding Injection Vulnerabilities: Improvements of the Taint Analysis of the Clang Static Analyzer
Speaker(s): Daniel Krupp

The Clang Static Analyzer provides a configurable taint-analysis checker, optin.taint.GenericTaint, and a few specialized taint checkers (in the optin.taint group) that can identify potential improper-input-validation security vulnerabilities. Although promising, the current implementation is still in its early stages, and its limitations prevent efficient industrial use. We identified key issues after taking measurements on both the synthetic Juliet test suite and real-world projects. Based on these findings, we propose improvements to the current solution, which we have prototyped and evaluated.


Floating-Point Types in MLIR: Infrastructure, New Types and Dialect Design
Speaker(s): Matthias Springer

This technical talk summarizes recent improvements to MLIR’s floating-point type infrastructure, focusing on how to represent and lower the rapidly growing zoo of low-precision and block-scaled formats used in modern ML workloads.

It introduces the “FloatTypeInterface”, explains the interaction with LLVM’s “APFloat” and “fltSemantics”, and shows step-by-step how to add new floating-point types, from extending “APFloat” to defining lowering rules and dialect design for “special” FP types across high-level and low-level dialects.

The talk also covers the new “arith-to-apfloat” infrastructure for software emulation of low-precision FP arithmetic on CPUs, discusses current limitations of adding FP types without patching LLVM, and outlines future directions for more extensible, vendor-friendly floating-point type systems in MLIR.


Lighthouse: infrastructure for end-to-end MLIR compilers and testing
Speaker(s): Renato Golin

Last year, a new project was added to the LLVM family: Lighthouse. Its main purpose is to guide the development and testing of MLIR-based compilers. Like the LLVM test-suite, it should be a common ground for validating upstream assumptions about code, IR, and dialects. At the same time, it enables building specific compilers in minutes, using the evolving Python API and MLIR’s Python bindings. In this talk, we’ll show the project’s main structure, including its components and how to use them to build a simple compiler. We’ll then show the infrastructure that uses those components to validate assumptions in MLIR (canonical forms, invariants, applicability of transforms and passes, correctness tests, etc.), and how you can create your own on top of that. Finally, we’ll provide a number of pipeline examples, going from generic PyTorch models to performant execution on various targets.


MLIR-iteration cycle goes brrr: defining ops and rewrites in Python
Speaker(s): Rolf Morel

This is a tutorial on MLIR’s new Python bindings for (1) defining dialects and ops via an embedded op-definition DSL, and (2) writing rewrites (passes, pattern rewrites, and transform ops) that integrate with the existing infrastructure. These features enable a new iteration cycle for developing MLIR compilers: we can now do rapid prototyping of dialects and rewrites in a high-level language without having a compiler in the loop!


Optimising small AArch64 cores: stories from the trenches
Speaker(s): Ties Stuij

Optimizing LLVM for AArch64 has in the past mostly concentrated on big cores, which prioritize performance over efficiency. As the market has become more interested in smaller AArch64 cores outside of mobile phones, we have been putting effort into optimizing LLVM for these more constrained cores, which flirt with the embedded space. In this talk we discuss why LLVM has left performance on the table for these smaller cores, and we will give some examples of how we improved this. We will also touch on how benchmarking for these cores differs from benchmarking the bigger AArch64 cores.


Rust or CHERI?
Speaker(s): Edoardo Marangoni

Some Ws of Rust-on-CHERI(oT): what it means, who is doing it, where you can find it and, especially, why we are doing it.


Scaling Certified Instruction Selection For LLVM IR Through Bitblasting
Speaker(s): Sarah Kuhn, Luisa Cicolini, Osman Yaşar

Instruction selection is responsible for turning high-level languages into efficient, reliable machine code. Yet, today’s LLVM backends often introduce subtle bugs through complex optimizing rewrites which are coupled with code generation passes. Fully verified backends like CompCert’s avoid these issues at the cost of heavy, complex manual proofs. We present an LLVM instruction selector verified in Lean, which benefits from a small trusted base, strong automation, and relies on authoritative RISC-V semantics. Using Sail’s new Lean backend, we formalize the RISC-V ISA and automatically verify real LLVM instruction selection and optimization patterns, exploiting Lean’s bitvector library and its verified bitblaster. Our selector achieves performance comparable to LLVM’s GlobalISel (11.9% more cycles estimated with MCA, geomean) while providing machine-checked correctness. This demonstrates that practical, trustworthy verification can scale to modern, rapidly evolving compiler ecosystems.


The LLVM Release Process, a status update
Speaker(s): Tobias Hieta, Cullen Rhodes, Douglas Yung

LLVM’s release process is an ever-evolving chaos machine. Since our last talk on it in 2023, we have shipped six releases and made a number of changes to how we plan, cut, and stabilize a release. This talk is a practical status update on what changed, why it changed, what works well today, and what still creates stress for release managers and contributors. I’ll also introduce our two new Release Managers and close with ideas for where the process should go next, with the goal of fewer surprises when “the release is coming.”


Toward A More Declarative InstCombine: Generalization & Parametric Bitvector Algorithms
Speaker(s): Siddharth Bhat

LLVM contains thousands of bitwidth-dependent rewrites that are hard to maintain and reason about. We introduce new parametric bitvector algorithms that automatically generalize these rewrites across all widths. By applying a mixed unary–binary encoding and finite-state reasoning, our solver lifts concrete LLVM test cases into true width-independent identities, recovering parametric rewrites from the fixed-width rewrites in LLVM’s test suite. This moves LLVM toward a declarative InstCombine specification, where rewrite rules are uniform, provably correct, and mechanically derived.


Tracking Warnings at Scale: Extending Clang Diagnostics to Support Issue Baselining and Backslide Prevention
Speaker(s): Usama Hameed

Compiler warnings are an effective tool for developers to catch potential security, compatibility, and correctness problems. However, it is challenging to ensure that specific warnings are actually enabled across codebases and that diagnostics are addressed over time. This talk describes how we have extended Clang’s diagnostic infrastructure to enable developers to build a policy enforcement system that supports issue baselining (i.e., reporting only newly introduced warnings) and backslide prevention (informing project maintainers when a warning has accidentally been turned off).


What Compiler Implementers and Language Designers Need to Know About Pointer Authentication
Speaker(s): Oliver Hunt

In this talk, which is targeted at compiler implementers and language designers, we will give a high-level overview of the security benefits of pointer authentication and why LLVM-based compilers should adopt it.


Writing a Formal Execution and Memory Model for Execution Synchronization Primitives on AMD GPUs
Speaker(s): Pierre van Houtryve

Overview of ongoing efforts to develop and document a formal execution model for the execution synchronization primitives (barriers) of the AMDGPU backend, and how they integrate with the LLVM and AMDGPU target-specific memory models. The talk will also cover the motivation for this work, the benefits for users and developers, and the challenges we faced and are still facing.


clang-reforge: Automatic whole-codebase source code rewriting tool for security hardening
Speaker(s): Jan Korous

We’re building clang-reforge, an automatic source code rewriting tool that enables adoption of bounds-safety in large existing C++ codebases. clang-reforge analyzes source code to identify unsafe pointer operations and capture pointer flow. It replaces built-in pointers with bounds-safe types in pointer-flow segments from allocation sites to unsafe operations, such as pointer arithmetic. We have a working internal prototype and are now rebuilding it on top of Clang’s Scalable Static Analysis Framework.
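
To make the kind of rewrite concrete, here is a hypothetical before/after sketch; clang-reforge's actual bounds-safe types are not specified here, and std::span stands in as an illustrative replacement:

```cpp
#include <span>

// Before: a built-in pointer flows from the allocation site to an indexing
// operation, carrying no bounds information.
int sum_before(const int *data, int n) {
  int s = 0;
  for (int i = 0; i < n; ++i)
    s += data[i];               // unsafe: nothing ties data to n
  return s;
}

// After (illustrative): the pointer-flow segment carries its bounds, so
// accesses can be checked or proven in range.
int sum_after(std::span<const int> data) {
  int s = 0;
  for (int v : data)            // the bounds travel with the type
    s += v;
  return s;
}
```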


rocMLIR: High-Performance ML Compilation for AMD GPUs with MLIR
Speaker(s): Pablo Antonio Martinez

This talk presents rocMLIR, a kernel generator for AMD GPUs using MLIR. We present the compilation flow from high-level IR (TOSA and Linalg dialects) to low-level code generation using downstream and upstream MLIR dialects (AMDGPU and ROCDL). We focus on implementing MI300X/MI350X features in MLIR, including double-rate MFMAs, DirectToLDS, and support for MXFP4/FP4 data types. We also cover application-specific optimizations such as SplitK for GEMMs and KV Cache for attention, along with fusion strategies.


Student Technical Talks

Accelerating Pass Order Auto-tuning via Profile-Guided Cost Modeling
Speaker(s): Bingyu Gao

LLVM pass ordering auto-tuning can outperform standard -O3, but it is often hindered by an enormous search space and the high overhead of hundreds of dynamic measurements. This talk presents an efficient auto-tuning framework that minimizes expensive measurements using a profile-guided relative cost model and calibrated beam search. Evaluation on cBench shows an average 10.46% speedup over -O3 with only 20 dynamic measurements, significantly accelerating the search for optimal pass sequences.


GPU optimizations, and where Rust knows more than LLVM
Speaker(s): Marcelo Domínguez

In this talk we compare the performance of Rust’s std::offload interface on various benchmarks with C++ OpenMP, CUDA, and ROCm implementations. We show the impact of a new set of LLVM-IR optimizations, and the performance difference between “safe” and “unsafe” Rust. We briefly introduce two aliasing models that are under consideration in the Rust community, and how higher-level Rust alias info can be combined with our lower-level LLVM-IR opt pass.


IR2Vec Python Bindings: Native Integration for Pythonic ML Workflows
Speaker(s): Nishant Sachdeva, S. VenkataKeerthy

IR2Vec is a widely adopted framework for generating vector embeddings from LLVM IR to enable machine-learning–driven compiler optimizations. This work introduces native Python bindings for IR2Vec using pybind11, enabling seamless and efficient integration with Python-based ML ecosystems such as PyTorch and TensorFlow. By replacing subprocess-based CLI invocation with a direct programmatic interface, the bindings eliminate process overhead, provide inbuilt C++–Python type conversion, and support robust exception handling. The implementation and usage are demonstrated through practical embedding-generation examples. The project is currently under active development for upstream integration into the LLVM monorepo, with multiple pull requests already accepted, and is available in beta form via TestPyPI.
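
As a hedged sketch of the binding approach (the module and function names below are illustrative, not IR2Vec's actual API):

```cpp
// Illustrative pybind11 module exposing a C++ embedding routine directly to
// Python, replacing a subprocess/CLI round-trip with an in-process call.
#include <pybind11/pybind11.h>
#include <pybind11/stl.h>  // automatic std::vector <-> Python list conversion
#include <string>
#include <vector>

// Hypothetical C++ entry point; stubbed so the sketch is self-contained.
std::vector<double> embedFunction(const std::string &irPath,
                                  const std::string &funcName) {
  return {0.0, 0.0, 0.0};
}

PYBIND11_MODULE(ir2vec_demo, m) {
  m.def("embed_function", &embedFunction,
        "Return the embedding vector for one function in an LLVM IR file");
}
```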


Tutorials

All About Alias Analysis
Speaker(s): Nikita Popov

This tutorial introduces “alias analysis”, the fundamental building block for most memory optimizations. It covers high-level concepts and usage of alias analysis, as well as important aspects of its implementation.
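
A classic motivating example, for context rather than from the tutorial itself: whether two pointers may alias decides whether a loaded value can be reused across a store.

```cpp
int twice(int *a, int *b) {
  int x = *a;
  *b = 42;        // may alias *a, so the value read above could be stale
  return x + *a;  // needs a reload unless a and b are proven not to alias
}

int twice_restrict(int *__restrict a, int *__restrict b) {
  int x = *a;
  *b = 42;        // restrict promises disjoint memory: cannot touch *a
  return x + *a;  // alias analysis lets the optimizer reuse x here
}
```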


Creating a runtime using the LLVM_ENABLE_RUNTIMES system
Speaker(s): Michael Kruse

The LLVM build system has a mechanism designed for building runtime libraries targeting the platform that the compiler’s output (be it from Clang, Flang, Rust, etc.) will run on. For instance, since Clang is intrinsically a cross-compiler, such runtime libraries need to be compiled for each targeted platform. The mechanism originates from splitting target-side runtimes from host-side subprojects such as Clang, Polly, and BOLT.

In this tutorial we will create a new runtime to illustrate the workings of the often misunderstood LLVM_ENABLE_RUNTIMES system. Few contributors will ever feel the need to create a new runtime from scratch, but understanding the system helps in identifying and fixing build issues for configurations that no CI is actively testing. The talk will go into the details of how multi-stage bootstrapping, cross-compilation, multilib, GPU offloading, multi-compiler support, and inter-dependent runtimes are designed to work. We will also explore configuration options and which corners could still be improved.


HIVM: MLIR Dialect Stack for Ascend NPU Compilation
Speaker(s): Tarasov Vladislav

Huawei Ascend NPUs combine DaVinci AI cores with a rich memory/synchronization hierarchy and, on newer generations, a SIMD+SIMT execution model, making performance-oriented compilation challenging. We present HIVM, an open-source family of MLIR dialects that lowers PyTorch/Inductor → Triton → MLIR (HIVM) → LLVM IR, enabling Ascend-specific optimizations such as layout assignment/propagation, vector intrinsic selection/legalization, and explicit DMA/transfer scheduling with synchronization. The pipeline ultimately targets the BiSheng LLVM-based backend to produce executable code for Ascend chips. The talk walks step-by-step through the key IR levels and transformation passes, serving as a practical baseline for developers building MLIR toolchains for Ascend.


Hands-on Using Clang as a library
Speaker(s): Aaron Jomy, Vipul Cariappa, Vassil Vassilev

This tutorial teaches how to use Clang as a library to build a C++ REPL with incremental compilation by splitting translation units into partial compilation steps. We demonstrate how to create a compiler-as-a-service that enables programmatic instantiation and invocation of C++ template functions from client code. Finally, we integrate these components with the Python runtime to examine practical cross-language interoperability.


Implementing C++26 std::simd with LLVM: A Layered, Compiler-First Approach
Speaker(s): Daniel Towner

The C++26 standard introduces std::simd for portable data parallelism. We present a complete implementation built on LLVM using a layered architecture: a minimal base layer interfaces with LLVM’s SIMD capabilities, while higher-level features build on this foundation. Multi-target support emerges naturally by building upon LLVM’s own architecture support, and a dispatch tag system isolates target-specific code to minimal locations. This design has proven particularly effective for x86, cleanly handling the many ISA variants (SSE, AVX, AVX-512, etc.), and should extend naturally to any SIMD target LLVM supports. As the compiler’s support improves, the library improves too. We’ll share our architectural patterns, performance results, and areas where LLVM could be enhanced to enable better code generation. The authors have been involved in the C++ standardisation process for this library, and our goal is to release it as open source.
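
For orientation, a hedged sketch of the programming model; it uses the Parallelism TS v2 spelling (std::experimental::simd, shipped in GCC's <experimental/simd>), which C++26 std::simd closely follows, though the standardized names differ in places:

```cpp
#include <experimental/simd>
#include <cstddef>
namespace stdx = std::experimental;

// Dot product using whole-register SIMD operations plus a scalar tail.
float dot(const float *a, const float *b, std::size_t n) {
  using V = stdx::native_simd<float>;   // lane count matches the target ISA
  V acc = 0.0f;                         // per-lane partial sums
  std::size_t i = 0;
  for (; i + V::size() <= n; i += V::size()) {
    V x(a + i, stdx::element_aligned);  // load V::size() lanes
    V y(b + i, stdx::element_aligned);
    acc += x * y;
  }
  float s = stdx::reduce(acc);          // horizontal sum across lanes
  for (; i < n; ++i)                    // remainder elements
    s += a[i] * b[i];
  return s;
}
```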


Panel

Clang and LLVM in Modern Gaming Platforms
Speaker(s): Tobias Hieta, Felix Klinge, Chris Bieneman, Jeremy Morse, Nicolai Hähnle

A moderated panel with AMD, Intel, Sony, and Microsoft will examine how Clang/LLVM power real-world game production, from platform SDKs and build pipelines to shader compilers and security tooling, and identify where upstream collaboration can have the biggest impact.


Quick Talks

Anatomy of Tiling and Vectorizing linalg.pack and linalg.unpack
Speaker(s): Ege Beysel

linalg.pack and linalg.unpack enable explicit data-tiling and layout transformations in MLIR, but their use in data-tiled compilation flows raises subtle questions about alignment, legality, and vectorization. This talk explores how these operations interact with MLIR’s tiling and vectorization infrastructure, focusing on alignment constraints, masking semantics, and performance implications. Using an end-to-end data-tiled matmul example, the talk highlights practical guidance and performance gains for developers building high-performance tensor pipelines.


Apple GPU Support in Mojo
Speaker(s): Kolya Panchenko

Mojo’s powerful compile-time meta-programming system and unified syntax make it exceptionally well-suited for heterogeneous accelerator programming, with proven success on CUDA and ROCm platforms. As many developers work on Apple products (such as iMacs, MacBooks, Mac Studios, etc.), adding Apple GPU support to Mojo is a natural next step. However, unlike CUDA and ROCm, which have open-source compiler toolchains in upstream LLVM, Apple’s GPU compiler stack is proprietary. We will present the chosen design and discuss the challenges of integrating Apple GPU support into Mojo.


Attack of the Clones: Speeding Up Coroutine Compilation
Speaker(s): Artem Pianykh

Compiling coroutines with full debug information shouldn’t be dramatically slower than with line tables — but we found CoroSplitPass running over 100x slower, adding minutes to compilation time. The cause traced back to LLVM’s function cloning, where processing debug info metadata was O(Module) rather than O(Function). This talk covers the investigation, the upstream patches, and how the fix ended up benefiting all users of the function cloning API.


Challenges in binary rewriting: enabling BOLT to optimize CFI-hardened binaries
Speaker(s): Gergely Balint

BOLT is increasingly adopted as it can provide additional performance uplift on top of LTO+PGO-optimized binaries. At the same time, AArch64 binaries are commonly deployed with Control Flow Integrity features (PAC and BTI) enabled. This creates a practical challenge: until recently, BOLT couldn’t optimize such binaries. It would either crash or, worse, emit incorrect binaries that crash at runtime. The talk introduces our work on enabling these features and describes key engineering challenges, including how implementing such features differs from their compiler counterparts.


LLVM JIT — Upcoming Challenges and Opportunities
Speaker(s): Lang Hames

LLVM’s JIT can now run arbitrary real-world applications, as demonstrated by Xcode’s Previews feature. Despite this success, enormous opportunities for improvement remain—especially in performance, memory consumption, tooling, and optimization. This talk will describe the most promising opportunities in these areas, sketch a roadmap for tackling them, and discuss how the community can collaborate to accelerate progress.


Leveraging BOLT to improve data prefetching for AArch64 binaries
Speaker(s): Shanzhi Chen, Wei Wei

The post-link optimizer BOLT provides a number of binary-level optimizations that mostly focus on code layout and effectively reduce front-end stalls in the Top-Down performance analysis view. In addition, we found that BOLT can also be a handy tool for emitting prefetch instructions in binaries to alleviate back-end stalls resulting from cache misses. In this talk, we cover how to leverage BOLT to improve data prefetching for AArch64 binaries. A new pass is added to BOLT to provide prefetching support for different variations of AArch64 load instructions, and the existing dataflow analysis framework in BOLT is enabled for AArch64 to provide register liveness information for prefetch addresses. In addition, the ARM SPE-based profiling technique is employed to provide valuable insights into memory operations and to complete the overall profile-guided data prefetching optimization in BOLT.


Mojo Compile-time Interpreter in MLIR
Speaker(s): Weiwei Chen

Mojo supports powerful compile-time meta-programming that helps unlock performance on heterogeneous accelerators by enabling generic abstractions across different targets. Almost any runtime Mojo code can be moved to compile time, trading compilation time for runtime performance, while constants evaluated at compile time can be materialized into runtime values. In this talk, we will dive into the architecture of Mojo’s MLIR-based compile-time interpreter, which is at the core of materializing generic code into concrete form during compilation. We’ll share implementation insights, performance challenges, and lessons learned, while fostering discussion on building meta-programming compilers with MLIR.


Self-Contained, Target-Specific GEMM Code Generation in MLIR
Speaker(s): Adam Siemieniuk, Renato Golin, Rolf Morel

We present an MLIR-based approach for generating target-specific, highly optimized GEMM kernels that is fully self-contained within the LLVM/MLIR compiler infrastructure and does not rely on external libraries such as LIBXSMM, Intel MKL, or oneDNN. Building on prior work in the TPP-MLIR compiler, we upstream FP32, BF16, and INT8 code-generation techniques into MLIR and propose a transform schedule that combines existing and newly upstreamed passes to lower matmul, batch matmul, and batch-reduce matmul operations into optimized kernels, achieving performance competitive with the LIBXSMM library. As a future work, we plan to leverage auto-tuning techniques to select efficient tile sizes based on hardware characteristics and problem dimensions.


State of Lifetime Safety in Clang
Speaker(s): Utkarsh Saxena

This talk provides a status update on the evolution of Clang’s intra-procedural, flow-sensitive lifetime analysis, building upon the “Origins and Loans” model introduced at the 2025 US LLVM Developers’ Meeting. We outline our strategies for scaling this analysis to Google’s codebase of 1 billion lines of C++, focusing on lifetime annotation adoption through automated inference and targeted suggestions. We share lessons learned from internal rollouts, balancing bug detection against compile-time regressions and false positives. We highlight our future roadmap for subprojects like iterator invalidation and conclude with a call for participation, inviting new contributors to join our efforts in advancing temporal memory safety.


Tracking Operations Through MLIR Pass Pipelines Using Source Locations
Speaker(s): Florian Walbroel

This talk presents a source-location-driven approach for tracking the evolution of MLIR operations across deep pass pipelines. Motivated by real-world optimization work on quantized convolutions in IREE, it shows how preserved source locations can be used to reconstruct operation lineage across IR stages, enabling systematic reasoning about transformation effects. The talk surveys source location semantics under common MLIR transformations and demonstrates a reusable Python-based tool that supports interactive, cross-stage operation tracking for improved debuggability in large MLIR programs.


Lightning Talks

Compact Unwind Information for ELF
Speaker(s): Alexis Engelke

ELF unwind information is encoded as DWARF bytecode for most architectures, which results in a large size overhead and is complex to interpret, precluding its use in e.g. tracing profilers. This talk will present a compact format for asynchronous unwind info that can accurately represent almost all functions generated by LLVM -O3 for x86-64 with a substantially smaller size. We will also discuss portability to other architectures, differences from other unwinding/tracing formats, and interoperability with other toolchains.


Coverage-directed codebase reduction for the procedural generation of LIT tests
Speaker(s): Freya Fewtrell

The LIT test suite, whilst extensive, still leaves many parts of the LLVM codebase uncovered. Compiling large real-world C++ code from Sony Interactive Entertainment’s downstream integration test suite routinely exercises code paths that the upstream LIT suite never reaches. To shift this testing leftwards, we have built a tool that uses coverage-directed reduction to automatically turn such high-coverage sources into minimal, self-contained LLVM IR fragments suitable for inclusion as upstream LIT tests. This lightning talk describes how the tool works, the challenges encountered when trying to automate test reduction at scale, and whether increasing coverage alone is sufficient motivation for new upstream LIT tests.


Extending Lifetime Safety: Verification of the [[clang::noescape]] annotation
Speaker(s): Abhinav Pradeep

Clang features an intra-procedural, flow-sensitive lifetime analysis designed to catch temporal safety errors like use-after-free, use-after-scope, and use-after-return. The talk presents work on leveraging this analysis to verify [[clang::noescape]] annotations. This effort focuses on applying the “Origins and Loans” model to enforce memory safety guarantees that were previously unverified by the compiler.
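
A minimal sketch of the contract being verified; the global and functions here are invented for illustration:

```cpp
// [[clang::noescape]] promises that the pointer does not outlive the call.
void consume([[clang::noescape]] int *p) { (void)*p; }

int *escaped;  // hypothetical global, to show what an escape looks like

void bad([[clang::noescape]] int *p) {
  escaped = p;      // the kind of escape the lifetime analysis would flag
}

void caller() {
  int local = 7;
  consume(&local);  // safe: the noescape contract permits stack addresses
}
```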


Highlighting function names in LLDB backtraces
Speaker(s): Michael Buch

C++ backtraces tend to be hard to read because function names are hidden amongst many layers of namespaces, template arguments and function parameters. In the debugger, a user often wants to simply get a quick overview of the function callstack. But traditionally this has been difficult to decipher. To improve readability, LLDB recently gained the ability to selectively hide or format various parts of C++ function names. This talk describes how we implemented this by extending the LLVM demangler and how other language plugins can take advantage of this infrastructure.


Improving DemandedBits Analysis for Shift Operations in LLVM
Speaker(s): Panagiotis Karouzakis

The DemandedBits analysis is utilized in some optimization passes, such as vectorization and dead code elimination; a similar analysis is employed in InstCombine. We improve DemandedBits reasoning for all basic shift operations, enabling more precise bit-level information propagation. Our improvements reduced code size, enabled additional loop-invariant code motion, and led to more instruction-level simplifications.
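
A minimal example of the reasoning involved (illustrative, not from the talk): a shift narrows which bits of its operand are demanded, which can expose dead computation.

```cpp
#include <cstdint>

// Only bits 8..15 of t survive the shift-and-truncate, so a bit-tracking
// analysis such as DemandedBits can prove the OR with 0xFF000000 (which
// only touches bits 24..31) contributes nothing and is removable.
uint8_t middleByte(uint32_t x) {
  uint32_t t = x | 0xFF000000u;
  return static_cast<uint8_t>(t >> 8);
}
```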


Reproducible Large-Workload Recipes for BOLT with Nixpkgs
Speaker(s): Peter Waller

BOLT needs large, realistic binaries for integration testing, but distributing binaries directly creates governance and supply-chain problems and makes it hard to reproduce the exact toolchains and build flags that shape BOLT behaviour. We show our pinned, auditable Nixpkgs build recipes (including emit-relocs Chromium) and discuss how this enables teams to reproduce identical workloads, generate provenance/SBOMs, and compare results consistently across machines and CI.


STRTAB Hash & Slash: Reducing STRTAB size by hashing its entries
Speaker(s): Vy Nguyen

The string table (STRTAB) accounts for a significant portion of object-file overhead at Google, with long mangled names, such as those from Protos, frequently reaching 35% of the total file size. Optimizing this space is key to faster builds and leaner binaries. In this talk we present an approach to reducing the string table by hashing its entries, demonstrating a reduction in overall binary size.


Unifying Host and GPU Compilation with ClangIR
Speaker(s): Konstantinos Parasyris

Clang’s heterogeneous compilation pipeline separates CPU and GPU code early, limiting cross-target analysis and optimization. This talk presents ongoing work on extending ClangIR (CIR) to enable early merging of host and GPU device code within a unified intermediate representation. By emitting and combining CIR for both targets, we preserve GPU-specific semantics while enabling joint analyses and coordinated optimizations across host–device boundaries. The presentation discusses the required frontend and driver changes and highlights the opportunities unlocked by unified host–device reasoning.


Using MLIR Linalg Category Ops for Smarter Compilation
Speaker(s): Javed Absar

The Linalg dialect of MLIR recently added category ops as an intermediate abstraction between the two existing forms: named ops and generic ops. A mechanism (-linalg-morph-ops=-to-) to move between these forms was also introduced. When something new appears, adoption is often slow due to existing workflows and lack of awareness. This talk will: (a) motivate this additional representation, (b) explain what now exists and how to benefit from it, and (c) show how category ops can help certain compilation flows.


What’s new in LLDB on Windows
Speaker(s): Charles Zablit

We’ve been improving LLDB and LLDB-DAP on Windows, adding key features like STDIO support, Unicode handling, better Python integration, and switching to an open-source PDB implementation to bring Windows debugging up to par with other platforms.


Posters

A CPU Autotuning Pipeline for MLIR-IREE
Speaker(s): Chun Lin Huang, Jenq Kuen Lee

We present an autotuning pipeline for IREE’s LLVM-CPU backend that enables Transform Dialect–driven, compile-time multi-level tiling with CPU-specific constraints. In single-dispatch experiments, our constrained tuning flow achieves up to 20% speedup. We also outline next steps toward joint tuning of per-layer sub-FP8 precision variants and tiling using an XGBoost-guided, budgeted evaluation strategy under a quality floor.


A stride towards generating segment accesses in RVV
Speaker(s): Athanasios Kastoras

We present an unconventional way of emitting RVV segment access instructions based on a loop-vectorize pass that emits strided accesses instead of gathers and scatters. We implement a pass that groups consecutive strided accesses, represented as VP intrinsics, and, if feasible, lowers them to RVV segment-instruction intrinsics. We then reuse the analysis part of this pass to cost groups of recipes as a single segment instruction, which enables the vectorization of loops that would otherwise have been deemed unprofitable.


Adding Compilation Metadata To Binaries To Make Disassembly Decidable
Speaker(s): Daniel Engel

Once a program has been compiled into a binary, it is nigh impossible to lift it back into a higher-level representation that is well-suited for analyses, instrumentation, and patching. Disassemblers run into undecidable problems such as “which bytes are instructions?” or “how are the data sections structured?”. Producing a representation that can be recompiled correctly is even harder. Standard debugging formats such as DWARF do not contain enough information to make this task possible. However, at some point during the compilation process, the compiler knew all of this information. In this talk, we explore which information can be extracted from the standard ELF format, which information Clang can already emit, and which remains inaccessible.


Confirming the Impact of Warning Message Quality in the Clang Static Analyzer
Speaker(s): Kristóf Umann

The Clang Static Analyzer has enjoyed almost two decades of industrial adoption, with more and more focus on its usability. Prior research indicated that, among other things, warning message quality is a leading source of dissatisfaction with static analysis tools; developers of the Clang Static Analyzer tackled this, but never conclusively proved the benefits of those changes. This talk fills that gap by presenting a method for measuring warning message quality through a human experiment. We sent out surveys in three stages to fine-tune our methodology, with the final one receiving 64 responses from regular static analysis users. We were able to confirm many long-suspected but never-confirmed theories circulating among Clang Static Analyzer contributors: the value of summarizing functions, trimming bug reports, and simplifying low-level code. Based on these results, we also created and landed a bug report improvement, which has been available since Clang 19.0.0.


Floating-Point Datapaths in CIRCT via FloPoCo AST Export and flopoco-arith-to-comb lowering
Speaker(s): Louis Ledoux

This work bridges the gap between floating-point arithmetic in MLIR and circuit-level hardware representations in CIRCT. While many accelerators are dominated by floating-point datapaths, existing flows either defer floating-point realization to HLS-oriented dialects or rely on external generators, limiting compiler visibility at the stage where hardware-specific trade-offs are most naturally expressed. The approach restructures FloPoCo to expose arithmetic hardware as explicit combinational graphs and introduces a new MLIR lowering pass that progressively translates floating-point regions into CIRCT-compatible datapaths. Multiple lowering strategies are supported, ranging from IEEE-754–preserving operator mappings to fused and specialized datapaths that reduce rounding, area, and numerical error. As a concrete result, a floating-point kernel extracted from a PyTorch LLaMA layer is compiled end-to-end to a 1.5 mm² chip in a 130 nm process node.


MemorySSA-Based Reaching Definitions for IR2Vec Flow-Aware Embeddings
Speaker(s): Nishant Sachdeva, S. VenkataKeerthy

IR2Vec is a widely adopted framework for generating program embeddings from LLVM IR, supporting both Symbolic and Flow-Aware inference modes. The Flow-Aware mode captures data dependencies by computing reaching definitions over memory operations, but its original implementation relies on a custom control-flow graph traversal, effectively reimplementing analyses already available in LLVM. This work replaces the custom reaching-definitions logic with LLVM’s MemorySSA framework, yielding more semantically rich embeddings while significantly simplifying the implementation. By leveraging MemorySSA’s def-use chains, the new approach correctly handles complex memory behaviors including pointer indirection, loop-carried dependencies, structured data access, and dynamic allocation. Through detailed IR case studies, we demonstrate how the MemorySSA-based design eliminates spurious dependencies and enables richer, more accurate flow-aware embeddings.


Reconstructing Linear Algebra Semantics in LLVM IR
Speaker(s): Mriganka Bezbaruah, Akshay K, Prachi Pandey

Many real-world C/C++ programs still implement linear algebra using explicit loop nests, especially in legacy and autogenerated code, causing compilers like LLVM to miss high-level algebraic intent and lose performance. This poster presents an LLVM-focused approach for identifying linear algebra semantics directly from compiler IR and lowering these loop nests to optimized BLAS calls without requiring source code changes. Using evidence from a production compiler prototype, the work shows that this semantic identification and lowering can recover performance comparable to manual BLAS usage. The poster discusses correctness constraints, design trade-offs, and future directions, highlighting how LLVM can complement existing loop optimizations with library-aware semantic identification.


2026 EuroLLVM Developers’ Meeting Program Committee

Alexis Engelke
Anders Waldenborg
Anupama Chandrasekhar
Chris Jackson
David Spickett
Diana Picus
Divya Shanmughan
Gabor Horvath
Giacomo Castiglioni
Hans Wennborg
Jonas Devlieghere
Kai Nacke
Kunwar Grover
Lorenzo Chelini
Luke Lau
Markus Böck
Min Hsu
Oren Benita Ben Simhon
Paul Kirth
Prabhu Karthikeyan Rajasekaran
Ramkumar Ramachandra
Vassil Vassilev
Volodymyr Turanskyy
Wendi Urribarri
