[RFC] Adding SFrame support to llvm

MaskRay · October 27, 2025, 6:23am

Regarding the technical review questions: Asking for concrete data on performance overhead, memory footprint, and how kernel measurements translate to userspace is a standard part of the review process. These foundational questions help ensure the proposal is well-supported by empirical evidence.

Regarding the performance data discussion: The kernel-space measurements with ORC are valuable, but there’s a critical gap in our understanding. The kernel environment (ORC+omitfp vs FP+no-omitfp) differs fundamentally from userspace, where this RFC proposes SFrame adoption. Direct userspace performance measurements would strengthen the case by avoiding the need to extrapolate kernel savings to a different environment.

On LLD robustness: We should improve testing and maintain high standards for what we upstream.

AutoFDO uses LBR, which has a limited depth (32 on Skylake). Do you use FP for other stack trace requests—such as backtrace() and non-SamplePGO profiling?

I believe there may be some misalignment between expectations and what SFrame realistically offers. A few observations:

linux-perf is waiting for V3
Some developers explored enabling arm64 livepatch with SFrame, but Song Liu has since implemented a frame-pointer-based alternative
No Linux distribution has adopted SFrame.
I am not the only one questioning the object file format design. As more linker-aware folks become aware of this format, similar concerns are being raised; see the GNU Tools Cauldron SFrame talk notes: https://groups.google.com/g/generic-abi/c/3ZMVJDF79g8
If SFrame is exclusively a kernel-space feature, it could be implemented entirely within objtool – similar to how objtool --link --orc generates ORC info for vmlinux.o. This approach would eliminate the need for any modifications to assemblers and linkers, while allowing SFrame to evolve in any incompatible way.

While SFrame may still have potential to replace the ORC unwinder in the kernel (ORC being a simple format that is considerably larger than .eh_frame), its viability as a stack walking mechanism for userspace programs remains an open question.

From https://maskray.me/blog/2025-10-26-stack-walking-space-and-time-trade-offs

% ~/Dev/bloaty/out/release/bloaty /tmp/out/custom-sframe/bin/clang
    FILE SIZE        VM SIZE
 --------------  --------------
  63.9%  88.0Mi  73.9%  88.0Mi    .text
  11.1%  15.2Mi   0.0%       0    .strtab
   7.2%  9.96Mi   8.4%  9.96Mi    .rodata
   6.4%  8.87Mi   7.5%  8.87Mi    .sframe
   5.1%  7.07Mi   5.9%  7.07Mi    .eh_frame
   2.9%  3.96Mi   0.0%       0    .symtab
   1.4%  1.98Mi   1.7%  1.98Mi    .data.rel.ro
   0.9%  1.23Mi   1.0%  1.23Mi    [LOAD #4 [R]]
   0.7%   999Ki   0.8%   999Ki    .eh_frame_hdr
   0.0%       0   0.5%   614Ki    .bss
   0.2%   294Ki   0.2%   294Ki    .data
   0.0%  23.1Ki   0.0%  23.1Ki    .rela.dyn
   0.0%  8.99Ki   0.0%  8.99Ki    .dynstr
   0.0%  8.77Ki   0.0%  8.77Ki    .dynsym
   0.0%  7.24Ki   0.0%  7.24Ki    .rela.plt
   0.0%  6.73Ki   0.0%       0    [Unmapped]
   0.0%  6.29Ki   0.0%  3.84Ki    [21 Others]
   0.0%  4.84Ki   0.0%  4.84Ki    .plt
   0.0%  3.36Ki   0.0%  3.30Ki    .init_array
   0.0%  2.50Ki   0.0%  2.50Ki    .hash
   0.0%  2.44Ki   0.0%  2.44Ki    .got.plt
 100.0%   137Mi 100.0%   119Mi    TOTAL
% ~/Dev/unwind-info-size-analyzer/eh_size.rb /tmp/out/custom-sframe/bin/clang
clang: sframe=9303875 eh_frame=7408976 eh_frame_hdr=1023004 eh=8431980 sframe/eh_frame=1.2558 sframe/eh=1.1034

The results show that .sframe (8.87 MiB) is approximately 10% larger than the combined size of .eh_frame and .eh_frame_hdr (7.07 MiB + 999 KiB ≈ 8.04 MiB).
Since .eh_frame cannot be eliminated (doing so would lose the ability to restore callee-saved registers, along with LSDA and personality information), .sframe is added on top of it rather than replacing it, and this size overhead raises significant concerns about the practical viability of the approach.
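
For readers who want to re-derive the ratios, here is a minimal Python sketch. The byte counts are copied from the eh_size.rb output above; the variable names are only illustrative and are not part of that script.

```python
# Byte counts copied from the eh_size.rb output for the clang binary.
sframe = 9303875
eh_frame = 7408976
eh_frame_hdr = 1023004

MIB = 1024 * 1024

# .eh_frame_hdr is the lookup table for .eh_frame, so compare against both.
eh = eh_frame + eh_frame_hdr

ratio_vs_eh_frame = sframe / eh_frame
ratio_vs_eh = sframe / eh

# .eh_frame stays in the binary, so .sframe is purely additional.
total_unwind = sframe + eh

print(f".sframe            = {sframe / MIB:.2f} MiB")
print(f".eh_frame + hdr    = {eh / MIB:.2f} MiB")
print(f"sframe / eh_frame  = {ratio_vs_eh_frame:.4f}")
print(f"sframe / eh        = {ratio_vs_eh:.4f}")
print(f"total unwind info  = {total_unwind / MIB:.2f} MiB")
```

The two ratios match the sframe/eh_frame and sframe/eh fields in the script output.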

It’s worth noting that there are existing, battle-tested implementations of a compact unwind format in LLVM, lld/MachO, and libunwind that work with C++ exception handling (in production since 2015 or earlier).
macOS was an early adopter, and OpenVMS appears to use a variant (see the “VSI OpenVMS Calling Standard” and the earlier “[RFC] Improving compact x86-64 compact unwind descriptors”).
“The Apple Compact Unwinding Format: Documented and Explained” (Faultlore) documents how this works.

This approach allows frames that cannot be described compactly to fall back to DWARF unwinding, which means most DWARF CFI entries can be removed while still maintaining full functionality.
In contrast, today’s SFrame implementation in GNU Assembler would emit many warnings when building llvm-project.
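
The DWARF fallback described above can be sketched roughly as follows. This is an illustration, not the libunwind implementation: the constants are the x86-64 values as I recall them from LLVM libunwind's compact_unwind_encoding.h and should be checked against that header, and classify is a hypothetical helper name.

```python
from typing import Optional, Tuple

# x86-64 compact unwind encoding constants (assumed to match
# compact_unwind_encoding.h; verify before relying on them).
UNWIND_X86_64_MODE_MASK = 0x0F000000
UNWIND_X86_64_MODE_RBP_FRAME = 0x01000000
UNWIND_X86_64_MODE_STACK_IMMD = 0x02000000
UNWIND_X86_64_MODE_STACK_IND = 0x03000000
UNWIND_X86_64_MODE_DWARF = 0x04000000
UNWIND_X86_64_DWARF_SECTION_OFFSET = 0x00FFFFFF

_COMPACT_MODES = {
    UNWIND_X86_64_MODE_RBP_FRAME: "rbp_frame",
    UNWIND_X86_64_MODE_STACK_IMMD: "stack_immediate",
    UNWIND_X86_64_MODE_STACK_IND: "stack_indirect",
}


def classify(encoding: int) -> Tuple[str, Optional[int]]:
    """Hypothetical helper: decide how a 32-bit encoding is unwound.

    Returns (kind, fde_offset). fde_offset is set only for the DWARF
    fallback, where it is an offset into __eh_frame.
    """
    mode = encoding & UNWIND_X86_64_MODE_MASK
    if mode == UNWIND_X86_64_MODE_DWARF:
        # Frame cannot be described compactly: defer to the DWARF FDE.
        return ("dwarf", encoding & UNWIND_X86_64_DWARF_SECTION_OFFSET)
    if mode in _COMPACT_MODES:
        # Frame is fully described by the 32-bit encoding itself.
        return (_COMPACT_MODES[mode], None)
    return ("none", None)


print(classify(0x01000000))
print(classify(0x04000123))
```

Only functions that land in the DWARF mode need an FDE, which is why most DWARF CFI entries can be dropped.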

As a concrete example, in a clang executable on macOS (objdump --arch=x86_64 -h), the __text section is 0x4a55470 bytes, while the __unwind_info section is very small at just 0x79060 bytes and __eh_frame is only 0x58 bytes. This demonstrates the efficiency of the approach, even though compact unwind only covers synchronous unwinding.
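
To put those hex sizes in proportion, here is a small Python sketch using only the three section sizes quoted above. The variable names are illustrative.

```python
# Section sizes quoted from the macOS clang executable (objdump -h).
text = 0x4a55470
unwind_info = 0x79060
eh_frame = 0x58

# All unwind metadata that stays in the binary.
unwind_total = unwind_info + eh_frame

# Fraction of the code size spent on unwind information.
unwind_fraction = unwind_total / text

print(f"__text        = {text} bytes")
print(f"__unwind_info = {unwind_info} bytes")
print(f"__eh_frame    = {eh_frame} bytes")
print(f"unwind / text = {unwind_fraction:.2%}")
```

This works out to well under 1% of __text. The ELF clang binary measured earlier is a different build, so the two results are indicative rather than directly comparable.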

Regarding AArch64: It would be valuable to gather Arm’s perspective on compact unwind for ELF (see “Was any form of compact unwind information considered for AArch64?”, issue #344 in ARM-software/abi-aa on GitHub). I’ll ask them. In the meantime, the Mach-O implementation provides a proven baseline for this architecture.
