It is vital to keep in perspective that frame pointer based stack traces can be imprecise. A previous study shows that about 5-7% of stack traces (using the FP method, on x86_64 and AArch64) can be incorrect, with missing frames. The case of s390x is unique in that some of these issues are further magnified and the FP method cannot be used.
This distinction is important, but it creates a fundamental problem: .eh_frame cannot be eliminated. It contains essential information for restoring callee-saved registers, LSDA, and personality information needed for debugging (e.g., reading local variables in a coredump) and C++ exception handling. Adopting SFrame would therefore require carrying both formats, resulting in a significant net size increase.
The median .eh_frame size across executables and shared libraries on a Linux system is already 5+% of total VM size. Doubling this overhead to ~10% by adding SFrame on top is simply not viable for most deployments.
In addition, Intel’s 11th Gen and AMD Zen 3 processors support hardware shadow stacks.
A software-only stack walking approach that doesn’t replace .eh_frame (and that remains unvetted for AArch64; see below) would quickly become obsolete.
Shadow stack can be enabled per process, providing flexibility to balance performance overhead / memory consumption with profiling needs, even for users who don’t prioritize the security hardening aspect.
While the imprecision of frame pointer-based unwinding is well understood, this doesn’t necessarily make SFrame the right solution. Alternative approaches exist, such as compact unwinding formats that can fall back to DWARF when needed, providing both size efficiency and improved precision. LLVM’s compact unwind format is one such example. Additionally, a straightforward enhancement to frame pointer unwinding—detecting prologue and epilogue patterns by disassembling instructions at the program counter—has not been explored.
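To make the prologue/epilogue idea concrete, here is a minimal sketch for x86-64 only. It is an illustration under stated assumptions, not an existing unwinder feature: the function name `return_address_offset` is hypothetical, and it only recognizes the textbook `push %rbp; mov %rsp,%rbp` prologue (optionally preceded by `endbr64`) and a `ret` at the sampled PC. A real implementation would need to handle many more instruction patterns.

```python
# Hypothetical sketch: look at the instruction bytes at the sampled PC and
# decide whether the frame pointer can be trusted yet (x86-64 only).

ENDBR64 = bytes([0xF3, 0x0F, 0x1E, 0xFA])   # endbr64
PUSH_RBP = bytes([0x55])                    # push %rbp
MOV_RSP_RBP = bytes([0x48, 0x89, 0xE5])     # mov %rsp,%rbp
RET = bytes([0xC3])                         # ret


def return_address_offset(code_at_pc: bytes):
    """Return the offset from %rsp at which the return address sits,
    or None if ordinary frame pointer unwinding should be used.

    In all recognized cases %rbp still holds the caller's frame pointer.
    """
    # Function entry: nothing has been pushed yet.
    if code_at_pc.startswith(ENDBR64 + PUSH_RBP):
        return 0
    if code_at_pc.startswith(PUSH_RBP):
        return 0
    # %rbp has been pushed but the new frame is not set up yet.
    if code_at_pc.startswith(MOV_RSP_RBP):
        return 8
    # About to return: the return address is on top of the stack.
    if code_at_pc.startswith(RET):
        return 0
    return None


# PC at "push %rbp": caller's return address is at (%rsp).
print(return_address_offset(bytes([0x55, 0x48, 0x89, 0xE5])))
# PC at "mov %rsp,%rbp": return address is at 8(%rsp).
print(return_address_offset(bytes([0x48, 0x89, 0xE5, 0x90])))
```

When the function returns `None`, the unwinder would read the saved frame pointer and return address through `%rbp` as usual.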
On a related note, both the linux-perf-users post and https://developers.redhat.com/articles/2024/10/30/limitations-frame-pointer-unwinding#assembly_code_functions_in_libraries contain a small error: grep -v "[k]" should be grep -v "\[k\]" to properly escape the brackets.
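The reason the escaping matters: in a regular expression, `[k]` is a bracket expression that matches a single `k` anywhere in the line, so the unescaped filter drops every line containing the letter k, not just kernel samples. The sketch below reproduces the effect with Python's `re` module rather than grep itself; the sample lines are made up for illustration and are not real perf output.

```python
import re

# Hypothetical report lines, loosely shaped like "address [tag] symbol".
lines = [
    "ffffffff81000000 [k] do_syscall_64",
    "0000000000401136 [.] main",
    "0000000000401200 [.] parse_token",
]

# Equivalent of grep -v "[k]": drops any line that contains the letter k.
unescaped_kept = [l for l in lines if not re.search("[k]", l)]

# Equivalent of grep -v "\[k\]": drops only lines with the literal text "[k]".
escaped_kept = [l for l in lines if not re.search(r"\[k\]", l)]

print(unescaped_kept)
print(escaped_kept)
```

With the unescaped pattern, the userspace `parse_token` line is lost along with the kernel sample, because its symbol name happens to contain a k.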
Regarding the frame pointer prologue issue mentioned in the article, I collected my own data while running ninja check-llvm and observed only 1.5% of samples falling into the first 8 bytes of a function—notably lower than the 5-7% figure cited in the study.
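For anyone who wants to repeat this kind of measurement, here is a rough sketch of the computation. It is not the script used for the number above; `prologue_fraction`, the function start addresses, and the sample addresses are all hypothetical placeholders. It assumes you already have a sorted list of function start addresses and a list of sampled instruction pointers.

```python
from bisect import bisect_right


def prologue_fraction(samples, func_starts, window=8):
    """Fraction of samples whose address lies within the first `window`
    bytes of the function containing it.

    `func_starts` must be sorted ascending. Samples below the first
    function start are counted as outside the window.
    """
    if not samples:
        return 0.0
    hits = 0
    for ip in samples:
        idx = bisect_right(func_starts, ip)
        if idx == 0:
            continue  # no known function starts at or below this address
        if ip - func_starts[idx - 1] < window:
            hits += 1
    return hits / len(samples)


# Made-up addresses, for illustration only.
func_starts = [0x1000, 0x1040]
samples = [0x1000, 0x1007, 0x1008, 0x1044, 0x1060]
print(prologue_fraction(samples, func_starts))
```

Note that falling into the first 8 bytes is only a proxy for "frame pointer not yet set up"; the actual prologue length depends on the compiler and target.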
I’ve raised concerns about SFrame viability for userspace stack walking: https://lore.kernel.org/linux-perf-users/87h5vg5tvj.fsf@oracle.com/T/#m0075aa5ef423df2f345ae682e9c8e815b06e6085
With the response at https://lore.kernel.org/linux-perf-users/87h5vg5tvj.fsf@oracle.com/T/#m95729a4f826fd045ca125d71ad1af36746c97393, we now know of objections from two of the ten most active contributors to tools/perf. I’ve reviewed other linux-perf-users discussions about SFrame and have not seen preference for it from other top contributors. I believe the size comparison data between SFrame and compact unwind formats could shift perspectives even for those who may have supported SFrame based on earlier, incomplete information. A major glibc maintainer has also expressed concerns about SFrame.
Additionally, I’ve chatted with mobile toolchain developers at the LLVM Dev Mtg, who emphasized that size concerns are especially critical for AArch64, which is heavily deployed on mobile phones. The lack of support from Arm ABI maintainers is also a concern—they are unlikely to want a format known not to work with mobile Linux to coexist with a future, more widely adopted compact format with callee-saved register, LSDA, and personality support.
Completely ignoring sframes for a moment:
If there is anything the proliferation of unwind and stack tracing formats should tell us, it is that every single one comes with a set of compromises, whether thoroughness vs size, loadable vs unloadable, speed vs full unwinding with destructor calls, or whatever other tradeoff may be involved. It is highly unlikely that there is any format without big compromises on multiple dimensions. The standard “useful for all unwinding and tracing use cases” is unmeetable.
So the fact that sframes may or may not be suitable for mobile or perf is really neither here nor there. No one should expect it to handle every use-case or every situation. Those who find its tradeoffs unpalatable don’t have to turn it on.
If anything, we should evolve it in such a way that it really fills its specific niche extremely well.
The question is simply: “Is the use-case that sframes address sufficiently important to a sufficient number of users to accept about 1,000 new lines of code in LLD?” Observe that the GNU folks have already made up their minds on this, and whatever anyone else’s reservations about the format may be, there are perhaps a dozen or so companies who have said they are interested.
Anyway, I don’t think we are seeing new technical arguments anymore, so this goes on the agenda for the next project council meeting.
This was discussed upthread, and I have pointed out why this is not a viable solution for a lot of users. I will try to explain again.
Enabling shadow stack has a non-trivial negative performance impact. This is inherent in the technology itself, as it introduces memory accesses for every call and return and pollutes the caches. The memory overhead can also be a problem for users who care about it, especially when the number of threads is high.
The shadow memory management can also be a pain: how much do you need to configure to avoid overflow?
It is a great technology, but it has been 10 years since its introduction and it has not gained major traction yet, for many reasons including the above.
The rest (the majority) explained why the simplicity of SFrame is the key to its adoption in the kernel. One of the objections you mentioned is about shadow stack, which is not a universal solution (see my reply above).