Hand-written in assembly in libc, setjmp+longjmp

I wanted to continue a discussion that started in D158640 regarding setjmp and longjmp (cc @sivachandra @jrtc27 @AaronBallman @mikhailrgadelha). It was stated that hand-written assembly files are strictly not allowed in the libc project, and it seems the conclusion was that an implementation of setjmp/longjmp using multiple inline asm statements and relying on the file being compiled at a certain optimisation level to generate the desired code was preferable.

I wanted to double-check my understanding of this policy and its aims, as to me it seems like writing out the desired code in a function with the naked attribute would be a better solution (and vs a separate .s file has the advantage that it should work with standard code navigation tooling):

  • setjmp/longjmp are inherently non-portable anyway
  • If the no-asm rule is motivated by security then Iā€™d argue it doesnā€™t help in this case. Youā€™re actively fighting the compiler and itā€™s not clear if the output will implement the desired semantics at a given optimisation level.
  • I donā€™t see the no assembly rule documented and the original libc proposal was less strict, stating ā€œUse source based implementations as far possible rather than assembly. Will try to fix the compiler rather than use assembly language workarounds.ā€

I think the goal of minimising hand-written assembly and leveraging the compiler is a great one, Iā€™m just not sure it makes sense for setjmp and longjmp. But Iā€™m outside the project so I could easily be missing something - thought it might be worth discussing here. Thanks in advance.

3 Likes

FWIW, I agree that I donā€™t think it makes sense to avoid hand-written assembly for setjmp and longjmp; the goal seems reasonable in general but I donā€™t think we should be prescriptive about it. Some parts of the C standard library cannot be implemented in C, I think we should be fine dropping into assembly for those few cases. However, Iā€™m also outside of the project and might be missing things.

1 Like

How is this related to __builtin_longjmp, __builtin_setjmp?

Adding any .s files probably requires significant infrastructure which currently doesnā€™t exist in libc (compiler-rt uses a bunch of macros to make .s files work). I canā€™t see any reason to avoid ā€œnakedā€, though, given the code depends on the calling convention anyway.

__builtin_setjmp/__builtin_longjmp are not really relevant to this discussion. Theyā€™re conceptually similar, but they have a completely different ABI.

1 Like

The goal in general is definitely to avoid hand-written assembly in the libc project. There are many good reasons, for example, we want every line of the libc source code to be static analysis friendly, sanitizer instrumentation friendly etc.

That we want to avoid hand-written assembly in the libc project does not mean we should steadfastedly enforce that rule. That is, we might have to break that rule if there is no way to avoid them. So, an argument can be made that setjmp and longjmp are special enough to break the rule. The argument we are making by using the inline assembly solution is that, with carefully crafted inline assembly and compiler options, we do not need to break the rule even for setjmp and longjmp.

ā€œFighting the compilerā€ is a view point, but I view this as that we are instructing the compiler to not do more than what is asked for in the inline assembly.

As I have said above, the higher level goal is to make the libc source code static analysis friendly, sanitizer friendly, fuzzing friendly etc. That is documented on the libc home page: https://libc.llvm.org/. The ā€œno hand-crafted assembly filesā€ rule is one of the rules we impose to achieve that goal.

Personally, I have strictly followed and enforced the ā€œno hand-crafted assembly filesā€ rule. It is not to imply that we are not flexible, but breaking the rule once makes it is easy to use that as precedence to break it again until it quickly goes out of control.

Technically, the inline-assembly + compiler-flags solution for setjmp/longjmp is probably fragile in principle but is it really fragile practically? Perhaps increased testing can help here. Not that increased testing will tame the compiler, but it can help catch when the assumptions we make in the current approach break. [Increased testing is good anyway even without this problem.] May be supporting a future target architecture becomes impossible with this solution - we can evaluate options again at that point.

Summarizing the points I made:

  1. The ā€œno hand-crafted assembly filesā€ is a rule we want to follow even if we require special compiler flags. The code might by ā€œfragileā€ in principle, but will be safe in practice.
  2. We will reconsider this rule case-by-case if and when there are no options but to break it.
  3. We will add more testing to catch when the assumptions we make break in the case of setjmp/longjmp.
1 Like

I donā€™t understand the argument for avoiding the use of ā€œnakedā€. The functions in question are already basically ā€œnakedā€ functions: they assume the compiler wonā€™t insert any instructions (except a couple instructions in specific places to deal with the return value). You arenā€™t actually gaining anything by avoiding use of the ā€œnakedā€ attribute; youā€™re just making the code more fragile in case some compiler flag makes the compiler generate different code.

For comparison, this is roughly what longjmp looks like with naked:

__attribute((naked))
void longjmp(__jmp_buf * buf, int val) {
asm("lw ra, %0(a0)"::"i"(offsetof(__jmp_buf, __pc)));
asm("lw s0, %0(a0)"::"i"(offsetof(__jmp_buf, __regs[0])));
[...]
asm("seqz a0, a1");
asm("add a0, a0, a1");
asm("ret");
}

This only requires minor syntax changes to the current implementation, generates exactly the same code, and avoids the undefined behavior.

2 Likes

I think there is some misunderstanding. I have viewed this discussion as topic concerning hand-crafted .s files vs inline-assembly in .cpp files. With respect to the naked attribute, I have no objections. In fact, I agree that the naked attribute will improve confidence on what we are trying to achieve here. Also, if you have suggestions on improving the inline assembly, that is also welcome.

I do remember @mikhailrgadelha bringing up the topic of the naked attribute but decided to just use -fomit-frame-pointer + -O3. We can add -DLLVM_LIBC_FUNCTION_ATTR=__attribute__((naked)) to the list of compile-options.

LLVM_LIBC_FUNCTION_ATTR is global, that sounds like a bad idea. Just put the attribute explicitly on the functions in question (with a NAKED macro if you really want) like any normal project.

Also, if your goal is to support GCC, it has much worse support for __attribute__((naked)); it supports it for x86 and RISC-V but not AArch64, unlike Clang which supports all three. So you may still find you need to write it in assembly.

No, its not a global and it specifically gets attached to the implementation of a libc public function: https://github.com/llvm/llvm-project/blob/main/libc/src/__support/common.h#L22. Each public function can be given a different set of attributes using this macro. The choice of the name is may be poor, it was coined in the early days of the libc project.

Yes it is, itā€™s a macro in the global namespace for the whole translation unit.

That is correct, but it gets attached to only a single function.

In general avoiding asm is a wonderful idea.

However, there is a limited set of functions which do weird stuff with stack frames, vfork, setjmp, sigsetjmp, longjmp, and siglongjmp, and the internal __restore_rt. For those few functions, I really think youā€™re much better off using a 100%-asm implementation.

These are so magic, that basically any instrumentation that the compiler might add, like sanitizers, profiling, etc, will break them. And depending on particular compiler options and optimization levels for correctness really seems like a bad idea, especially given the goal of being instrumentation-friendly.

So, then, regarding the two other options: the ā€œnakedā€ attribute on a function with inline-asm, versus an assembly source file. While using the ā€œnakedā€ attribute is theoretically a reasonable answer, I would very much recommend that you NOT use it. Almost the only thing that the ā€˜nakedā€™ attr really gets you is that you donā€™t need to write the couple asm lines .globl and .type. However, the attribute creates a weird hybrid kind of function, which makes it quite fragile, and rife with implementation issues, in all compilers. Unfortunately, using it is asking for problems, and I donā€™t see any particularly interesting benefit.

BTW separately but relatedly, Iā€™d note that itā€™s also unsafe to create any wrapper functions around these, other than longjmp/siglongjmp. Thatā€™s something llvm-libc sometimes does to expose them as public names, right?

The functions you have listed are indeed ā€œmagicalā€ and I do not expect them to be instrumented at all. So, there is no difference from that point of view if we want to compare inline-assembly vs .s files.

As I have said earlier, we understand that doing them all in inline-assembly in .cpp files is fragile in principle but I will be surprised if they are going to be fragile in practice. We have not yet implemented vfork, sigsetjmp, siglongjmp so may be they will be complicated enough that we have to implement in .s files. But, the other three functions are, at least for the target architectures I am familiar with, simple and straightforward to implement. The simplest of them is __restore_rt - all it does is to make a syscall. The compiler options we use to build it are -fomit-frame-pointer -O3 -Wframe-larger-than=0 -Werror. Further, __attribute__((no_sanitize("all"))) is also attached to it.

We donā€™t use any wrapper functions to expose them as public names. We use __asm__(...) magic to add aliases with public names.

Iā€™m going to be blunt here: you have multiple core Clang/LLVM compiler developers telling you this is not a thing that they support or that code should be doing. Please take that on board meaningfully.

1 Like

As a member of the libc team Iā€™m going to throw in my two cents, but first I want to make sure Iā€™m on the same page as everyone else. As I see it, this is the current status:

For the libc project we want to limit assembly to only what is absolutely necessary, and we currently donā€™t have the infrastructure for using full assembly files.

The compiler experts are warning that the current implementation of setjmp/longjmp for RISC-V is fragile and based on undefined behavior.

The setjmp/longjmp functions absolutely must be written in assembly since they deal directly with the CPU.

These functions could be written with the naked attribute to keep the assembly in a CPP file, but the naked attribute is itself somewhat fragile.

From all of this it seems to me that the best current course of action is to move the setjmp/longjmp implementations on RISC-V to use the naked attribute. In the future the libc project will likely need the infrastructure to support full assembly files, but unless that infrastructure is a lot simpler to implement than I think it is (based on the compiler-rt assembly.h) it can wait until we need it.

1 Like

You donā€™t need a whole lot of infrastructure. A minimal asm file for an implementation of a function on Linux x86-64 is just:

.globl NAME
.type NAME, @function
NAME:
.cfi_startproc
	...instructions...
.cfi_endproc
.size NAME, . - NAME

.section	".note.GNU-stack","",@progbits

Depending on how portable to other x86-64 platforms a given asm file needs to be, some of the above will need to be put behind macros which expand to nothing on some of those platforms (for example .size and .type donā€™t exist on Apple platforms, and .note.GNU-stack is only used on linux and a handful of other platforms).

That sort of platform portability is what the compiler-rtā€™s assembly.h provides macros to help with. You can copy those macros as necessary for platforms you support.

2 Likes

A bunch of stuff in libc canā€™t be written in C. Thatā€™s somewhat the point of libc. Itā€™s either compiler intrinsics or asm.

A convenient property to keep is compilation to IR working. Whether thatā€™s inline asm, naked functions, other all seem fine here. Itā€™s being able to construct a libc.bc to pass around which is convenient.

1 Like

Having setjmp/longjmp as bitcode isnā€™t useful though. You could have a naked IR function thatā€™s entirely inline assembly, but what does that achieve?

Having setjmp/longjmp as bitcode isnā€™t useful though. You could have a naked IR function thatā€™s entirely inline assembly, but what does that achieve?

I think there is slight utility in such a representation for the purposes of things like whole-program IR, so that you end up with a definitive definition of the function even if that definition is necessarily a black box for optimization purposes. But this also assumes that pushing for something akin to a pure IR definition of a whole library is desirable in its own right.