[RFC] Linux Syscall Cleanup

michaelrj-google · July 8, 2025, 11:55pm

Linux Syscall Cleanup

Linux syscalls are currently handled inconsistently in LLVM-libc, which leads to confusion. This design intends to unify how Linux syscalls are handled and make them safer and easier to use. It is not intended to touch any other OS interface layers; those will be handled in a separate RFC.

Context

Objective

Linux syscalls should be done in a safe, consistent manner in LLVM-libc. For each syscall, there should be exactly one place where the raw syscall function is invoked.

Background

Currently, the raw syscall function is invoked directly in most places where a syscall is needed. Most of these are the syscall wrapper functions, but other functions that need access to low-level utilities also call the raw syscall, as shown in the diagram.

This leads to duplication of logic, since many syscalls actually have multiple possible numbers. For example, mmap can be handled by SYS_mmap or SYS_mmap2. The raw syscall interface also has no type checking, so every use is another opportunity for mistakes.
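To make the duplication concrete, here is a minimal sketch of what a single call site has to carry when it invokes the raw syscall itself. It uses the host libc's `syscall()` as a stand-in for LLVM-libc's raw syscall function, and `map_anonymous_pages` is a hypothetical name chosen for illustration, not code from the tree.

```cpp
#include <cstddef>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical call site: maps `size` bytes of anonymous read/write memory.
// Returns nullptr on failure.
void *map_anonymous_pages(std::size_t size) {
  // This number selection has to be repeated at every place that needs mmap.
#ifdef SYS_mmap2
  // mmap2 expects the file offset in 4096-byte units rather than bytes.
  // The offset used here is 0, so no conversion is needed in this sketch.
  const long number = SYS_mmap2;
#elif defined(SYS_mmap)
  const long number = SYS_mmap;
#else
#error "neither SYS_mmap nor SYS_mmap2 is available"
#endif
  // The arguments are passed variadically, so nothing checks their types,
  // their count, or their order.
  long ret = syscall(number, nullptr, size,
                     static_cast<long>(PROT_READ | PROT_WRITE),
                     static_cast<long>(MAP_PRIVATE | MAP_ANONYMOUS), -1L, 0L);
  // The host libc's syscall() reports failure as -1 (with errno set).
  return ret == -1 ? nullptr : reinterpret_cast<void *>(ret);
}
```

Every additional caller that needs a mapping would need its own copy of the `#ifdef` block and the untyped argument list.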

There are also some places where syscalls are called using the public syscall wrapper, as shown in the next diagram.

This requires extra logic to handle errno and means these functions are not properly independent, so it is not an optimal solution either.
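One reading of the "extra logic" is that an internal helper built on a public wrapper has to stop the wrapper's errno write from leaking to its own caller. A minimal sketch under that assumption follows; `close_preserving_errno` is a hypothetical helper, and the host libc's `close()` stands in for the public wrapper.

```cpp
#include <cerrno>
#include <unistd.h>

// Hypothetical internal helper implemented on top of the public close().
// The public wrapper reports failure through errno, so the helper has to
// save and restore errno to avoid changing what its own caller observes.
bool close_preserving_errno(int fd) {
  const int saved_errno = errno;
  const bool ok = (::close(fd) == 0);
  errno = saved_errno; // hide the wrapper's errno write from our caller
  return ok;
}
```

With an internal function that returns the error as a value, the save and restore steps would not be needed.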

For non-syscall functions the expected behavior is to split functionality away from the public interface so it can be shared internally. As an example, the ctype_utils.h header provides functions like isspace both for the public isspace function and for scanf, which treats space characters as separators.
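The pattern can be sketched as follows. This is an illustration of the split, not the contents of ctype_utils.h; `public_isspace` and `skip_whitespace` are hypothetical names (the former is renamed to avoid clashing with the host libc).

```cpp
namespace internal {

// Shared implementation: whitespace in the "C" locale is the space character
// plus '\t', '\n', '\v', '\f' and '\r' (codes 9 through 13).
constexpr bool isspace(int ch) {
  return ch == ' ' || (ch >= '\t' && ch <= '\r');
}

} // namespace internal

// Public entry point: a thin layer over the shared implementation.
int public_isspace(int ch) { return internal::isspace(ch) ? 1 : 0; }

// Internal user: skips leading whitespace the way a scanf-style parser would,
// without going through the public function.
const char *skip_whitespace(const char *str) {
  while (*str != '\0' && internal::isspace(static_cast<unsigned char>(*str)))
    ++str;
  return str;
}
```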

Design

Overview

The proposed design is to move all calls to the “syscall_impl” function into the Linux-specific OSUtil directory (libc/src/__support/OSUtil/linux). Each syscall should be in an “internal wrapper function”, which should be called anywhere this syscall is needed. The diagram below shows the intended dependency chain:

Calls to the internal wrapper function should be made from /linux subdirectories, since it is just a wrapper over the syscall. More target-generic interfaces are out of scope for this proposal.

Detailed Design

The internal wrapper functions should take the arguments for a given syscall, dispatch them to the appropriate syscall number, and return its result as an ErrorOr (equivalent to std::expected<T, int>). Syscalls that have variadic arguments should be implemented using non-variadic arguments with default values. An example implementation of the internal openat is shown below, with blue highlighting the differences from a normal syscall wrapper.
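As a rough illustration of the description above (this is not the RFC author's original example), an internal openat could look something like the sketch below. `Error`, `ErrorOr`, and `to_kernel_convention` are minimal stand-ins written for this example, and the host libc's `syscall()` stands in for syscall_impl; the real LLVM-libc types differ.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Minimal stand-ins for LLVM-libc's Error/ErrorOr (assumptions for this sketch).
struct Error {
  int code;
  explicit Error(int c) : code(c) {}
};

template <typename T> class ErrorOr {
  T val{};
  int err = 0;
  bool ok;

public:
  ErrorOr(T v) : val(v), ok(true) {}
  ErrorOr(Error e) : err(e.code), ok(false) {}
  bool has_value() const { return ok; }
  T value() const { return val; }
  int error() const { return err; }
};

// The host libc's syscall() reports failure as -1 plus errno, while the RFC's
// syscall_impl returns the kernel's negative errno value. Convert to the latter.
inline long to_kernel_convention(long ret) { return ret == -1 ? -errno : ret; }

namespace internal {

// The public open/openat take `mode` variadically; here it is an ordinary
// parameter with a default value.
inline ErrorOr<int> openat(int dirfd, const char *path, int flags,
                           mode_t mode = 0) {
  long ret = to_kernel_convention(
      syscall(SYS_openat, static_cast<long>(dirfd), path,
              static_cast<long>(flags), static_cast<long>(mode)));
  if (ret < 0)
    return Error(static_cast<int>(-ret)); // error is returned, errno untouched
  return static_cast<int>(ret);
}

} // namespace internal
```

A caller would check `has_value()` and then read either `value()` (the file descriptor) or `error()` (the errno code); only the public wrapper would write errno.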

These functions should be in headers and marked LIBC_INLINE. There should be one function per header, organized the same way as in src, so mmap would go in libc/src/__support/OSUtil/linux/sys/mmap/mmap.h and munmap would go in libc/src/__support/OSUtil/linux/sys/mmap/munmap.h.

Alternatives Considered

Next steps:

It should be possible to automate much of the refactoring by using the information from headergen and a list of syscalls. Headergen has a list of the function prototypes for all our syscall wrappers, and from that we can generate the trivial internal syscall wrappers. These will need some cleanup (e.g. fixing variable names, adding alternate syscall names, etc.) but it should be less work than rewriting all of them manually.

Assuming this RFC is approved, I will look into writing a followup design for a syscall wrapper generator (historically called wrappergen).

roland · July 11, 2025, 10:24am

Then why target::name instead of linux::name or something (linux::mmap in the example)?

Can we also see an example with a “result parameter”, e.g. fstat? Will these be kept as pointer arguments as they are in the syscall C signatures, or bundled into a return value struct (as is the generally preferred pattern for return values with std::expected-style “result types”)?

I’m not sure off hand of an example of a syscall that has a not-just-error return value and also a result parameter (this counts only fixed-sized result parameters, not variable-sized buffers). If there are any, would those use a one-off composite return value struct instead (again, as is generally preferred idiom with result types)?

For calls like read/write that take buffer+count pairs, would these still be separate arguments to match the syscall C signatures, or be passed as spans as we usually prefer? (For the void* cases being span<byte> or span<uint8_t>.)

michaelrj-google · July 29, 2025, 10:35pm

That’s a reasonable point. In future we might want to also add posix::name which redirects to linux::name or darwin::name, but that can be determined later.

I think leaving fixed-size result parameters as pointer arguments makes the most sense. For fstat we could in theory create an internal wrapper that initializes its own struct stat and returns that in a result struct, meaning the user only has to pass in filedes. That would make fstat more convenient to use internally, but it’d make the public syscall wrapper copy the data from that internal buffer back into the pointer the user passed. It might also have an observable effect on behavior since the kernel could update buf on error, which we’d have no way of relaying to the user.

// Example of internal fstat that handles its own buf:
namespace internal {

ErrorOr<struct stat> fstat(int filedes) {
    struct stat buf;
    int result = syscall_impl<int>(SYS_fstat, filedes, &buf);
    if (result < 0) 
        return Error(-result);

    return buf;
}

} // namespace internal

// What the external wrapper would have to look like:
int fstat(int filedes, struct stat *buf) {
    ErrorOr<struct stat> result = internal::fstat(filedes);
    if (!result.has_value()) {
        libc_errno = result.error();
        return -1;
    }
    *buf = result.value();
    return 0;
}

Overall I’m weakly opposed to returning result parameters. It seems like extra work for whatever syscall wrapper generator we make, and I’d guess it’ll have a minor negative effect on performance.
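For comparison, the pointer-argument variant could look roughly like the sketch below. `Error`, `ErrorOr`, `to_kernel_convention`, and `public_fstat` are stand-ins written for this example, the host libc's `syscall()` replaces syscall_impl, and the code assumes a 64-bit Linux target where SYS_fstat exists and the kernel and libc layouts of struct stat agree.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <unistd.h>

// Minimal stand-ins for LLVM-libc's Error/ErrorOr (assumptions for this sketch).
struct Error {
  int code;
  explicit Error(int c) : code(c) {}
};

template <typename T> class ErrorOr {
  T val{};
  int err = 0;
  bool ok;

public:
  ErrorOr(T v) : val(v), ok(true) {}
  ErrorOr(Error e) : err(e.code), ok(false) {}
  bool has_value() const { return ok; }
  T value() const { return val; }
  int error() const { return err; }
};

// Convert the host libc's "-1 plus errno" result to a negative errno value.
inline long to_kernel_convention(long ret) { return ret == -1 ? -errno : ret; }

namespace internal {

// The kernel writes straight into the caller's buffer; only the status is
// carried in the return value.
inline ErrorOr<int> fstat(int filedes, struct stat *buf) {
  long ret = to_kernel_convention(
      syscall(SYS_fstat, static_cast<long>(filedes), buf));
  if (ret < 0)
    return Error(static_cast<int>(-ret));
  return 0;
}

} // namespace internal

// Public-wrapper shape (renamed to avoid clashing with the host libc's fstat):
// no copy of struct stat is needed, only the errno translation.
int public_fstat(int filedes, struct stat *buf) {
  ErrorOr<int> result = internal::fstat(filedes, buf);
  if (!result.has_value()) {
    errno = result.error();
    return -1;
  }
  return 0;
}
```

Because the user's buffer is handed to the kernel directly, anything the kernel writes to it on error remains visible to the caller.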

If we do decide to return result parameters and there are multiple things we need to return, then yes, it should probably be done using a one-off composite return value struct. Given that we can’t come up with any, they’re probably rare if they exist.

For buffer+count pairs I think it does make sense to pass spans or string_views since that’s already how file and printf have them, and for the raw syscalls creating a span is effectively zero-cost. We also already need to recognize buffers for msan unpoisoning, so this shouldn’t be too bad for the generator.
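A span-taking internal write might look something like this sketch. `Span` is a hypothetical pointer-plus-length struct standing in for LLVM-libc's span type, `Error`, `ErrorOr`, and `to_kernel_convention` are likewise stand-ins, and the host libc's `syscall()` replaces syscall_impl.

```cpp
#include <cerrno>
#include <cstddef>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Minimal stand-ins for LLVM-libc's Error/ErrorOr (assumptions for this sketch).
struct Error {
  int code;
  explicit Error(int c) : code(c) {}
};

template <typename T> class ErrorOr {
  T val{};
  int err = 0;
  bool ok;

public:
  ErrorOr(T v) : val(v), ok(true) {}
  ErrorOr(Error e) : err(e.code), ok(false) {}
  bool has_value() const { return ok; }
  T value() const { return val; }
  int error() const { return err; }
};

// Hypothetical stand-in for a span: a pointer plus an element count.
template <typename T> struct Span {
  T *data;
  std::size_t size;
};

// Convert the host libc's "-1 plus errno" result to a negative errno value.
inline long to_kernel_convention(long ret) { return ret == -1 ? -errno : ret; }

namespace internal {

// The buffer and its length travel together, so they cannot be mismatched.
// Unpacking them for the syscall is just two member reads.
inline ErrorOr<ssize_t> write(int fd, Span<const unsigned char> buf) {
  long ret = to_kernel_convention(
      syscall(SYS_write, static_cast<long>(fd), buf.data, buf.size));
  if (ret < 0)
    return Error(static_cast<int>(-ret));
  return static_cast<ssize_t>(ret);
}

} // namespace internal
```

The public write(fd, buf, count) wrapper would build the span from its two arguments before calling the internal function.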
