My comments are below.
Sam
From: Y Song ys114321@gmail.com
Sent: Tuesday, November 12, 2019 10:15 AM
To: cfe-dev@lists.llvm.org
Cc: Andrii Nakryiko andrii.nakryiko@gmail.com; Alexei Starovoitov alexei.starovoitov@gmail.com; richard@metafoo.co.uk; Liu, Yaxun (Sam) Yaxun.Liu@amd.com
Subject: Re: Two questions about address_space attribute
Hi, Richard Smith and Yaxun Liu,
You probably have expertise on the two questions below related to address space. Could you comment on my questions below?
Thanks!
Yonghong
Hi,
Currently, I am thinking of adding address_space attribute support for
x86_64 and BPF in order to compile the Linux kernel. The main goal is to
get address_space information into debug_info and later into DWARF/BTF.
But address_space enforcement itself can also help produce better code
for the kernel.
The RFC patch is here:
https://reviews.llvm.org/D69393
During experiments, we found two issues on which I would like to get
expert opinion:
issue 1: address_space attributes on function pointers
clang does not allow address_space attribute for function pointers,
specifically, in clang/lib/Sema/SemaType.cpp,
  // ISO/IEC TR 18037 S5.3 (amending C99 6.7.3): "A function type shall not be
  // qualified by an address-space qualifier."
  if (Type->isFunctionType()) {
    S.Diag(Attr.getLoc(), diag::err_attribute_address_function_type);
    Attr.setInvalid();
    return;
  }
But the Linux kernel annotates signal-handling function pointers
with an address space to indicate that they point into user space.
  typedef __signalfn_t __user *__sighandler_t;
  typedef __restorefn_t __user *__sigrestore_t;
Such an attribute makes sense for Linux, since the signal handler
code does reside in user space and the kernel pointer merely points to
user memory here.
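As a minimal sketch of the annotation pattern described above (not code from the patch or the kernel): every name prefixed with demo_ or DEMO_ is a hypothetical stand-in for the kernel's __user, __signalfn_t, and friends. The checker-only definition is written from memory of how kernel headers of this period defined __user for sparse, so treat it as an assumption. In a normal build the macro expands to nothing, which is why the typedefs compile today even though clang rejects an address space on a function type.

```c
/* Hypothetical stand-in for the kernel's __user macro. Assumption: under the
 * sparse checker (__CHECKER__) it expands to an address_space attribute, and
 * in ordinary builds it expands to nothing. */
#ifdef __CHECKER__
# define DEMO_USER __attribute__((noderef, address_space(1)))
#else
# define DEMO_USER
#endif

/* Function types for a signal handler and a restorer (hypothetical names). */
typedef void demo_signalfn_t(int);
typedef void demo_restorefn_t(void);

/* Pointers to those function types, annotated as pointing into user space.
 * With a real address_space attribute here, clang would currently diagnose
 * err_attribute_address_function_type. */
typedef demo_signalfn_t DEMO_USER *demo_sighandler_t;
typedef demo_restorefn_t DEMO_USER *demo_sigrestore_t;

/* Hypothetical record holding the annotated pointers. */
struct demo_sigaction {
    demo_sighandler_t handler;
    demo_sigrestore_t restorer;
};

/* A do-nothing handler, used only to have something to point at. */
static void demo_handler(int signo) { (void)signo; }

/* The kernel only stores and compares such pointers; it never calls them
 * directly. This helper reports whether a handler has been installed. */
static int demo_has_handler(const struct demo_sigaction *sa) {
    return sa->handler != 0;
}
```

The point of the sketch is that the annotation is purely a type-level marker: the kernel side treats the value as an opaque user-space address, so the address space would serve checking and debug info rather than code generation.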
What is the rationale for this restriction in the standard? How could
we make clang relax such a restriction for X86/BPF/…?
I can’t actually speak for the C committee, but I’ll try to answer anyway. As I understand it, there are two main reasons.
The first is that data pointers may already be in a different address space from function pointers. Standard C does not actually allow function pointers to be converted to data pointers and vice-versa; that’s an extension, albeit a basically universal one and indeed one mandated by POSIX because of dlsym. Harvard architectures put code in a separate address space, which is often significantly larger than the address space for data, and C supports that. So while it’s not abstractly illogical for the code space to be split into address spaces, the idea of applying the same address spaces to both data and function pointers isn’t totally aligned with C.
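To illustrate the first point, here is a small C sketch (not from the thread) of the function-pointer/data-pointer round trip that dlsym callers depend on. All names in it (int_fn, add_one, function_to_object, object_to_function, round_trip_call) are hypothetical. It copies the pointer representation with memcpy instead of casting, and it refuses to convert when the two pointer kinds differ in size, as they could on a Harvard-style target. Calling the recovered pointer is only meaningful on implementations that give both kinds the same representation, which is the extension POSIX requires.

```c
#include <string.h>

/* Hypothetical function-pointer type used for the demonstration. */
typedef int (*int_fn)(int);

static int add_one(int x) { return x + 1; }

/* Store a function pointer's bytes in an object pointer.
 * Returns 0 on success, -1 if the two pointer kinds differ in size. */
static int function_to_object(int_fn fn, void **out) {
    if (sizeof fn != sizeof *out)
        return -1;
    memcpy(out, &fn, sizeof *out);
    return 0;
}

/* Recover a function pointer from an object pointer's bytes.
 * Returns 0 on success, -1 if the two pointer kinds differ in size. */
static int object_to_function(void *obj, int_fn *out) {
    if (sizeof obj != sizeof *out)
        return -1;
    memcpy(out, &obj, sizeof *out);
    return 0;
}

/* Round-trip add_one through a void * and call the result.
 * Returns -1 if the conversion is not possible on this target. */
static int round_trip_call(int x) {
    void *as_data;
    int_fn back;
    if (function_to_object(add_one, &as_data) != 0)
        return -1;
    if (object_to_function(as_data, &back) != 0)
        return -1;
    return back(x);
}
```

On a typical POSIX system, round_trip_call(41) goes through both conversions and then calls add_one. The size check is the part that reflects the argument above: nothing in standard C promises that the check passes.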
The second is just that the traditional hardware-supported uses of address spaces are centered around data, not code. A processor might have different instructions to load/store from different address spaces, and handling those different address spaces correctly and efficiently is important to expose in a systems programming language. While processors often also have different instructions for e.g. near/far calls, this is rarely useful to expose in the programming language, since it usually only comes up at e.g. the userspace/kernelspace boundary, which is so important and fiddly that it’s usually written in assembly anyway.
I think we could allow functions to be in a non-default address space. The issue is that there may be many places in clang or LLVM that assume functions are in the default address space, and those would need to be fixed.
Yes, I think LLVM might currently assume a single address space for all code. I would want to see signs of progress on that front before agreeing to an extension of address spaces to function pointer types.
John.