[RFC] Identify Func Signature Change in LLVM Compiled Kernel Image

Motivation

BPF is a Linux kernel technology that allows custom code to run
in the kernel. In particular, BPF can trace kernel functions.
For most practical kernel func tracing, the bpf prog needs to inspect
arguments in order to implement the required functionality. Two
examples follow.

Example 1: func tracing with kprobe. The source is
linux/tools/testing/selftests/bpf/progs/loop6.c at master · torvalds/linux · GitHub
The relevant portion of the bpf code is:
SEC("kprobe/virtqueue_add_sgs")
int BPF_KPROBE(trace_virtqueue_add_sgs, void *unused, struct scatterlist **sgs,
	       unsigned int out_sgs, unsigned int in_sgs)
{
	struct scatterlist *sgp = NULL;
	__u64 length1 = 0, length2 = 0;
	unsigned int i, n, len;

	if (config != 0)
		return 0;

	for (i = 0; (i < VIRTIO_MAX_SGS) && (i < out_sgs); i++) {
		__sink(out_sgs);
		...

}
BPF_KPROBE is a macro that takes the prog name and its arguments.
The to-be-traced function is virtqueue_add_sgs(). For kernel tracing,
the current common practice is that if the func name is unchanged, the bpf
prog uses the original func signature, as in the above example. In
this case, since function virtqueue_add_sgs() exists in the kernel's
/proc/kallsyms, the original signature is used.

Example 2: func tracing with fentry. The source is
linux/tools/testing/selftests/bpf/progs/test_vmlinux.c at master · torvalds/linux · GitHub
The relevant portion of the bpf code is:
SEC("fentry/hrtimer_start_range_ns")
int BPF_PROG(handle__fentry, struct hrtimer *timer, ktime_t tim, u64 delta_ns,
	     const enum hrtimer_mode mode)
{
	if (tim == MY_TV_NSEC)
		fentry_called = true;
	return 0;
}
Similar to the above BPF_KPROBE example, BPF_PROG is a macro that takes the bpf prog
name and parameters. The to-be-traced function is hrtimer_start_range_ns().
Since the function name is unchanged in /proc/kallsyms, the original func
signature is used.
$ cat /proc/kallsyms | grep hrtimer_start_range_ns
ffffffff8134fd90 T __pfx_hrtimer_start_range_ns
ffffffff8134fda0 T hrtimer_start_range_ns
ffffffff84dec2c8 r __ksymtab_hrtimer_start_range_ns
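
Note that grep matches substrings, so only the second line of that output is the function itself. A tool doing this check would want an exact match on the symbol name. Below is a minimal sketch of such a check in user-space C. The helper name kallsyms_line_is_func is made up for illustration, and the sketch assumes each line has the "<addr> <type> <name>" layout shown above.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return true if one /proc/kallsyms-style line
 * ("<addr> <type> <name> ...") names exactly `func` as a text symbol
 * (type 't' or 'T'). */
static bool kallsyms_line_is_func(const char *line, const char *func)
{
	char type;
	char name[256];

	/* Skip the address, then read the one-letter type and the symbol name. */
	if (sscanf(line, "%*s %c %255s", &type, name) != 2)
		return false;
	if (type != 't' && type != 'T')
		return false;
	/* Exact comparison, so __pfx_ and __ksymtab_ entries do not match. */
	return strcmp(name, func) == 0;
}
```

Feeding it the three lines above with "hrtimer_start_range_ns" as the target accepts only the middle one. As the rest of this post shows, such a match only says the name exists, not that the signature is intact.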

But with clang, a function signature can be changed during optimization
while the function name stays the same. Specifically, I found that the
DeadArgumentElimination and ArgumentPromotion passes may change a function
signature without changing the func name. For example, I compiled
kernel/bpf/syscall.c with the additional option '-mllvm -debug-only=deadargelim',
which prints debug information for the DeadArgumentElimination pass,
and found three function signature changes:

DeadArgumentEliminationPass - Removing argument 1 (uattr.coerce0) from map_update_elem
DeadArgumentEliminationPass - Removing argument 1 (uattr.coerce0) from map_delete_elem
DeadArgumentEliminationPass - Removing argument 3 (count) from strncpy_from_bpfptr

The following are IR before and after DeadArgumentEliminationPass:

Before:

define internal fastcc i32 @map_update_elem(ptr noundef %attr, ptr nocapture readnone %uattr.coerce0, i8 %uattr.coerce1)
define internal fastcc i32 @map_delete_elem(ptr noundef %attr, ptr nocapture readnone %uattr.coerce0, i8 %uattr.coerce1)
define internal fastcc i64 @strncpy_from_bpfptr(ptr noundef %dst, ptr %src.coerce0, i8 %src.coerce1, i64 noundef %count)

After:

define internal fastcc i32 @map_update_elem(ptr noundef %attr, i8 %uattr.coerce1)
define internal fastcc i32 @map_delete_elem(ptr noundef %attr, i8 %uattr.coerce1)
define internal fastcc i64 @strncpy_from_bpfptr(ptr noundef %dst, ptr %src.coerce0, i8 %src.coerce1)

$ nm kernel/bpf/syscall.o | grep map_update_elem
000000000000b6b0 t map_update_elem
$ nm kernel/bpf/syscall.o | grep map_delete_elem
000000000000bd00 t map_delete_elem
$ nm kernel/bpf/syscall.o | grep strncpy_from_bpfptr
0000000000015d90 t strncpy_from_bpfptr

You can see that the function names above remain the same in the symbol table, but
the number of arguments has actually changed.
In such cases, using the original func signatures will produce incorrect results.
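
To make the failure mode concrete, here is a minimal user-space sketch (not kernel or BPF code). It assumes the tracer fetches arguments by position from argument registers, as kprobe-style tracing does. The register file is simulated with an array, and the helper names, the STALE marker and the values are invented for illustration. It models map_update_elem after argument 1 (uattr.coerce0) was removed.

```c
#include <stdint.h>

#define NUM_ARG_REGS 6
#define STALE 0xdeadbeefULL  /* marks a register the caller never wrote */

/* Simulate the caller of the optimized function: after argument 1
 * (uattr.coerce0) was removed, only two values are passed. */
static void call_optimized(uint64_t regs[NUM_ARG_REGS],
			   uint64_t attr, uint64_t coerce1)
{
	for (int i = 0; i < NUM_ARG_REGS; i++)
		regs[i] = STALE;
	regs[0] = attr;
	regs[1] = coerce1;
}

/* A tracer written against the original three-argument signature
 * looks for uattr.coerce1 in the third slot. */
static uint64_t coerce1_per_original_signature(const uint64_t regs[NUM_ARG_REGS])
{
	return regs[2];
}

/* A tracer that knows the post-optimization signature reads the second slot. */
static uint64_t coerce1_per_actual_signature(const uint64_t regs[NUM_ARG_REGS])
{
	return regs[1];
}
```

After call_optimized(regs, 0x1000, 1), the original-signature reader returns the stale marker while the actual-signature reader returns 1. Nothing fails loudly. The tracer simply sees a wrong value, which is why the problem is hard to diagnose.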

How do existing users deal with this? Users will first use the original signature and
find that the result is not what they expected. This can cost developers quite some
time, since they initially won’t realize it is a signature change issue and may
try many different ways to debug their code. Eventually they will find that the root
cause is the signature change, and they will then use llvm-objdump to inspect the binary
and work out the modified signature.

gcc has various suffixes like .constprop., .part., .isra. to indicate that a
func signature has changed, so from the very beginning users know the
function has a signature change and can go directly to the inspect-binary stage.

So the eventual goal is to find an easy way to identify whether a func signature
has changed in a clang-generated binary.

Proposal

Although we eventually wish to have the changed func signature available in the binary,
the immediate request is to get parity with gcc: a clear and easy way to
identify that the func signature has changed for a particular function.

There are two possible solutions here.

Proposal 1

One is to add suffixes similar to gcc. gcc has suffixes like .constprop, .part, .isra,
etc. to let users know that compiler optimization has changed the function signature.

There are some concerns that adding suffixes may impact some llvm functionality.
I have checked that SamplePGO won’t be impacted by suffixes from DeadArgumentElimination
and ArgumentPromotion. Memprof is similar to SamplePGO, so it won’t be impacted
by suffixes either.

Snehasish Kumar suggested having a broader discussion, since suffixes may
impact some llvm functionality, including:

  • when the symbol name changes
    clang backend e.g. -funique-internal-linkage-names
    LLVM IR - function specialization
    ThinLTO promotion
    Backend - Propeller / FDO based function splitting
  • what do the suffixes imply (clones, modifications, parts etc)
    My opinion: these suffixes mean signature change (i.e. modifications).
  • how should debuggers and profiling tools treat the symbols
    My investigation: debuggers will try to find the suffixed func
    if the user uses ‘func’ in lldb.

Beyond my opinion and investigation above, it would be great
if the llvm community could help answer the above questions.

Note that llvm already has some suffixes, e.g. .specialized, .llvm.,
. (full LTO) etc.

Proposal 2

Another is to keep the original function name but add additional information to the
final binary so tools can easily check whether the function signature has changed.
In my opinion, dwarf is a good place to add such information, since dwarf is already
the standard place to hold various debug and transformation information.

If adding suffixes is hard in llvm, then maybe we can add an attribute in dwarf
for a particular subprogram to indicate that its signature has changed?

Some Old Discussions

[1] Function Signature Change Without Obvious Indication · Issue #104678 · llvm/llvm-project · GitHub
[2] [Transforms][IPO] Add func suffix in ArgumentPromotion and DeadArgume… by yonghong-song · Pull Request #105742 · llvm/llvm-project · GitHub
[3] [RFC][Transforms][IPO] Add func suffix in ArgumentPromotion and DeadArgumentElimination by yonghong-song · Pull Request #109899 · llvm/llvm-project · GitHub

I’d like to explore proposal 2. Proposal 1 doesn’t address the underlying issue - enabling BPF for functions with changed arguments; it just removes them from the picture. The user experience would be arbitrary: if they are unlucky enough to want to trace a function with such changes, tough luck. Release-to-release changes in either kernel, compiler, or both would also result in arbitrary changes to what is traceable[1].

I was curious here if compiling the kernel with those certain passes disabled was an option. I think I understand the kernel deployment scenario you describe, and I want to dig a bit deeper on some details. Specifically, even if today the suffixing patch were to land, for the feature to actually have value, various distros would need to be rebuilt and redeployed to users. I assume the way the distros are rebuilt could involve custom compilation flags, correct? I.e. is there any reason proposal 1 needs to be enabled by default? Also, technically, “different compilation flags” could mean “disable arg promo / dead arg elim” (not pushing for that, just trying to understand what’s missing in this picture). In a sense, though, disabling those passes would give good user experience - any function would be traceable, no arbitrary surprises. Would there be any other negatives?

But excluding having a -fdo-not-change-args hypothetical flag, and going back to the initial 2 alternatives; since there is a gap between right now and when the feature would actually make a difference to users, how off the table would be making changes to BPF infra? What I have in mind is: why not do proposal 2, but in stages:

  • stage 1: have argument-changing passes add function-level metadata (maybe start with it being a bool: args_changed) and create a section in AsmPrinter (or mark in dwarf, like you suggested). Couple this change with the change in BPF to check for that info.
  • stage 2: change the function-level metadata to capture the new signature, propagate that through the final binary; couple with change in BPF to make that new signature available to users.

Stage 1 should be very easy to do in LLVM, and should be relatively easy, I speculate, to do in BPF - it’s just a binary decision, doing the same thing the name change would’ve. Stage 2 is more involved, clearly.

This does require users pick a new BPF drop (or whatever the delivery mechanism is) - this, besides a rebuild of the kernel, but, like I was saying earlier, that’s a sunk cost in all cases.

wdyt?

[1] I don’t know what / how gcc addresses these, but perhaps that’s not as important as this being an opportunity, I think, to do better by the user.

Thanks @mtrofin. Let me try to answer your questions above:

First, on distros building the kernel: typically they will use the default compilation flags in the kernel, but a distro can certainly add more flags as it wishes. But I worry that disabling a particular optimization in the kernel build will meet some resistance from distros, and it may actually hurt performance.

Currently, what we want first is a clear signal that the func signature has changed. A lot of bpf tracing happens on a host with a particular kernel, e.g. with bpftrace (GitHub - bpftrace/bpftrace: High-level tracing language for Linux) and bcc (GitHub - iovisor/bcc: BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more). If the tool knows about a signature change, it can signal this to the user, who can then inspect dwarf/asm etc. to find out more. Although I prefer a suffix, which gives a clear signal, dwarf info is okay too. So I agree with your 2-stage proposal.

For ‘stage 1’, suppose dwarf has an attribute ‘args_changed’, where ‘args_changed = true’ means the signature changed. This should be enough for bpf-related tools to detect the change and signal it to the user.
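
As a rough sketch of how a tracing tool might consume such a flag: everything below is hypothetical, including the func_record struct, the attach_advice enum and the advise helper. No ‘args_changed’ dwarf attribute exists today, so the sketch assumes the tool has already extracted one boolean per function from the debug info by some means.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-function record built from debug info. `args_changed`
 * stands in for the proposed stage-1 flag; it is not an existing attribute. */
struct func_record {
	const char *name;
	bool args_changed;
};

enum attach_advice {
	ATTACH_OK,           /* source-level signature can be used as-is */
	ATTACH_WARN_CHANGED, /* flag set: tell the user to inspect dwarf/asm */
	ATTACH_UNKNOWN,      /* function not present in the records */
};

/* Look up `func` and decide what the tool should tell the user. */
static enum attach_advice advise(const struct func_record *recs, size_t n,
				 const char *func)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(recs[i].name, func) == 0)
			return recs[i].args_changed ? ATTACH_WARN_CHANGED : ATTACH_OK;
	}
	return ATTACH_UNKNOWN;
}
```

This is the same binary decision a suffix check would make, only driven by debug info instead of the symbol name.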

And yes, ‘stage 2’, which actually captures the new signature and encodes it in the eventual binary (dwarf preferred), would be even better.

I still don’t fully understand what the issues with the suffix scheme are; gcc uses suffixes and seems to be fine. Are there any concrete blockers?

You listed various “issues” in the first post. While we can teach some tools about the suffixes, there will always be users and external tools that we are now confusing. So it always has a cost.
That said, the suffix is a boolean flag and it will never be more than that. Why settle for this, assuming the end goal is to encode the signature (under some conditions)?

The suffix approach is just meant to make a signature change more visible. But I understand the upstream concerns, and a suffix won’t achieve the end goal. Yes, the end goal is indeed to encode the signature (in all possible cases), and the encoding could be in dwarf. I will proceed with that.