[RFC] Identify Func Signature Change in LLVM Compiled Kernel Image

If the problem is something outside the module needs the calling convention to stay the same, stop marking it static and/or leak the function address, compiler.used might do it.

This will create additional functions in the binary. In this particular case, we only want to record the new signatures (if different from old ones) and do not want to create additional actual functions.

If the problem is dwarf doesn’t reflect a change to the static function and otherwise you’re happy with the change, it’s a bug in llvm.

I am certainly okay to replace the old signature with the new signature. But see above, this may cause different behaviors in lldb.

Previously lldb will show func signature e.g. foo(a, b) from the source code. With new signature, lldb may show func signature like foo(a) which user may confuse since it does not match the source level one. There are some difference as well e.g., original func is foo(a, b) but new change may have foo.llvm.(a).

Another more complicated case. For example, foo(struct a *) in the source may become foo(int t1, int t2). User will definitely confuse what foo(int t1, int t2) mean in lldb since the source level func is foo(struct a *).

@OCHyams has other concerns as well in lldb (see above). @OCHyams what do you think of @JonChesterfield’s suggestion?

BTW, I just created a new pull request ( [RFC][LLVM] Emit dwarf data for changed-signature and new functions by yonghong-song · Pull Request #165310 · llvm/llvm-project · GitHub ) which does not impact lldb and the additional dwarf info should not impact existing tools functionality.

Okay, maybe we can still do

If the problem is dwarf doesn’t reflect a change to the static function and otherwise you’re happy with the change, it’s a bug in llvm.

?

Initially we can have a flag (e.g. -llvm enable-changed-func-dbinfo)? The default will be off. Later on, we may add an official clang level flag to control this?

Yes I agree, but -

I don’t think it’s quite right to call it a bug in LLVM. As discussed above we have the DWARF attribute DW_AT_calling_convention (DW_CC_nocall), which LLVM correctly applies in this case, which indicates to debuggers the function “does not obey standard calling conventions” (i.e., please don’t call it from the debugger).

IMO it’s more that @yonghong-song is looking to introduce a new feature that goes further, describing both the old and new signature and their relationship. The distinction is probably not important for the sentiment of what Jon is saying though.

Initially we can have a flag (e.g. -llvm enable-changed-func-dbinfo)? The default will be off. Later on, we may add an official clang level flag to control this?

Putting the feature behind a flag alleviates a lot of my concerns, as then the change won’t automatically affect all DWARF consumers, and people can try it out locally.

Which version of this are you currently looking to implement?

I wonder if it could be worth discussing your approach on the DWARF-discuss mailing list?

Which version of this are you currently looking to implement?

This version [RFC][LLVM] Emit dwarf data for changed-signature and new functions by yonghong-song · Pull Request #165310 · llvm/llvm-project · GitHub is what I am focusing on. That one will have zero impact on debugger and our gcc colleague David Faust and Jose Marchesi both okay with this version. (see https://github.com/llvm/llvm-project/pull/157349#issuecomment-3458444322) We will go this version as it won’t impact existing dwarf for both llvm and gcc side. We can evolve if we get and agree on a better approach.

I will update [RFC][LLVM] Emit dwarf data for changed-signature and new functions by yonghong-song · Pull Request #165310 · llvm/llvm-project · GitHub where the new dwarf emission will be off by default. I may add additional things during my llvm<->linux_kernel integration.

I wonder if it could be worth discussing your approach on the DWARF-discuss mailing list?

We can do this. Any pointer’s? From https://github.com/llvm/llvm-project/pull/157349#issuecomment-3458444322, it appears dwarf didn’t recommend any format yet for real signatures. If we can get a consensus, that will be great.

1 Like

I think I follow. My understanding is:

  1. We have static functions that are optimised by changing the calling convention
  2. We record that the calling convention is different but not how to call it
  3. In a debugger, this makes branching to that function difficult
  4. We don’t want a copy of the old function kept only for debugging

That suggests a few lines of attack. One is we could extend dwarf to describe what we’ve done.

Another is to keep a stub around with the original calling convention which jumps to the specialised one. That’ll cost a few bytes of machine code but is otherwise dead.

A third is to write new debug info describing the function as if it was written like that in the first place. I.e. change the debug info to reflect where we ended up, not where we started from.

I suspect the proper thing to do is option 3. Make the debug info right. I don’t personally know how to do that so would leave a stub function in the binary. The dwarf folks might have more ideas, worth reaching out.