Adding clang extension support for BPF CO-RE relocation support

This is related to the bpf CO-RE (compile once, run everywhere) feature.
The feature has been implemented and merged in Clang. But per John's
suggestion, it is still good to give explicit reasoning for why this
Clang extension should be implemented. This can serve as a reference
point if people in the future want to understand the reasoning or
touch the implementation and need to discuss it. The format below
follows the suggestion in https://clang.llvm.org/get_involved.html.

Thank you for putting this information together and getting an RFC out!

Evidence of a significant user community

The CO-RE feature addresses the problem of running the same bpf
program across different kernel versions. Kernel internal data
structures may change between kernel versions, so a bpf program
targeting one kernel's internal data structures often won't work
on another kernel.

Before CO-RE, the general approach was bcc ([1]), where
the bpf program is recompiled for *each* kernel. This
incurs large binary size and significant run-time cost,
and won't work in many environments (embedded systems, containers with
limited resources, etc.).

CO-RE was proposed to address the above issue. The initial CO-RE patch
permitted LLVM to generate relocations for struct/union member field accesses
and array indexing. Later, as more use cases came up, CO-RE was
enhanced with relocations for field/type existence, type size, bitfield
handling, enum values, etc. CO-RE permits the bpf program to be compiled once;
the ELF binary is then processed by the host bpfloader or host kernel, which
adjusts kernel data structure accesses in the code based on the
relocation information.
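
To make the load-time adjustment described above concrete, here is a minimal
sketch in plain C. It is a simplified model, not the actual bpfloader or
kernel implementation: the names field_layout, field_reloc and
apply_field_relocs are hypothetical, and fields are matched by name only,
whereas the real mechanism works from type information recorded with the
relocations.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one field in a struct layout. */
struct field_layout {
    const char *name;
    size_t offset;
};

/* Hypothetical relocation record: slot `insn_idx` holds the offset of
 * field `field` as it was computed at compile time. */
struct field_reloc {
    size_t insn_idx;
    const char *field;
};

/* Look up a field by name in the host layout; NULL if it is absent. */
static const struct field_layout *
find_field(const struct field_layout *layout, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(layout[i].name, name) == 0)
            return &layout[i];
    return NULL;
}

/* Patch every recorded offset so it matches the host layout.
 * Returns 0 on success, -1 if a field cannot be resolved. */
static int apply_field_relocs(size_t *insn_off,
                              const struct field_reloc *relocs, size_t n_relocs,
                              const struct field_layout *host, size_t n_host)
{
    for (size_t i = 0; i < n_relocs; i++) {
        const struct field_layout *f =
            find_field(host, n_host, relocs[i].field);
        if (!f)
            return -1;
        insn_off[relocs[i].insn_idx] = f->offset;
    }
    return 0;
}
```

The point of the sketch is only that the compiled program carries symbolic
references to fields, so the offsets can be rewritten for whichever kernel
the program is loaded on.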

CO-RE has been implemented in LLVM and the kernel ([2]), and currently
the feature is used by virtually every bpf developer. A few public
posts ([3], [4], [5], [6] and [7]) are added here for reference.

A specific need to reside within the Clang tree

The CO-RE related builtins and attributes are processed by the clang
frontend. The relocation information is preserved in IR, and eventually
the relocation is generated by the bpf target. CO-RE features are thus
an integral part of the compiler, so it is best to keep the feature
within the Clang tree.

A specification

The CO-RE feature introduced a few clang extensions, including:
  . __builtin_preserve_access_index (initially added, [8])
  . __builtin_preserve_field_info (added later, [9])
  . __attribute__((preserve_access_index)) (added later, [10])
  . __builtin_preserve_type_info (added later, [11])
  . __builtin_preserve_enum_value (added later, [11])

All the above builtins and attributes are used to record relocations.
The following is the detailed specification:
  . __builtin_preserve_access_index
    defined as
      type __builtin_preserve_access_index(type arg)
    Any record member and array index accesses in the argument will
    have relocations generated.
  . __builtin_preserve_field_info
    defined as
      uint32_t __builtin_preserve_field_info(field_access, flag);
    Depending on flag, the relocation is generated for (1) field size,
    (2) whether the field exists or not, (3) field signedness, or (4) certain
    bitfield info. Specifically for the field existence case, if the field does
    not exist in the actual host, the bpfloader will resolve the above builtin
    to return 0, indicating that the field doesn't exist, so the bpf verifier
    will skip this branch.
  . __builtin_preserve_type_info
    defined as
      uint32_t __builtin_preserve_type_info(*(<type> *)0, flag);
    Record a relocation for whether the "type" exists or not, or the "type" size,
    depending on the "flag".
  . __builtin_preserve_enum_value
    defined as
      uint64_t __builtin_preserve_enum_value(*(<enum_type> *)<enum_value>, flag);
    Record a relocation for whether the "enum_value" (represented as an enum name)
    exists or not, or the enum value for the enum name, depending on "flag".
  . __attribute__((preserve_access_index))
    Currently this attribute can be applied to a record. If a record has this
    attribute, then any field access for this record will generate a relocation.
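
As an illustration of the list above, here is a minimal sketch of how the
attribute and two of the builtins might be used. The struct names, macro
names and helper functions are hypothetical. The flag value 2 for field
existence is an assumption taken from libbpf's BPF_FIELD_EXISTS and is not
stated in this post. The builtins exist only on the BPF target (where clang
may also require -g), so the sketch falls back to plain accesses elsewhere to
stay compilable on a host.

```c
#include <stdint.h>

#if defined(__bpf__)
/* BPF target: each use records a CO-RE relocation. */
#define CORE_ATTR __attribute__((preserve_access_index))
#define CORE_FIELD_PTR(p, f)    __builtin_preserve_access_index(&(p)->f)
/* 2 is assumed to be the "field exists" flag (libbpf's BPF_FIELD_EXISTS). */
#define CORE_FIELD_EXISTS(p, f) __builtin_preserve_field_info((p)->f, 2)
#else
/* Host fallback: no relocations, every field is taken to exist. */
#define CORE_ATTR
#define CORE_FIELD_PTR(p, f)    (&(p)->f)
#define CORE_FIELD_EXISTS(p, f) 1
#endif

/* Hypothetical record accessed through the builtins. */
struct demo_task {
    int pid;
    int tgid;
};

/* Hypothetical record carrying the attribute: on the BPF target every
 * field access on it is relocated without any explicit builtin call. */
struct demo_cred {
    uint32_t uid;
} CORE_ATTR;

/* Read tgid only if the field exists in the host's version of the record. */
static int demo_tgid(const struct demo_task *t)
{
    if (!CORE_FIELD_EXISTS(t, tgid))
        return -1;
    return *CORE_FIELD_PTR(t, tgid);
}

/* Plain field access; relocated on BPF because of the attribute. */
static uint32_t demo_uid(const struct demo_cred *c)
{
    return c->uid;
}
```

In a real bpf program the records would be kernel types and the reads would
typically go through a helper; the sketch only shows where each extension
appears in the source.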

The above builtins and the attribute are the backbone of the CO-RE feature.

Please refer to [3] for more detailed explanation.

Representation within the appropriate governing organization

N/A

What organization determines things like the specification you posted
above, or is there no organization behind these efforts?

A long-term support plan

The feature will be supported for ever.

This is not a long term support plan. :slight_smile: When we have support needs
in the future, will there be people/a company/a community available to
do that work or is the expectation that once this lands, the Clang
community is responsible for it? (This matters with the above question
about the organization responsible for governing the specification --
can the Clang community do as they please here or do we need to
coordinate with others?)

A high-quality implementation

All the above builtins and attributes were reviewed properly before merging.

Are there other implementations of CO-RE in the wild that we should be
measuring against?

A test suite

All new features are accompanied by the necessary test cases.

Are there any external ways we can verify the functionality? (A
conformance suite, some other implementation we can test against, etc)

~Aaron

>
> This is related to the bpf CO-RE (compile once, run everywhere) feature.
> The feature has been implemented and merged in Clang. But per John's
> suggestion, it is still good to give explicit reasoning for why this
> Clang extension should be implemented. This can serve as a reference
> point if people in the future want to understand the reasoning or
> touch the implementation and need to discuss it. The format below
> follows the suggestion in https://clang.llvm.org/get_involved.html.

Thank you for putting this information together and getting an RFC out!

You are welcome.

> [...]
>
> Representation within the appropriate governing organization
> ============================================================
>
> N/A

What organization determines things like the specification you posted
above, or is there no organization behind these efforts?

The BPF foundation (https://ebpf.io/foundation/) is responsible for
the specification. General discussion (specification, loader
implementation, etc.) also happens on the mailing list
https://lore.kernel.org/bpf/.

> A long-term support plan
> ========================
>
> The feature will be supported for ever.

This is not a long term support plan. :slight_smile: When we have support needs
in the future, will there be people/a company/a community available to
do that work or is the expectation that once this lands, the Clang
community is responsible for it? (This matters with the above question
about the organization responsible for governing the specification --
can the Clang community do as they please here or do we need to
coordinate with others?)

The best place is probably the mailing list
https://lore.kernel.org/bpf/.
People can also reach out to the bpf foundation for guidance/direction.

> A high-quality implementation
> =============================
>
> All the above builtin's and attributes are reviewed properly before merging.

Are there other implementations of CO-RE in the wild that we should be
measuring against?

No. Currently CO-RE is only available in clang.

> A test suite
> ============
>
> All new features are accompanied with necessary test cases.

Are there any external ways we can verify the functionality? (A
conformance suite, some other implementation we can test against, etc)

Yes, the kernel bpf selftests contain extensive testing for CO-RE.
The following link explains how to run the selftests:
  https://github.com/torvalds/linux/blob/master/Documentation/bpf/bpf_devel_QA.rst
In the prog_tests directory
(https://github.com/torvalds/linux/tree/master/tools/testing/selftests/bpf/prog_tests),
we have the following CO-RE tests:
  core_autosize.c core_extern.c core_kern.c core_read_macros.c
  core_reloc.c core_retro.c

We also have a buildbot (based on a github workflow) that automatically pulls
in the latest nightly llvm build
and tests it with various versions of the bpf loader and the kernel.

This is related to the bpf CO-RE (compile once, run everywhere) feature.
The feature has been implemented and merged in Clang. But per John's
suggestion, it is still good to give explicit reasoning for why this
Clang extension should be implemented. This can serve as a reference
point if people in the future want to understand the reasoning or
touch the implementation and need to discuss it. The format below
follows the suggestion in https://clang.llvm.org/get_involved.html.

Thanks for making this post.

Evidence of a significant user community

[...]

Thanks, this is very helpful.

A specification

The CO-RE feature introduced a few clang extensions, including:
  . __builtin_preserve_access_index (initially added, [8])
  . __builtin_preserve_field_info (added later, [9])
  . __attribute__((preserve_access_index)) (added later, [10])
  . __builtin_preserve_type_info (added later, [11])
  . __builtin_preserve_enum_value (added later, [11])

Well, I wish we’d gone through this process earlier, because I really
don’t love a lot of the language design decisions here, starting from
the apparent claiming of “preserve” as the language feature prefix.
But I guess what’s done is done.

Representation within the appropriate governing organization

I’ll take any comments about these sections over to Aaron’s
part of the thread.

John.

Representation within the appropriate governing organization

N/A

What organization determines things like the specification you posted
above, or is there no organization behind these efforts?

The BPF foundation (https://ebpf.io/foundation/) is responsible for
the specification. The general discussion (specification, loader
implementation, etc.)
also happened in mailing list https://lore.kernel.org/bpf/.

Okay, thanks, this is what we’re looking for here.

A long-term support plan

The feature will be supported for ever.

This is not a long term support plan. :slight_smile: When we have support needs
in the future, will there be people/a company/a community available to
do that work or is the expectation that once this lands, the Clang
community is responsible for it? (This matters with the above question
about the organization responsible for governing the specification --
can the Clang community do as they please here or do we need to
coordinate with others?)

The best place probably is to engage with the mailing list
https://lore.kernel.org/bpf/.
People can also reach the bpf foundation for guidance/direction.

Okay. Just so we know, I assume you’re part of the BPF community?
Are you doing this by yourself, or is there a larger institution
supporting your efforts (for example, a company or university)?

I’m sorry to pry, but we actually don’t have any information about
you except a gmail address. :slight_smile:

A high-quality implementation

All the above builtin's and attributes are reviewed properly before merging.

Are there other implementations of CO-RE in the wild that we should be
measuring against?

No. Currently CO-RE is only available in clang.

Okay. That’s fine.

A test suite

All new features are accompanied with necessary test cases.

Are there any external ways we can verify the functionality? (A
conformance suite, some other implementation we can test against, etc)

Yes, the kernel bpf selftests contain extensive testing for CO-RE.
The following link explains how to run the selftests:
  https://github.com/torvalds/linux/blob/master/Documentation/bpf/bpf_devel_QA.rst
In the prog_tests directory
(https://github.com/torvalds/linux/tree/master/tools/testing/selftests/bpf/prog_tests),
we have the following CO-RE tests:
  core_autosize.c core_extern.c core_kern.c core_read_macros.c
  core_reloc.c core_retro.c

We also have a buildbot (based on a github workflow) that automatically pulls
in the latest nightly llvm build
and tests it with various versions of the bpf loader and the kernel.

Great, thanks.

John.

>>> Representation within the appropriate governing organization
>>> ============================================================
>>>
>>> N/A
>>
>> What organization determines things like the specification you posted
>> above, or is there no organization behind these efforts?
>
> The BPF foundation (https://ebpf.io/foundation/) is responsible for
> the specification. The general discussion (specification, loader
> implementation, etc.)
> also happened in mailing list https://lore.kernel.org/bpf/.

Okay, thanks, this is what we’re looking for here.

>>> A long-term support plan
>>> ========================
>>>
>>> The feature will be supported for ever.
>>
>> This is not a long term support plan. :slight_smile: When we have support needs
>> in the future, will there be people/a company/a community available
>> to
>> do that work or is the expectation that once this lands, the Clang
>> community is responsible for it? (This matters with the above
>> question
>> about the organization responsible for governing the specification --
>> can the Clang community do as they please here or do we need to
>> coordinate with others?)
>
> The best place probably is to engage with the mailing list
> https://lore.kernel.org/bpf/.
> People can also reach the bpf foundation for guidance/direction.

Okay. Just so we know, I assume you’re part of the BPF community?

Yes, I am part of the BPF community, and also part of the
LLVM community, as I maintain the clang/llvm BPF backend with
Alexei Starovoitov.

Are you doing this by yourself, or is there a larger institution
supporting your efforts (for example, a company or university)?

I am a software engineer at Meta (formerly Facebook). I am on
the linux kernel team. Supporting the bpf community is part of my job,
which benefits Meta too.

I’m sorry to pry, but we actually don’t have any information about
you except a gmail address. :slight_smile:

No problem! If the community has any questions regarding
BPF/BTF support in clang/llvm, they can reach me, Alexei, or the
bpf mailing list and we are glad to help.

[...]

>
> >>> Representation within the appropriate governing organization
> >>> ============================================================
> >>>
> >>> N/A
> >>
> >> What organization determines things like the specification you posted
> >> above, or is there no organization behind these efforts?
> >
> > The BPF foundation (https://ebpf.io/foundation/) is responsible for
> > the specification. The general discussion (specification, loader
> > implementation, etc.)
> > also happened in mailing list https://lore.kernel.org/bpf/.
>
> Okay, thanks, this is what we’re looking for here.
>
> >>> A long-term support plan
> >>> ========================
> >>>
> >>> The feature will be supported for ever.
> >>
> >> This is not a long term support plan. :slight_smile: When we have support needs
> >> in the future, will there be people/a company/a community available
> >> to
> >> do that work or is the expectation that once this lands, the Clang
> >> community is responsible for it? (This matters with the above
> >> question
> >> about the organization responsible for governing the specification --
> >> can the Clang community do as they please here or do we need to
> >> coordinate with others?)
> >
> > The best place probably is to engage with the mailing list
> > https://lore.kernel.org/bpf/.
> > People can also reach the bpf foundation for guidance/direction.
>
> Okay. Just so we know, I assume you’re part of the BPF community?

Yes, I am part of the BPF community, and also part of the
LLVM community, as I maintain the clang/llvm BPF backend with
Alexei Starovoitov.

Thank you, that's good information!

> Are you doing this by yourself, or is there a larger institution
> supporting your efforts (for example, a company or university)?

I am a software engineer at Meta (formerly Facebook). I am on
the linux kernel team. Supporting the bpf community is part of my job,
which benefits Meta too.

Thank you for this as well!

> I’m sorry to pry, but we actually don’t have any information about
> you except a gmail address. :slight_smile:

No problem! If the community has any questions regarding
BPF/BTF support in clang/llvm, they can reach me, Alexei, or the
bpf mailing list and we are glad to help.

Thanks! FWIW, the fact that this is both backed by the Linux kernel
team and has some corporate support from Meta makes me more
comfortable that long-term support isn't likely to cause an undue
burden for us.

~Aaron

Completely agreed.

John.