support __attribute__((weak)) for non-primitive types

Hi,

This is a BPF use case. We would like to find out whether clang could
support __attribute__((weak)) for non-primitive types. For example, if
we have two definitions of a type, like
   struct t { int a; } __attribute__((weak));
   struct t { int a; };

   typedef unsigned __u32;
   typedef unsigned __u32 __attribute__((weak));
compilation should not fail; the weak definitions should simply be ignored.

In BPF, to support CO-RE (compile once and run everywhere), we
generate a *generic* vmlinux.h
(BPF Portability and CO-RE · BPF)
which is included in the bpf program. The bpf loader (libbpf) will
adjust field offsets properly based on host record layout so the same
program can run on different kernels which may have different record
layouts.

But vmlinux.h does not contain macros (or the simple static inline
helper functions found in kernel header files). Currently, users need
to define these macros and static inline functions explicitly in their
bpf program. What we are thinking is to permit users to include the
kernel header files containing these macros/static inline functions.
But this may introduce duplicated types, since these types are already
defined in vmlinux.h.

For example, we may have:
    socket.h:
        struct socket { ... };
        #define TCP 6
    vmlinux.h:
        struct socket { ... };
    bpf.c:
        #include <socket.h>
        #include <vmlinux.h>
        ... TCP ...

In this case, struct "socket" is defined twice. If we can have something like
    bpf.c:
      #pragma clang attribute push (__attribute__((weak)), apply_to = record)
      #include <socket.h>
      #pragma clang attribute pop
      #include <vmlinux.h>
        ... TCP ...

Then the bpf program can get the header file's macro definitions without
copying them explicitly.

We need support for "apply_to = typedef" in the above pragma, so we
can deal with typedef types as well.

Another issue is what happens if the two types are not compatible. For example,
    struct t { int a; } __attribute__((weak));
    struct t { int a; int b; };
I think we could have a flag to control whether incompatible weak types
are allowed or not.

It would be good if I could get some suggestions/directions on whether
such a feature is feasible for the clang frontend.

Thanks,

Yonghong

One concern I have is that __attribute__((weak)) originally came from
GCC and I don't believe GCC supports this functionality. Are you
floating this idea by them as well to see if they're on board with it?

As I understand it, this attribute is more about symbols that have
been exported and how the linker interacts with them. Does the same
concept apply here to type declarations? If not, that may suggest we
should use a separate attribute so as not to confuse the semantics.

~Aaron

> One concern I have is that __attribute__((weak)) originally came from
> GCC and I don't believe GCC supports this functionality. Are you
> floating this idea by them as well to see if they're on board with it?

I haven't talked to the gcc community about this. Since this is to be
implemented in clang, we would like to get opinions from the clang
community first.

Right, gcc doesn't support such functionality.

> As I understand it, this attribute is more about symbols that have
> been exported and how the linker interacts with them. Does the same
> concept apply here to type declarations? If not, that may suggest we

You are right here. Currently the weak attribute applies to global
variable and function declarations, and the attribute is mostly used
by the linker.

> should use a separate attribute so as not to confuse the semantics.

How about __attribute__((weaktype))?
It would clearly indicate this is a "weak" style attribute, but for types.

A similar name "weakref" with the same naming convention is used by gcc/clang
to indicate a weak reference declaration.

I think it would be preferable to come up with a new attribute rather than overloading `__attribute__((weak))` for this purpose.

The weak attribute doesn't seem to apply to functions in the way you seem to have in mind for aggregates:
  int __attribute__((weak)) f() { return 1;}
  int f() { return 2; } // error, redef

Right, I propose to use __attribute__((weaktype)) to indicate the
attribute is for types.
Please let me know if you have any other suggestions.

> Another issue is related to what if two types are not compatible. For example,
>     struct t { int a; } __attribute__((weak));
>     struct t { int a; int b; };
> I think we could have a flag to control whether we allow incompatible
> weak type or not.

What's the use case for allowing incompatible weak types? Having
multiple distinct types seems like it will make for confusion. e.g.,

struct foo { int a; };
struct __attribute__((whatever_we_name_it)) foo { float f; };

int func(struct foo *f) {
  return f->f + f->a;
}

Which lookup causes the error (or do they succeed somehow)?

> It would be good if I can get some suggestions/directions on whether
> such a feature is possible or not for clang frontend.

What happens if the only definition seen in a TU is marked as being weak?

In terms of incompatible weak types, are these incompatible?

struct base {
  int a;
};

struct foo : base {
  int b;
};

struct __attribute__((whatever_we_name_it)) foo {
  int a, b;
}

I'm not comfortable with the idea of having two different type
definitions in the same TU unless it's clear and consistent which one
"wins". If the type definitions are the same between the types, then I
don't see a problem because that limits the chances for surprises.
When the type definitions differ, it becomes harder to reason about
the way lookup works, how the type is laid out at runtime, what the
ABI implications are, etc. Is there a consistent rule for which type
is the one to be honored (e.g., only one definition is allowed to
elide the attribute and that's the canonical definition of the type
which always "wins")?

~Aaron


> What's the use case for allowing incompatible weak types? Having

In our use case, we are talking about incompatible types between
   . vmlinux.h, generated with a particular kernel config which tries to
     contain all fields so it can be used for different kernel builds
   . kernel header files of a particular kernel build, which may have
     fewer fields than vmlinux.h.

Below is a concrete example. In linux kernel internal header files,
many struct definitions depend on build time configuration. Take struct
"task_struct" in linux:include/linux/sched.h,
https://github.com/torvalds/linux/blob/master/include/linux/sched.h#L661-L1418
whether the field "numa_scan_seq" exists in the struct or not depends
on the config option CONFIG_NUMA_BALANCING.

It is entirely possible that the vmlinux.h "task_struct" structure has
the field "numa_scan_seq" while the host include/linux/sched.h
"task_struct" structure does not.

> multiple distinct types seems like it will make for confusion. e.g.,
>
> struct foo { int a; };
> struct __attribute__((whatever_we_name_it)) foo { float f; };
>
> int func(struct foo *f) {
>   return f->f + f->a;
> }
>
> Which lookup causes the error (or do they succeed somehow)?

In our use case, we intend to ensure struct field access *always*
comes from the final resolved structure. So, if the final resolved
type is from vmlinux.h, the field must exist in the vmlinux.h
structure; otherwise, we issue an error. The same applies if the
final resolved type is from a host kernel header file.

Therefore, in the above case, f->f is an error since the final
resolved type is "struct foo { int a; }" and field "f" does not
exist in struct foo.

> It would be good if I can get some suggestions/directions on whether
> such a feature is possible or not for clang frontend.

> What happens if the only definition seen in a TU is marked as being weak?

That weak definition will be promoted to the resolved type.

> In terms of incompatible weak types, are these incompatible?
>
> struct base {
>   int a;
> };
>
> struct foo : base {
>   int b;
> };
>
> struct __attribute__((whatever_we_name_it)) foo {
>   int a, b;
> }

Our use case is for C only.

> I'm not comfortable with the idea of having two different type
> definitions in the same TU unless it's clear and consistent which one
> "wins". If the type definitions are the same between the types, then I
> don't see a problem because that limits the chances for surprises.
> When the type definitions differ, it becomes harder to reason about
> the way lookup works, how the type is laid out at runtime, what the
> ABI implications are, etc. Is there a consistent rule for which type
> is the one to be honored (e.g., only one definition is allowed to
> elide the attribute and that's the canonical definition of the type
> which always "wins")?

All these are very good questions. To handle incompatible types, yes,
we need clear rules; otherwise it will cause confusion. In our use
case, the rules look like below:
  - For a particular named type, if the only definition carries the
    "weaktype" attribute, that definition becomes the resolved type.
  - If the only definition does not carry the "weaktype" attribute,
    that definition becomes the resolved type.
  - If there is one "weaktype" and one non-weaktype definition,
    the non-weaktype one becomes the resolved type.
  - More than one weaktype definition, or more than one non-weaktype
    definition, causes a compilation error.
  - Any field access is based on the resolved type.

Does this sound reasonable? Or am I still missing some cases?

>
> What's the use case for allowing incompatible weak types? Having

> In our use case, we are talking about incompatible types between
>    . vmlinux.h, generated with a particular kernel config which tries to
>      contains all fields so it can be used for different kernel builds
>    . kernel header files on a particular kernel build which may have
>      less fields than vmlinux.
>
> The below is a concrete example.
> In linux kernel internal header files, many struct definition depends on
> build time configuration. For example, struct "task_struct" in
> linux:include/linux/sched.h,
> https://github.com/torvalds/linux/blob/master/include/linux/sched.h#L661-L1418
> For example, whether field "numa_scan_seq" exists in the struct or not
> depends on
> config option CONFIG_NUMA_BALANCING.
>
> It is totally possible vmlinux.h "task_struct" structure has field
> "numa_scan_seq"
> while the host include/linux/sched.h "task_struct" structure does not have
> "numa_scan_seq" field.

Ah, thank you for the explanation!

> multiple distinct types seems like it will make for confusion. e.g.,
>
> struct foo { int a; };
> struct __attribute__((whatever_we_name_it)) foo { float f; };
>
> int func(struct foo *f) {
> return f->f + f->a;
> }
>
> Which lookup causes the error (or do they succeed somehow)?

> In our use case, we intend to ensure struct field access *always*
> coming from final structure.

Okay, so "last definition seen wins"?

I think it's a bit odd that the API designer can write a structure
definition and assume that it's the one the user will use, and then a
later definition can come along and silently override it. Say I have
the following:

// AwesomeAPI.h
struct Cool {
  int a;
  float f;
};

void DoAwesomeStuff(struct Cool *C);

// AwesomeAPI.c
#include "AwesomeAPI.h"
#include "string.h"

void DoAwesomeStuff(struct Cool *C) {
  C->a = 12;
  C->f = 1.2;
}

// LessAwesomeAPI.c
#include "AwesomeAPI.h"

struct __attribute__((whatever_we_name_this)) Cool {
  int a;
};

int main(void) {
  struct Cool C;
  DoAwesomeStuff(&C);
}

Assume these TUs are compiled separately and then linked together
later. Is there a buffer overflow in DoAwesomeStuff() because the Cool
object from LessAwesomeAPI.c is only sizeof(int) while the Cool object
expected within AwesomeAPI.c is sizeof(int) + sizeof(float)?

> So, if the final resolved type is in
> vmlinux.h, the field must be in vmlinux.h structure, otherwise,
> issue an error. The same if the final resolved type is in host
> kernel header file.
>
> Therefore, in the above case, f->f is an error since final
> resolved type is "struct foo { int a; }" and field "f" does not
> exist in struct foo.

>
> > It would be good if I can get some suggestions/directions on whether
> > such a feature is possible or not for clang frontend.
>
> What happens if the only definition seen in a TU is marked as being weak?

> That weak will be promoted to the resolved type.
>
> In terms of incompatible weak types, are these incompatible?
>
> struct base {
> int a;
> };
>
> struct foo : base {
> int b;
> };
>
> struct __attribute__((whatever_we_name_it)) foo {
> int a, b;
> }

> Our use case is for C only.

That's good to know. In C, are these incompatible?

struct foo {
  int a;
  float f;
};

struct __attribute__((whatever_we_name_this)) foo {
  struct {
    int a;
    float f;
  };
};

(We could ask the same question using anonymous unions, etc.)

> I'm not comfortable with the idea of having two different type
> definitions in the same TU unless it's clear and consistent which one
> "wins". If the type definitions are the same between the types, then I
> don't see a problem because that limits the chances for surprises.
> When the type definitions differ, it becomes harder to reason about
> the way lookup works, how the type is laid out at runtime, what the
> ABI implications are, etc. Is there a consistent rule for which type
> is the one to be honored (e.g., only one definition is allowed to
> elide the attribute and that's the canonical definition of the type
> which always "wins")?

> All these are very good questions. To handle incompatible types,
> yes, we need clear rules. Otherwise, it will cause confusion.
> In our use case, the rule looks like below:
>   - For a particular named type, if only one is defined with
>     "weaktype" attribute, that type becomes resolved type.
>   - If only one is defined without "weaktype" attribute,
>     that type becomes resolved type.
>   - If you have one "weaktype" and one non-weaktype types,
>     the non-weaktype becomes the resolved type.

Oh, this is different from my interpretation of "last one wins" above.
If so, you can reverse which definition the attribute is applied to
and have the same question about buffer overflows.

>   - more than one weaktype type or more than one non-weaktype
>     type will cause a compilation error.

Ah, so there will be exactly 0, 1, or 2 definitions?

>   - any field access will be based on resolve type.
>
> Does this sound reasonable? Or I could still miss some cases?

I apologize if I'm being dense, but I'm still trying to wrap my head
around how this works in practice. I'm worried about linking more than
one TU together where this resolution is different and getting ODR
violations as a result, getting weird situations where sizeof() the
type differs between TUs and causes issues, etc. I suppose one way to
dig into that is: if there were such a thing as a "type sanitizer"
(akin to UBSan but for types), what sort of functionality would you
expect it to need to help programmers detect bugs at runtime?

~Aaron

> >
> > What's the use case for allowing incompatible weak types? Having
>
> In our use case, we are talking about incompatible types between
> . vmlinux.h, generated with a particular kernel config which tries to
> contains all fields so it can be used for different kernel builds
> . kernel header files on a particular kernel build which may have
> less fields than vmlinux.
>
> The below is a concrete example.
> In linux kernel internal header files, many struct definition depends on
> build time configuration. For example, struct "task_struct" in
> linux:include/linux/sched.h,
> https://github.com/torvalds/linux/blob/master/include/linux/sched.h#L661-L1418
> For example, whether field "numa_scan_seq" exists in the struct or not
> depends on
> config option CONFIG_NUMA_BALANCING.
>
> It is totally possible vmlinux.h "task_struct" structure has field
> "numa_scan_seq"
> while the host include/linux/sched.h "task_struct" structure does not have
> "numa_scan_seq" field.

> Ah, thank you for the explanation!

After a little more thought, I think we will have vmlinux.h ahead of
socket.h to make it easy for the compiler to resolve weak types.

bpf.c:
     #include <vmlinux.h>
     #pragma clang attribute push (__attribute__((weak)), apply_to = any)
     #include <socket.h>
     #pragma clang attribute pop
       ... TCP ...

This way, any type in socket.h that is also in vmlinux.h will be
quickly discarded.

> > multiple distinct types seems like it will make for confusion. e.g.,
> >
> > struct foo { int a; };
> > struct __attribute__((whatever_we_name_it)) foo { float f; };
> >
> > int func(struct foo *f) {
> > return f->f + f->a;
> > }
> >
> > Which lookup causes the error (or do they succeed somehow)?
>
> In our use case, we intend to ensure struct field access *always*
> coming from final structure.

> Okay, so "last definition seen wins"?
>
> I think it's a bit odd that the API designer can write a structure
> definition and assume that it's the one the user will use, and then a
> later definition can come along and silently override it. Say I have
> the following:
>
> // AwesomeAPI.h
> struct Cool {
>   int a;
>   float f;
> };
>
> void DoAwesomeStuff(struct Cool *C);
>
> // AwesomeAPI.c
> #include "AwesomeAPI.h"
> #include "string.h"
>
> void DoAwesomeStuff(struct Cool *C) {
>   C->a = 12;
>   C->f = 1.2;
> }
>
> // LessAwesomeAPI.c
> #include "AwesomeAPI.h"
>
> struct __attribute__((whatever_we_name_this)) Cool {
>   int a;
> };
>
> int main(void) {
>   struct Cool C;
>   DoAwesomeStuff(&C);
> }
>
> Assume these TUs are compiled separately and then linked together
> later. Is there a buffer overflow in DoAwesomeStuff() because the Cool
> object from LessAwesomeAPI.c is only sizeof(int) while the Cool object
> expected within AwesomeAPI.c is sizeof(int) + sizeof(float)?

Thanks for the detailed explanation above. This indeed could make the
feature really hard to use for applications.

For the weaktype attribute, I am not sure whether there is a use case
outside the bpf community. How about we design the attribute to be
bpf specific? We can name it "bpf_weaktype".

We can change the rules to "last definition seen wins":
  - if we only see bpf_weaktype types, the last seen definition wins.
  - as soon as we see one non-bpf_weaktype type, that one wins. Any
    later weaktype type will be ignored, and any later non-weaktype
    type will cause a redefinition compilation error.

The following is a concrete example:
   struct t { int a; } __bpf_weaktype;
   struct d1 { struct t f; }; // use struct t {int a;}
   struct t { int b; int c; } __bpf_weaktype;
   struct d2 { struct t f; }; // use struct t {int b; int c; }
   struct t {char c; };
   struct d3 { struct t f; }; // use struct t { char c; }
   struct t {short s; } __bpf_weaktype;
   struct d4 { struct t f; }; // use struct t { char c; }
   struct t {long long l; }; // compilation error: redefinition

> So, if the final resolved type is in
> vmlinux.h, the field must be in vmlinux.h structure, otherwise,
> issue an error. The same if the final resolved type is in host
> kernel header file.
>
> Therefore, in the above case, f->f is an error since final
> resolved type is "struct foo { int a; }" and field "f" does not
> exist in struct foo.
>
>
>
> >
> > > It would be good if I can get some suggestions/directions on whether
> > > such a feature is possible or not for clang frontend.
> >
> > What happens if the only definition seen in a TU is marked as being weak?
>
> That weak will be promoted to the resolved type.
> >
> > In terms of incompatible weak types, are these incompatible?
> >
> > struct base {
> > int a;
> > };
> >
> > struct foo : base {
> > int b;
> > };
> >
> > struct __attribute__((whatever_we_name_it)) foo {
> > int a, b;
> > }
>
> Our use case is for C only.

That's good to know. In C, are these incompatible?

struct foo {
  int a;
  float f;
};

struct __attribute__((whatever_we_name_this)) foo {
  struct {
    int a;
    float f;
  };
};

(We could ask the same question using anonymous unions, etc.)

Based on what I described in the above, the first "struct foo" is a
non bpf_weaktype and the second is a bpf_weaktype, so the second will
be ignored regardless of whether they are compatible or not.

> > I'm not comfortable with the idea of having two different type
> > definitions in the same TU unless it's clear and consistent which one
> > "wins". If the type definitions are the same between the types, then I
> > don't see a problem because that limits the chances for surprises.
> > When the type definitions differ, it becomes harder to reason about
> > the way lookup works, how the type is laid out at runtime, what the
> > ABI implications are, etc. Is there a consistent rule for which type
> > is the one to be honored (e.g., only one definition is allowed to
> > elide the attribute and that's the canonical definition of the type
> > which always "wins")?
>
> All these are very good questions. To handle incompatible types,
> yes, we need clear rules. Otherwise, it will cause confusion.
> In our use case, the rule looks like below:
> - For a particular named type, if only one is defined with
> "weaktype" attribute, that type becomes resolved type.
> - If only one is defined without "weaktype" attribute,
> that type becomes resolved type.
> - If you have one "weaktype" and one non-weaktype types,
> the non-weaktype becomes the resolved type.

Oh, this is different from my interpretation of "last one wins" above.
If so, you can reverse which definition the attribute is applied to
and have the same question about buffer overflows.

> - more than one weaktype type or more than one non-weaktype
> type will cause a compilation error.

Ah, so there will be exactly 0, 1, or 2 definitions?

> - any field access will be based on resolve type.
>
> Does this sound reasonable? Or I could still miss some cases?

I apologize if I'm being dense, but I'm still trying to wrap my head
around how this works in practice. I'm worried about linking more than
one TU together where this resolution is different and getting ODR
violations as a result, getting weird situations where sizeof() the
type differs between TUs and causes issues, etc. I suppose one way to
dig into that is: if there was such a thing as a "type sanitizer"
(akin to UBSan but for types), what sort of functionality would you
expect it needs to help programmers detect bugs at runtime?

Indeed, sizeof() may give weird results even in my example above,
since at different program points sizeof() of the type might have
different values. This may propagate to other types as well.

This is the reason I would like to limit this attribute to bpf only.
All bpf programs must pass the kernel verifier before they can run in
the kernel. A program may get rejected by the kernel or compute an
incorrect answer, but it cannot crash the kernel.

In the bpf use case, I expect users will mostly use the code pattern
     #include <vmlinux.h>
     #pragma clang attribute push (__attribute__((bpf_weaktype)), apply_to = any)
     #include <socket.h>
     #pragma clang attribute pop
       ... TCP ...
       ... use types from vmlinux.h ...

so the weird sizeof() issue shouldn't happen.

> > > > Another issue is related to what if two types are not compatible. For example,
> > > > struct t { int a; } __attribute__((weak));
> > > > struct t { int a; int b; };
> > > > I think we could have a flag to control whether we allow incompatible
> > > > weak type or not.
> > >
> > > What's the use case for allowing incompatible weak types? Having
> >
> > In our use case, we are talking about incompatible types between
> > . vmlinux.h, generated with a particular kernel config which tries to
> > contains all fields so it can be used for different kernel builds
> > . kernel header files on a particular kernel build which may have
> > less fields than vmlinux.
> >
> > The below is a concrete example.
> > In linux kernel internal header files, many struct definition depends on
> > build time configuration. For example, struct "task_struct" in
> > linux:include/linux/sched.h,
> > https://github.com/torvalds/linux/blob/master/include/linux/sched.h#L661-L1418
> > For example, whether field "numa_scan_seq" exists in the struct or not
> > depends on
> > config option CONFIG_NUMA_BALANCING.
> >
> > It is totally possible vmlinux.h "task_struct" structure has field
> > "numa_scan_seq"
> > while the host include/linux/sched.h "task_struct" structure does not have
> > "numa_scan_seq" field.
>
> Ah, thank you for the explanation!

After a little more thought, I think we will put vmlinux.h ahead of
socket.h to make it easy for the compiler to resolve weak types.

bpf.c:
     #include <vmlinux.h>
     #pragma clang attribute push (__attribute__((weak)), apply_to = any)
     #include <socket.h>
     #pragma clang attribute pop
       ... TCP ...

This way, any type in socket.h, if it is also in vmlinux.h, will be quickly
discarded.

So the non-weak types are expected to be seen first, so you'll pick
the non-weak type every time unless the type only exists in weak form?
That seems reasonable.

> > > multiple distinct types seems like it will make for confusion. e.g.,
> > >
> > > struct foo { int a; };
> > > struct __attribute__((whatever_we_name_it)) foo { float f; };
> > >
> > > int func(struct foo *f) {
> > > return f->f + f->a;
> > > }
> > >
> > > Which lookup causes the error (or do they succeed somehow)?
> >
> > In our use case, we intend to ensure struct field access *always*
> > coming from final structure.
>
> Okay, so "last definition seen wins"?
>

Thanks for the detailed explanation above. This could indeed make
it really hard to use for applications.

For the weaktype attribute, I am not sure whether there is a use case
outside the bpf community. How about we design the attribute to be
bpf specific? We could name the attribute "bpf_weaktype".

That could help make the expected usage domain more clear, but we'll
still need to consider accidentally incorrect usage to help the bpf
community. In the above example, I think it's fine to say "if this
happens, it's UB" and we should probably document it as such.

We can change the rules to be "last definition seen wins":
  - if we only see bpf_weaktype types, the last seen definition wins.
  - as soon as we see one non bpf_weaktype type, that one wins.
    Any later weaktype definition will be ignored and any later non-weaktype
    definition will cause a redefinition error.

I think that makes sense, though I think it could also make sense to
say "If we only see bpf_weaktype types, the first definition wins and
any later weaktype definition for the same type will be diagnosed as
an error." Effectively, you'll see 0, 1, or 2 definitions, but never
more than two.

The following is a concrete example:
   struct t { int a; } __bpf_weaktype;
   struct d1 { struct t f; }; // use struct t {int a;}
   struct t { int b; int c; } __bpf_weaktype;
   struct d2 { struct t f; }; // use struct t {int b; int c; }

The above bit could catch users by surprise. One potential way to
implement this functionality would be to use a two-pass system where
pass one finds all the type definitions and resolves them to one type
and pass two does the actual compilation using that resolved type. In
that case, a reasonable user might expect d2 to use the `int a;`
definition. However, if we made additional weak definitions be an
error, they'd never get into this situation. (And if we have to
support multiple weaktypes, then we could perhaps warn on the
situation where a weak definition was used somewhere and then another
weak definition of the same type is encountered, if that'd make sense
to do.)

   struct t {char c; };
   struct d3 { struct t f; }; // use struct t { char c; }
   struct t {short s; } __bpf_weaktype;
   struct d4 { struct t f; }; // use struct t { char c; }
   struct t {long long l; }; // compilation error: redefinition


Based on what I described in the above, the first "struct foo" is a
non bpf_weaktype and the second is a bpf_weaktype, so the second will
be ignored regardless of whether they are compatible or not.

Makes sense, thanks!


This is the reason I would like to limit this attribute to bpf only.
All bpf programs must pass the kernel verifier before they can run in
the kernel. A program may get rejected by the kernel or compute an
incorrect answer, but it cannot crash the kernel.

Oh, that's good to know!

In bpf use case, I expect users mostly use the code pattern
     #include <vmlinux.h>
     #pragma clang attribute push (__attribute__((bpf_weaktype)), apply_to = any)
     #include <socket.h>
     #pragma clang attribute pop
       ... TCP ...
       ... use types from vmlinux.h ...

so weird sizeof() issue shouldn't happen.

My gut instinct is that there will be weird issues from this because
of odd interactions between types (and possibly due to inline
functions in header files that use the types), but I think there's
enough here to be worth experimenting with.

Thanks!

~Aaron


What you suggested above makes sense for us. In the bpf use case,
we never see more than 2 definitions. It will be either
  (1). a non-weaktype, or
  (2). a non-weaktype followed by a weaktype, or
  (3). a weaktype.
I think any other case w.r.t. types can be considered an error.


Thanks for the explanation. Let us not allow multiple weak definitions;
that is not a bpf use case, and it should simplify the implementation a lot.


> In bpf use case, I expect users mostly use the code pattern
> #include <vmlinux.h>
> #pragma clang attribute push (__attribute__((bpf_weaktype)), apply_to = any)
> #include <socket.h>
> #pragma clang attribute pop
> ... TCP ...
> ... use types from vmlinux.h ...
>
> so weird sizeof() issue shouldn't happen.

My gut instinct is that there will be weird issues from this because
of odd interactions between types (and possibly due to inline
functions in header files that use the types), but I think there's
enough here to be worth experimenting with.

Sounds great! We will start to do some prototyping and will have
an RFC diff for discussion once it is ready. Thanks!

In my opinion, this sounds like a hack to work around a problem that might be better solved outside the compiler. Or, at the very least, I think some more justification as to why this is the best solution would be helpful.

Restating the problem:

You have a tool which generates a “vmlinux.h” header file based on the currently-running kernel (e.g. by parsing /sys/kernel/btf/vmlinux). This allows users to avoid needing a copy of the kernel header files, which may not be readily available. However, the tool cannot extract #defines – only types – because defines aren’t recorded in the kernel BTF debug info. So, despite having an easy way to generate vmlinux.h, users do still want to include the actual kernel headers sometimes, in order to access these constants. You want to allow that.

Your proposed solution is:
Modify the compiler so that it can ignore the duplicate struct definitions, in order to allow users to include both the kernel headers and the generated vmlinux.h header, even though they don’t cooperate with each other.

But – is that really necessary? It seems a bit odd to invent a compiler feature to work around non-cooperating headers. Even more so when all the headers are part of the same project.

Here are some ideas for possible alternatives. You may well have already thought of these and rejected them, but if so, it would be nice to hear why.

  1. Modify the original headers to have appropriate #ifndefs, in order to avoid conflicting.

  2. Tell users that they can include either the autogenerated “vmlinux.h”, OR set the pragma and include normal kernel headers. But not both. AFAICT, the vmlinux.h file is not really version-independent – except that it uses the “preserve_access_index” pragma in order to allow field accesses to be adjusted at load-time to match the actual struct layout. Doesn’t putting the same pragma around the normal kernel headers work, too?

  3. Automatically extract #defines from kernel headers (e.g. using -dD to print all macro-expansions) to create a new header with those values. (Or defines and inline functions. Or everything but struct definitions. Or something along those lines). Have people include that, instead.

…Although, IIUC, you want to use the kernel-internal headers e.g. “include/linux/socket.h”, not the stable “uapi” headers, e.g. “include/uapi/linux/socket.h”. But can’t those defines change arbitrarily? Doesn’t that ruin the version-independence aspect? So, maybe you want:

  4. Enhance BPF/CO-RE to support defines as compile-time-unknown/runtime-constants [at least some useful subset of them]. Then they too can be fixed up at BPF load-time, instead of hoping that the values never change.

Though, the example you give is “TCP” (IPPROTO_TCP?) which is in the uapi. Maybe you do need only the stable uapi defines? In which case,

  5. Create a set of modified uapi headers for use with bpf, which omits struct definitions, and instead has #include vmlinux.h at the top. Distribute with libbpf-dev, perhaps?

Hi,

This is a BPF use case. We would like to check whether
__attribute__((weak)) support for non-primitive types would be
possible in clang or not. For example, if we have two types like
    struct t { int a; } __attribute__((weak));
    struct t { int a; };

    typedef unsigned __u32;
    typedef unsigned __u32 __attribute__((weak));
the compilation should not fail. It should just ignore weak definitions.

In BPF, to support CO-RE (compile once and run everywhere), we
generate a *generic* vmlinux.h
(BPF Portability and CO-RE · BPF)
which is included in the bpf program. The bpf loader (libbpf) will
adjust field offsets properly based on host record layout so the same
program can run on different kernels which may have different record
layouts.

But vmlinux.h does not support macros (and simple static inline helper
functions in kernel header files). Currently, users need to define
these macros and static inline functions explicitly in their bpf
program. What we are thinking is to permit users to include kernel
header files containing these macros/static inline functions. But this
may introduce duplicated types as these types have been defined in
vmlinux.h.

For example, we may have:
     socket.h:
         struct socket { ... };
         #define TCP 6
     vmlinux.h:
         struct socket { ... };
     bpf.c:
         #include <socket.h>
         #include <vmlinux.h>
         ... TCP ...

In this case, struct "socket" is defined twice. If we can have something like
     bpf.c:
       #pragma clang attribute push (__attribute__((weak)), apply_to = record)
       #include <socket.h>
       #pragma clang attribute pop
       #include <vmlinux.h>
         ... TCP ...

Then the bpf program can get header file macro definitions without
copying them explicitly.

We need support for "apply_to = typedef" in the above pragma, so we
can deal with typedef types as well.

Another issue is what happens if the two types are not compatible. For example,
     struct t { int a; } __attribute__((weak));
     struct t { int a; int b; };
I think we could have a flag to control whether we allow incompatible
weak types or not.

What's the use case for allowing incompatible weak types?

In our use case, we are talking about incompatible types between
    . vmlinux.h, generated with a particular kernel config which tries to
      contain all fields so it can be used for different kernel builds, and
    . kernel header files on a particular kernel build, which may have
      fewer fields than vmlinux.

The below is a concrete example. In Linux kernel internal header files,
many struct definitions depend on build-time configuration. Take struct
"task_struct" in linux:include/linux/sched.h,
https://github.com/torvalds/linux/blob/master/include/linux/sched.h#L661-L1418
Whether the field "numa_scan_seq" exists in the struct or not depends on
the config option CONFIG_NUMA_BALANCING.

It is totally possible that the vmlinux.h "task_struct" structure has the
field "numa_scan_seq" while the host include/linux/sched.h "task_struct"
structure does not.

Ah, thank you for the explanation!

After a little more thought, I think we will put vmlinux.h ahead of
socket.h to make it easy for the compiler to resolve weak types.

bpf.c:
      #include <vmlinux.h>
      #pragma clang attribute push (__attribute__((weak)), apply_to = any)
      #include <socket.h>
      #pragma clang attribute pop
        ... TCP ...

This way, any type in socket.h that is also in vmlinux.h will be quickly
discarded.

So the non-weak types are expected to be seen first, so you'll pick
the non-weak type every time unless the type only exists in weak form?
That seems reasonable.

I'm sorry I'm a bit late to discussion, been out on vacation. But I'm super interested in this feature and the one who added vmlinux.h to BPF land, so wanted to provide some feedback as well.

Unless it complicates the implementation enormously, I hope we don't have to specify that the non-weak type has to come first (or last, or in any defined order). It would add unnecessary complexity and surprise for users. Think about weak symbols as an analogy: if the linker demanded that a non-weak symbol be encountered before a weak symbol, it would complicate things for end users immensely.

Regarding the example that Yonghong provided. I was hoping that in practice it will be the other way: vmlinux.h internally has

#pragma clang attribute push (__attribute__((weak)), apply_to = any)
/* vmlinux.h types here */
#pragma clang attribute pop

and then user code looks just like this:

#include <vmlinux.h>
#include <socket.h>

So socket.h types actually would override vmlinux.h definitions, if they are conflicting. One hiccup with that is that for those types in socket.h, we'd like to "inherit" __attribute__((preserve_access_index)), to make them BPF CO-RE-relocatable. Continuing the analogy with weak symbols, it's similar to inheriting hidden symbol visibility, which can be defined on a weak or extern declaration, but is applied to the final resolved symbol.

Would that be possible to support?

Multiple distinct types seem like they will make for confusion, e.g.,

struct foo { int a; };
struct __attribute__((whatever_we_name_it)) foo { float f; };

int func(struct foo *f) {
   return f->f + f->a;
}

Which lookup causes the error (or do they succeed somehow)?

In our use case, we intend to ensure that struct field accesses *always*
come from the final structure.

Okay, so "last definition seen wins"?

I think it's a bit odd that the API designer can write a structure
definition and assume that it's the one the user will use, and then a
later definition can come along and silently override it. Say I have
the following:

// AwesomeAPI.h
struct Cool {
   int a;
   float f;
};

void DoAwesomeStuff(struct Cool *C);

// AwesomeAPI.c
#include "AwesomeAPI.h"
#include "string.h"

void DoAwesomeStuff(struct Cool *C) {
   C->a = 12;
   C->f = 1.2;
}

// LessAwesomeAPI.c
#include "AwesomeAPI.h"

struct __attribute__((whatever_we_name_this)) Cool {
   int a;
};

int main(void) {
   struct Cool C;
   DoAwesomeStuff(&C);
}

Assume these TUs are compiled separately and then linked together
later. Is there a buffer overflow in DoAwesomeStuff() because the Cool
object from LessAwesomeAPI.c is only sizeof(int) while the Cool object
expected within AwesomeAPI.c is sizeof(int) + sizeof(float)?

Thanks for the detailed explanation above. This indeed could make
it really hard for applications to use.

For the weaktype attribute, I am not sure whether there is a use case
outside the bpf community or not. How about we design the attribute to
be bpf-specific? We can name the attribute "bpf_weaktype".

That could help make the expected usage domain more clear, but we'll
still need to consider accidentally incorrect usage to help the bpf
community. In the above example, I think it's fine to say "if this
happens, it's UB" and we should probably document it as such.

I might be missing some details here, but I see two issues in the example provided above.

First, "unstable" sizeof(). In the BPF CO-RE world, using plain sizeof() is an anti-pattern, unless you know that the sizeof() value won't change (some fixed UAPI struct) or that it can only ever increase and we don't really need the entire struct (again, this will mostly be UAPI types). If you have no such guarantee and need to know the exact actual sizeof(), there is a CO-RE relocation that the compiler can emit when the user uses the proper built-in (we provide a convenient macro for that with libbpf, bpf_core_type_size()).

Second problem is that if some part of source code assumed C->f existed, but actual resolved type doesn't have field f. If that's the case within single source file, then it's clearly a compilation error, right? If that happens in two different CUs, then it's not a problem, because each CU gets its own view of entire set of types, which will be recorded in corresponding DWARF and BTF type information.

I'm not sure how Clang would deal with that when linking multiple CUs through an LTO build (sorry, I'm probably misusing the terminology). But if we use the BPF static linking API through libbpf, it will all work out OK, I think, because we'll just merge these two incompatible BTF type definitions for struct Cool, and at runtime, if the actual struct Cool has field f, then all good. If not, it will be an error during loading/processing of the BPF program by libbpf.

We can change the rules to be "last definition seen wins":
   - if we only see bpf_weaktype types, the last seen definition wins.
   - as soon as we see one non-bpf_weaktype type, that one wins.
     Any later weaktype type will be ignored, and any later non-weaktype
     type will cause a redefinition compilation error.

I think that makes sense, though I think it could also make sense to
say "If we only see bpf_weaktype types, the first definition wins and
any later weaktype definition for the same type will be diagnosed as
an error." Effectively, you'll see 0, 1, or 2 definitions, but never
more than two.

What you suggested above makes sense for us. In the bpf use case,
we never see more than 2 definitions. It will be either
   (1). a non-weaktype, or
   (2). a non-weaktype followed by a weaktype, or
   (3). a weaktype.
I think any other case w.r.t. types can be considered an error.

Sorry, again, I have to disagree. That's adding too many assumptions.

One thing we didn't mention yet is kernel module BTFs. Right now we don't have the same vmlinux.h for kernel modules, but I think we will very soon. In that case some types might be defined (as weak_type) in both vmlinux.h and <some_kernel_module>.h. Maybe not, but unless these restrictions buy us a tremendous amount of simplicity internally, I ask that we not add unnecessary assumptions about ordering and count. We don't have them with weak symbols, so I think it is more natural not to have them for types.

The following is a concrete example:
    struct t { int a; } __bpf_weaktype;
    struct d1 { struct t f; }; // use struct t {int a;}
    struct t { int b; int c; } __bpf_weaktype;
    struct d2 { struct t f; }; // use struct t {int b; int c; }

The above bit could catch users by surprise. One potential way to
implement this functionality would be to use a two-pass system where
pass one finds all the type definitions and resolves them to one type
and pass two does the actual compilation using that resolved type. In
that case, a reasonable user might expect d2 to use the `int a;`
definition. However, if we made additional weak definitions be an
error, they'd never get into this situation. (And if we have to
support multiple weaktypes, then we could perhaps warn on the
situation where a weak definition was used somewhere and then another
weak definition of the same type is encountered, if that'd make sense
to do.)

Thanks for the explanation. Let us not allow multiple weak types.
Multiple weak types are not a bpf use case, and disallowing them should
simplify the implementation a lot.

In BPF CO-RE world the actual size of the type and field offsets don't matter. What if __bpf_weaktype implies preserve_access_index automatically? Would that make it a non-issue? With BPF CO-RE and relocatable type definitions the point is to record field access intentions, not the actual memory layout (which will be different at runtime anyways). So in BPF CO-RE world the above situation isn't all that surprising.

    struct t {char c; };
    struct d3 { struct t f; }; // use struct t { char c; }
    struct t {short s; } __bpf_weaktype;
    struct d4 { struct t f; }; // use struct t { char c; }
    struct t {long long l; }; // compilation error: redefinition

So, if the final resolved type is in
vmlinux.h, the field must be in the vmlinux.h structure; otherwise,
issue an error. The same applies if the final resolved type is in a
host kernel header file.

Therefore, in the above case, f->f is an error since the final
resolved type is "struct foo { int a; }" and field "f" does not
exist in struct foo.

It would be good if I can get some suggestions/directions on whether
such a feature is possible or not for clang frontend.

What happens if the only definition seen in a TU is marked as being weak?

That weak type will be promoted to the resolved type.

In terms of incompatible weak types, are these incompatible?

struct base {
   int a;
};

struct foo : base {
   int b;
};

struct __attribute__((whatever_we_name_it)) foo {
   int a, b;
};

Our use case is for C only.

That's good to know. In C, are these incompatible?

struct foo {
   int a;
   float f;
};

struct __attribute__((whatever_we_name_this)) foo {
   struct {
     int a;
     float f;
   };
};

(We could ask the same question using anonymous unions, etc.)

Based on what I described above, the first "struct foo" is a
non-bpf_weaktype and the second is a bpf_weaktype, so the second will
be ignored regardless of whether they are compatible or not.

Makes sense, thanks!

They are also completely compatible from BPF CO-RE standpoint. If kernel defines foo one way, and BPF program uses the other definition, it all works correctly with BPF CO-RE.

I'm not comfortable with the idea of having two different type
definitions in the same TU unless it's clear and consistent which one
"wins". If the type definitions are the same between the types, then I
don't see a problem because that limits the chances for surprises.
When the type definitions differ, it becomes harder to reason about
the way lookup works, how the type is laid out at runtime, what the
ABI implications are, etc. Is there a consistent rule for which type
is the one to be honored (e.g., only one definition is allowed to
elide the attribute and that's the canonical definition of the type
which always "wins")?

All these are very good questions. To handle incompatible types,
yes, we need clear rules. Otherwise, it will cause confusion.
In our use case, the rule looks like below:
   - For a particular named type, if only one is defined with
     "weaktype" attribute, that type becomes resolved type.
   - If only one is defined without "weaktype" attribute,
     that type becomes resolved type.
   - If you have one "weaktype" and one non-weaktype types,
     the non-weaktype becomes the resolved type.

Oh, this is different from my interpretation of "last one wins" above.
If so, you can reverse which definition the attribute is applied to
and have the same question about buffer overflows.

   - more than one weaktype type or more than one non-weaktype
     type will cause a compilation error.

Ah, so there will be exactly 0, 1, or 2 definitions?

   - any field access will be based on the resolved type.

Does this sound reasonable? Or could I still be missing some cases?

I apologize if I'm being dense, but I'm still trying to wrap my head
around how this works in practice. I'm worried about linking more than
one TU together where this resolution is different and getting ODR

Indeed sizeof() may give weird results even in my above example,
as at different program points sizeof() of the type might have
different values. This may propagate to other types as well.

See above. With BPF CO-RE, using straight sizeof() is discouraged. We have a relocatable way to fetch actual type (or field) size.

violations as a result, getting weird situations where sizeof() the
type differs between TUs and causes issues, etc. I suppose one way to
dig into that is: if there was such a thing as a "type sanitizer"
(akin to UBSan but for types), what sort of functionality would you
expect it needs to help programmers detect bugs at runtime?

This is the reason that I would like to limit this attribute to be bpf only.
All bpf programs need to pass the kernel verifier before being able to
run in the kernel. They may get a program rejected by the kernel or
get an incorrect answer but not crash the kernel.

Oh, that's good to know!

In the bpf use case, I expect users will mostly use the code pattern
      #include <vmlinux.h>
      #pragma clang attribute push (__attribute__((bpf_weaktype)), apply_to = any)
      #include <socket.h>
      #pragma clang attribute pop
        ... TCP ...
        ... use types from vmlinux.h ...

so weird sizeof() issue shouldn't happen.

My gut instinct is that there will be weird issues from this because
of odd interactions between types (and possibly due to inline
functions in header files that use the types), but I think there's
enough here to be worth experimenting with.

Sounds great! We will start to do some prototyping and will have
a RFC diff for discussion once it is ready. Thanks!

Please consider the point about __bpf_weaktype implying preserve_access_index. __bpf_weaktype without CO-RE relocation is extremely confusing indeed, and not useful at all. Tying one to the other also solves the problem I mentioned at the very beginning: I'd like vmlinux.h to have all the magic incantations for weaktype, while user code would just do #include <vmlinux.h> and #include <socket.h>, with the resolved types in socket.h automatically made CO-RE-relocatable.

In my opinion, this sounds like a hack to work around a problem that might be better solved outside the compiler. Or, at the very least, I think some more justification as to why this is the best solution would be helpful.

Restating the problem:
You have a tool which generates a "vmlinux.h" header file based on the currently-running kernel (e.g. by parsing /sys/kernel/btf/vmlinux). This allows users to avoid needing a copy of the kernel header files, which may not be readily available. However, the program cannot extract #defines -- only types -- because defines aren't recorded in the kernel BTF debug info. So, despite having an easy way to generate vmlinux.h, users do still want to include the actual kernel headers sometimes, in order to access these constants. You want to allow that.

Correct.

Your proposed solution is:
Modify the compiler so that it can ignore the duplicate struct definitions, in order to allow users to include both the kernel headers and the generated vmlinux.h header, even though they don't cooperate with each other.

But -- is that really necessary? It seems a bit odd to invent a compiler feature to work around non-cooperating headers. Even more so when all the headers are part of the same project.

We knew about this limitation from day 1 (and couldn't do much about it with the language features we have at our disposal), and it is brought up pretty regularly by various users, so it's a pretty annoying problem which we haven't found *any* way to solve satisfactorily, except by extending Clang to help with it.

Here are some ideas for possible alternatives. You may well have already thought of these and rejected them, but if so, it would be nice to hear why.

1) Modify the original headers to have appropriate #ifndefs, in order to avoid conflicting.

The kernel community won't allow adding random #ifndefs into kernel headers (neither UAPI nor internal; I don't even know which one would get more opposition) for the sake of BPF user applications. It's also unclear what #ifndefs you mean. Is the proposal to have, for each kernel type, a corresponding #define, e.g., #define __task_struct__defined for struct task_struct, and an #ifndef/#endif block for each such type? If yes, I don't see how that will fly with the kernel community (and also consider that a typical kernel configuration defines >80000 types).

2) Tell users that they can include either the autogenerated "vmlinux.h", OR set the pragma and include normal kernel headers. But not both. AFAICT, the vmlinux.h file is not really version-independent -- except that it uses the "preserve_access_index" pragma in order to allow field accesses to be adjusted at load-time to match the actual struct layout. Doesn't putting the same pragma around the normal kernel headers work, too?

That's what we are telling users right now. And that's what users are complaining about. vmlinux.h is sufficiently version-independent with BPF CO-RE, unless some fields are removed/renamed, or you need the very latest new field added in more recent kernel than the one you used to generate vmlinux.h.

3) Automatically extract #defines from kernel headers (e.g. using -dD to print all macro-expansions) to create a new header with those values. (Or defines and inline functions. Or everything but struct definitions. Or something along those lines). Have people include that, instead.

...Although, IIUC, you want to use the kernel-internal headers e.g. "include/linux/socket.h", not the stable "uapi" headers, e.g. "include/uapi/linux/socket.h". But can't those defines change arbitrarily? Doesn't that ruin the version-independence aspect? So, maybe you want:

Yes, users often need to work with types defined in kernel-internal headers. You are right that internal #defines are "unstable", so users are usually interested in stable #defines in UAPI headers. The problem is that vmlinux.h is an all-or-nothing approach, currently. If you use vmlinux.h (because you need internal types, not just UAPI ones), you can't include UAPI headers (for those #defines) anymore due to type conflicts. That's exactly what Yonghong is trying to solve here.

4) Enhance BPF/CO-RE to support defines as compile-time-unknown/runtime-constants [at least some useful subset of them]. Then they too can be fixed up at BPF load-time, instead of hoping that the values never change.

#defines are not relocatable at runtime by any means, because they are just arbitrary text substituted by compiler. Once Clang compiled BPF program into object file, there is nothing you can do about #defines used. I don't see any way for BPF CO-RE to do "relocatable" #defines.

Though, the example you give is "TCP" (IPPROTO_TCP?) which /is/ in the uapi. Maybe you do need only the stable uapi defines? In which case,

5) Create a set of modified uapi headers for use with bpf, which omits struct definitions, and instead has #include vmlinux.h at the top instead. Distribute with libbpf-dev, perhaps?

That's not really a solution, rather a maintenance nightmare. It's never a good idea to maintain a separate copy of something that's developed by thousands of developers and hope to keep up with all the changes. If that's the only possible solution, then going back to the status quo (making this the user's problem and telling them not to use vmlinux.h at all) is a less painful way.

  2. Tell users that they can include either the autogenerated “vmlinux.h”, OR set the pragma and include normal kernel headers. But not both. AFAICT, the vmlinux.h file is not really version-independent – except that it uses the “preserve_access_index” pragma in order to allow field accesses to be adjusted at load-time to match the actual struct layout. Doesn’t putting the same pragma around the normal kernel headers work, too?

That’s what we are telling users right now. And that’s what users are complaining about. vmlinux.h is sufficiently version-independent with BPF CO-RE, unless some fields are removed/renamed, or you need the very latest new field added in more recent kernel than the one you used to generate vmlinux.h.

I understand the problem with using only vmlinux.h. What is the complaint users have with using only the kernel-internal headers, though?

Using the type definitions from normal kernel headers will have a result which is just as version-independent, so long as you use #pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record), right?

  3. Automatically extract #defines from kernel headers (e.g. using -dD to print all macro-expansions) to create a new header with those values. (Or defines and inline functions. Or everything but struct definitions. Or something along those lines). Have people include that, instead.

…Although, IIUC, you want to use the kernel-internal headers e.g. “include/linux/socket.h”, not the stable “uapi” headers, e.g. “include/uapi/linux/socket.h”. But can’t those defines change arbitrarily? Doesn’t that ruin the version-independence aspect? So, maybe you want:

Yes, users often need to work with types defined in kernel-internal headers. You are right that internal #defines are “unstable”, so users are usually interested in stable #defines in UAPI headers. The problem is that vmlinux.h is an all-or-nothing approach, currently. If you use vmlinux.h (because you need internal types, not just UAPI ones), you can’t include UAPI headers (for those #defines) anymore due to type conflicts. That’s exactly what Yonghong is trying to solve here.

So, is the desire here to enable mixing uapi headers + vmlinux.h, or mix kernel internal headers + vmlinux.h, or it doesn’t really matter, either one would be fine, whichever can be made to work?

  4. Enhance BPF/CO-RE to support defines as compile-time-unknown/runtime-constants [at least some useful subset of them]. Then they too can be fixed up at BPF load-time, instead of hoping that the values never change.

#defines are not relocatable at runtime by any means, because they are just arbitrary text substituted by compiler. Once Clang compiled BPF program into object file, there is nothing you can do about #defines used. I don’t see any way for BPF CO-RE to do “relocatable” #defines.

In this proposal, I was imagining that the unstable values would be exposed as globals, instead, and treated as constants at load time. (Similar to what you’re doing with CONFIG_* values, where they are unknown globals at compile-time, and constants at load time.)
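The mechanism James describes can be sketched with a weak extern in plain C (KERNEL_SOME_VALUE is a made-up symbol name). If nothing provides a definition at link/load time, the symbol's address is null, and code can detect that at runtime; a loader like libbpf could instead patch in the actual value:

```c
#include <stdio.h>

/* Sketch: an "unstable" kernel value exposed as a weak extern global
 * instead of a #define. The compiler emits a reference that may or
 * may not be satisfied later; if unsatisfied, the address is null. */
extern int KERNEL_SOME_VALUE __attribute__((weak));

int main(void) {
    if (&KERNEL_SOME_VALUE)                     /* defined somewhere? */
        printf("resolved: %d\n", KERNEL_SOME_VALUE);
    else
        printf("unresolved at load time\n");
    return 0;
}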

Though, the example you give is “TCP” (IPPROTO_TCP?) which is in the uapi. Maybe you do need only the stable uapi defines? In which case,

  5. Create a set of modified uapi headers for use with bpf, which omits struct definitions, and instead has #include vmlinux.h at the top. Distribute with libbpf-dev, perhaps?

That’s not really a solution, rather a maintenance nightmare. It’s never a good idea to maintain a separate copy of something that’s developed by thousands of developers and hope to keep up with all the changes. If that’s the only possible solution, then going back to the status quo (making this the user’s problem and telling them not to use vmlinux.h at all) is a less painful way.

I meant to say that this set of modified headers could be generated, not that you’d want to manually maintain a separate fork of it.

2) Tell users that they can include either the autogenerated "vmlinux.h", OR set the pragma and include normal kernel headers. But not both. AFAICT, the vmlinux.h file is not really version-independent -- except that it uses the "preserve_access_index" pragma in order to allow field accesses to be adjusted at load-time to match the actual struct layout. Doesn't putting the same pragma around the normal kernel headers work, too?

That's what we are telling users right now. And that's what users are complaining about. vmlinux.h is sufficiently version-independent with BPF CO-RE, unless some fields are removed/renamed, or you need the very latest new field added in more recent kernel than the one you used to generate vmlinux.h.

I understand the problem with using only vmlinux.h. What is the complaint users have with using only the kernel-internal headers, though?

Users generally find it more complicated to use kernel headers instead
of vmlinux.h, which I suspect is the #1 reason for them to use
vmlinux.h in the first place.

But I was also under the impression that you can't use internal kernel
headers easily, as I've seen users just copy/pasting type
definitions as is from kernel sources instead.

Furthermore, vmlinux.h contains *all* kernel types, even if they were
defined inside .c files, so with vmlinux.h you get those as well,
while there is no header you can even theoretically include to get
them without just copy/pasting source code as is. BPF is used to do
some deep and involved kernel internals tracing, so it's quite common
to need to use such internal types from the BPF program (just in case
you thought I'm making this use case up).

Using the type definitions from normal kernel headers will have a result which is just as version-independent, so long as you use `#pragma clang attribute push (__attribute__((preserve_access_index)), apply_to = record)`, right?

Yes, but as described above, only in a subset of desired scenarios. The
only generic solution seems to be __bpf_weaktype, and it should handle
all known cases. And once we have that, we'll probably find some new
and creative ways to utilize it.

3) Automatically extract #defines from kernel headers (e.g. using -dD to print all macro-expansions) to create a new header with those values. (Or defines and inline functions. Or everything but struct definitions. Or something along those lines). Have people include that, instead.

...Although, IIUC, you want to use the kernel-internal headers e.g. "include/linux/socket.h", not the stable "uapi" headers, e.g. "include/uapi/linux/socket.h". But can't those defines change arbitrarily? Doesn't that ruin the version-independence aspect? So, maybe you want:

Yes, users often need to work with types defined in kernel-internal headers. You are right that internal #defines are "unstable", so users are usually interested in stable #defines in UAPI headers. The problem is that vmlinux.h is an all-or-nothing approach, currently. If you use vmlinux.h (because you need internal types, not just UAPI ones), you can't include UAPI headers (for those #defines) anymore due to type conflicts. That's exactly what Yonghong is trying to solve here.

So, is the desire here to enable mixing uapi headers + vmlinux.h, or mix kernel internal headers + vmlinux.h, or it doesn't really matter, either one would be fine, whichever can be made to work?

I didn't keep track of which headers exactly users want to mix in with
vmlinux.h, so I suspect there are both cases. E.g., for UAPI, people
needed some of the FS constants (include/uapi/linux/fcntl.h,
include/uapi/linux/fs.h, etc.), while it's also common to need
include/net/sock.h with its definitions like
`#define sk_dport __sk_common.skc_dport`. These are just two examples
I could quickly recall without searching the history.

4) Enhance BPF/CO-RE to support defines as compile-time-unknown/runtime-constants [at least some useful subset of them]. Then they too can be fixed up at BPF load-time, instead of hoping that the values never change.

#defines are not relocatable at runtime by any means, because they are just arbitrary text substituted by compiler. Once Clang compiled BPF program into object file, there is nothing you can do about #defines used. I don't see any way for BPF CO-RE to do "relocatable" #defines.

In this proposal, I was imagining that the unstable values would be exposed as globals, instead, and treated as constants at load time. (Similar to what you're doing with CONFIG_* values, where they are unknown globals at compile-time, and constants at load time.)

See the `#define sk_dport __sk_common.skc_dport` example above; it's
not just constants. There are also a bunch of static inline helper
functions for networking needs.

Though, the example you give is "TCP" (IPPROTO_TCP?), which is in the UAPI. Maybe you do need only the stable UAPI defines? In which case:

5) Create a set of modified uapi headers for use with bpf, which omits struct definitions and instead has #include <vmlinux.h> at the top. Distribute with libbpf-dev, perhaps?

That's not really a solution, rather a maintenance nightmare. It's never a good idea to maintain a separate copy of something that's developed by thousands of developers and hope to keep up with all the changes. If that's the only possible solution, then going back to the status quo (making this the user's problem and telling them not to use vmlinux.h at all) is a less painful way.

I meant to say that this set of modified headers could be generated, not that you'd want to manually maintain a separate fork of it.

There are some experiments going on to use DWARF's capability to
record macros as text (not enabled by default, I think). We'd then add
them into BTF and use that to generate exactly the same #defines in
vmlinux.h or somewhere else. I don't think that's the right way to go,
though, because it at least doubles the size of BTF data by
recording tons of strings, most of which are completely irrelevant and
will never be used (but we can't tell which is which). And that
still doesn't cover the static inline functions. And we'll probably
find more limitations with that approach as we try to apply it to
real-world applications. Something like __bpf_weaktype seems like a
better solution.


>>>>>>> Hi,
>>>>>>>
>>>>>>> This is a BPF use case. We would like to check whether
>>>>>>> __attribute__((weak)) support for non-primitive types would be
>>>>>>> possible in clang or not. For example, if we have two types like
>>>>>>> struct t { int a; } __attribute__((weak));
>>>>>>> struct t { int a; };
>>>>>>>
>>>>>>> typedef unsigned __u32;
>>>>>>> typedef unsigned __u32 __attribute__((weak));
>>>>>>> the compilation should not fail. It should just ignore weak definitions.
>>>>>>>
>>>>>>> In BPF, to support CO-RE (compile once and run everywhere), we
>>>>>>> generate a *generic* vmlinux.h
>>>>>>> (BPF Portability and CO-RE · BPF )
>>>>>>> which is included in the bpf program. The bpf loader (libbpf) will
>>>>>>> adjust field offsets properly based on host record layout so the same
>>>>>>> program can run on different kernels which may have different record
>>>>>>> layouts.
>>>>>>>
>>>>>>> But vmlinux.h does not support macros (and simple static inline helper
>>>>>>> functions in kernel header files). Currently, users need to define
>>>>>>> these macros and static inline functions explicitly in their bpf
>>>>>>> program. What we are thinking is to permit users to include kernel
>>>>>>> header files containing these macros/static inline functions. But this
>>>>>>> may introduce duplicated types as these types have been defined in
>>>>>>> vmlinux.h.
>>>>>>>
>>>>>>> For example, we may have:
>>>>>>> socket.h:
>>>>>>> struct socket { ... };
>>>>>>> #define TCP 6
>>>>>>> vmlinux.h:
>>>>>>> struct socket { ... };
>>>>>>> bpf.c:
>>>>>>> #include <socket.h>
>>>>>>> #include <vmlinux.h>
>>>>>>> ... TCP ...
>>>>>>>
>>>>>>> In this case, struct "socket" is defined twice. If we can have something like
>>>>>>> bpf.c:
>>>>>>> #pragma clang attribute push (__attribute__((weak)), apply_to = record)
>>>>>>> #include <socket.h>
>>>>>>> #pragma clang attribute pop
>>>>>>> #include <vmlinux.h>
>>>>>>> ... TCP ...
>>>>>>>
>>>>>>>
>>>>>>> Then bpf program can get header file macro definitions without copying
>>>>>>> explicitly.
>>>>>>>
>>>>>>> We need support for "apply_to = typedef" in the above pragma, so we
>>>>>>> can deal with typedef types as well.
>>>>>>>
>>>>>>> Another issue is related to what if two types are not compatible. For example,
>>>>>>> struct t { int a; } __attribute__((weak));
>>>>>>> struct t { int a; int b; };
>>>>>>> I think we could have a flag to control whether we allow incompatible
>>>>>>> weak type or not.
>>>>>> What's the use case for allowing incompatible weak types? Having
>>>>> In our use case, we are talking about incompatible types between
>>>>> . vmlinux.h, generated with a particular kernel config, which tries to
>>>>> contain all fields so it can be used for different kernel builds
>>>>> . kernel header files on a particular kernel build, which may have
>>>>> fewer fields than vmlinux.h
>>>>>
>>>>> Below is a concrete example.
>>>>> In Linux kernel internal header files, many struct definitions depend on
>>>>> build-time configuration. For example, for struct "task_struct" in
>>>>> linux:include/linux/sched.h,
>>>>> https://github.com/torvalds/linux/blob/master/include/linux/sched.h#L661-L1418
>>>>> whether the field "numa_scan_seq" exists in the struct or not
>>>>> depends on the config option CONFIG_NUMA_BALANCING.
>>>>>
>>>>> It is totally possible that the vmlinux.h "task_struct" structure has the
>>>>> field "numa_scan_seq"
>>>>> while the host include/linux/sched.h "task_struct" structure does not have
>>>>> the "numa_scan_seq" field.
>>>> Ah, thank you for the explanation!
>>> After a little more thought, I think we will have vmlinux.h ahead of
>>> socket.h to make it easy for the compiler to resolve weak types.
>>>
>>> bpf.c:
>>> #include <vmlinux.h>
>>> #pragma clang attribute push (__attribute__((weak)), apply_to = any)
>>> #include <socket.h>
>>> #pragma clang attribute pop
>>> ... TCP ...
>>>
>>> This way, any type in socket.h that is also in vmlinux.h will be quickly
>>> discarded.
>> So the non-weak types are expected to be seen first, so you'll pick
>> the non-weak type every time unless the type only exists in weak form?
>> That seems reasonable.

I'm sorry I'm a bit late to the discussion; I've been out on vacation. But I'm
super interested in this feature, and I'm the one who added vmlinux.h to BPF
land, so I wanted to provide some feedback as well.

Unless it complicates the implementation extremely, I hope we don't have
to specify that the non-weak type has to be first (or last, or in any defined
order). It would add unnecessary complexity and surprise for users.

I think we will very likely need to require that a non-weak definition of the type is seen before any weak definition. Otherwise we risk the possibility that the weak definition of the type is used in some way before we see the "real" definition, which will lead to serious problems. For example, the compiler could compute the struct's layout based on the first (weak) definition, then later find that it should discard that definition, but it might be too late: we might have already made hard-to-undo decisions based on that information. Discarding a weak definition when we've already seen a non-weak one doesn't have this kind of problem.

If that makes the use of a compiler-based approach untenable for your use case, you should look hard for other alternatives. Asking us to support replacing the definition of a type after the fact would be imposing a very high cost on us.

Thanks for the clarification. Yes, we became aware of this potential issue
during the discussion with Aaron. We are working on a POC now, and we
certainly don't want to make the implementation complicated. What you
mentioned totally makes sense, and we will keep it in mind. Thanks!