New powerpc vdso calling convention

As noted in the 'scv' thread, powerpc's vdso calling convention does not
match the C ELF ABI calling convention (or the proposed scv convention).
I think we could implement a new ABI by basically duplicating function
entry points with different names.

The ELF v2 ABI convention would suit it well, because the caller already
requires the function address for ctr, so having it in r12 will
eliminate the need for address calculation, which suits the vdso data
page access.

Is there a need for ELF v1 specific calls as well, or could those just be
deprecated and remain on existing functions or required to use the ELF
v2 calls using asm wrappers?

Is there a good reason for the system call fallback to go in the vdso
function rather than have the caller handle it?

Thanks,
Nick

> As noted in the 'scv' thread, powerpc's vdso calling convention does not
> match the C ELF ABI calling convention (or the proposed scv convention).
> I think we could implement a new ABI by basically duplicating function
> entry points with different names.
>
> The ELF v2 ABI convention would suit it well, because the caller already
> requires the function address for ctr, so having it in r12 will
> eliminate the need for address calculation, which suits the vdso data
> page access.
>
> Is there a need for ELF v1 specific calls as well, or could those just be
> deprecated and remain on existing functions or required to use the ELF
> v2 calls using asm wrappers?

musl doesn't use ELFv1, but my expectation would be for the kernel to
provide an ELFv1 VDSO to an ELFv1 process. (I'm pretty sure the kernel
has to be aware of this property of the process-image (executable
file) since it affects how signals work.)

> Is there a good reason for the system call fallback to go in the vdso
> function rather than have the caller handle it?

Originally it was deemed the vdso's responsibility to do fallback, but
MIPS broke this contract so musl always makes a syscall itself if the
vdso function returns -ENOSYS. I believe it honors other errors. We
could change it to fall back on all errors if needed. I'm not sure what
glibc does here.
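
For illustration, a minimal sketch of the caller-side fallback described above. This is not musl's actual code: the names cgt_with_fallback, vdso_cgt_fn and the two fake_vdso_* stand-ins are hypothetical, and it assumes a vdso entry that already returns 0 or a negative errno value.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical pointer type for a vdso clock_gettime entry that follows
   the C calling convention: returns 0 or a negative errno value. */
typedef int (*vdso_cgt_fn)(clockid_t, struct timespec *);

/* Call the vdso entry if there is one; make the real system call only
   when there is no entry or it reports -ENOSYS. Other errors are
   passed through unchanged. Returns 0 or a negative errno value. */
static int cgt_with_fallback(vdso_cgt_fn vdso, clockid_t clk,
                             struct timespec *ts)
{
    if (vdso) {
        int r = vdso(clk, ts);
        if (r != -ENOSYS)
            return r;
    }
    if (syscall(SYS_clock_gettime, clk, ts) == -1)
        return -errno;
    return 0;
}

/* Stand-ins for a vdso entry, used only to exercise the wrapper. */
static int fake_vdso_enosys(clockid_t clk, struct timespec *ts)
{
    (void)clk; (void)ts;
    return -ENOSYS;
}

static int fake_vdso_einval(clockid_t clk, struct timespec *ts)
{
    (void)clk; (void)ts;
    return -EINVAL;
}
```

Passing fake_vdso_enosys takes the syscall path, while fake_vdso_einval shows an error other than -ENOSYS being honored without a second attempt.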

Rich

> As noted in the 'scv' thread, powerpc's vdso calling convention does not
> match the C ELF ABI calling convention (or the proposed scv convention).
> I think we could implement a new ABI by basically duplicating function
> entry points with different names.

I think doing this is a real good idea.

I've been working on porting the powerpc VDSO to the generic C VDSO, and the main pitfall has been that our vdso calling convention is not compatible with the C calling convention, so we have to go through an ASM entry/exit.

See https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=171469

We should kill this error flag return through CR[SO] and do it the "modern" way like other architectures implementing the C VDSO: return 0 when successful, return -err when failed.
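
To make the difference concrete, here is a small model of folding the flag-based return into a single value. It is only a sketch: struct so_result and so_to_c_return are hypothetical names, and it assumes the existing entry points behave like the sc system call convention (a positive errno in r3 when CR0[SO] is set).

```c
#include <errno.h>
#include <stdbool.h>

/* Model of the existing convention: the value comes back in r3 and
   CR0[SO] is set on failure, in which case r3 holds a positive errno.
   (Assumption: same as the sc system call convention.) */
struct so_result {
    long r3;
    bool so;
};

/* Fold the pair into the single-value C convention: the value itself
   on success, -errno on failure. */
static long so_to_c_return(struct so_result res)
{
    return res.so ? -res.r3 : res.r3;
}
```

With the new convention the C code can return this value directly, so no ASM exit path is needed to set or clear SO.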

> The ELF v2 ABI convention would suit it well, because the caller already
> requires the function address for ctr, so having it in r12 will
> eliminate the need for address calculation, which suits the vdso data
> page access.
>
> Is there a need for ELF v1 specific calls as well, or could those just be
> deprecated and remain on existing functions or required to use the ELF
> v2 calls using asm wrappers?

What are ELF v1 and ELF v2? Is ELF v1 what PPC32 uses? If so, I'd say yes, it would be good to have it to avoid going through ASM in the middle.

> Is there a good reason for the system call fallback to go in the vdso
> function rather than have the caller handle it?

I've seen at least one while porting powerpc to the C VDSO: arguments to VDSO functions are in volatile registers. If the caller has to call the fallback by itself, it has to save them before calling the VDSO, although in 99% of cases it won't use them again. With the fallback called by the VDSO itself, the arguments are still hot in volatile registers and ready for calling the fallback. That makes it very easy to call, see patch 5 in the series (https://patchwork.ozlabs.org/project/linuxppc-dev/patch/59bea35725ab4cefc67a678577da8b3ab7771af5.1587401492.git.christophe.leroy@c-s.fr/)

> Thanks,
> Nick

Christophe

Excerpts from Christophe Leroy's message of April 25, 2020 5:47 pm:

As noted in the 'scv' thread, powerpc's vdso calling convention does not
match the C ELF ABI calling convention (or the proposed scv convention).
I think we could implement a new ABI by basically duplicating function
entry points with different names.

I think doing this is a real good idea.

I've been working at porting powerpc VDSO to the GENERIC C VDSO, and the
main pitfall has been that our vdso calling convention is not compatible
with C calling convention, so we have to go through an ASM entry/exit.

See https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=171469

We should kill this error flag return through CR[SO] and do it the
"modern" way like other architectures implementing the C VDSO: return
0 when successful, return -err when failed.

Agreed.

The ELF v2 ABI convention would suit it well, because the caller already
requires the function address for ctr, so having it in r12 will
eliminate the need for address calculation, which suits the vdso data
page access.

Is there a need for ELF v1 specific calls as well, or could those just be
deprecated and remain on existing functions or required to use the ELF
v2 calls using asm wrappers?

What's ELF v1 and ELF v2 ? Is ELF v1 what PPC32 uses ? If so, I'd say
yes, it would be good to have it to avoid going through ASM in the middle.

I'm not sure about PPC32. On PPC64, ELFv2 functions must be called with
their address in r12 if called at their global entry point. ELFv1
functions have a function descriptor with the call address and TOC in
it, and the caller has to load the TOC if the call is global.

The vdso doesn't have TOC, it has one global address (the vdso data
page) which it loads by calculating its own address.

The kernel doesn't change the vdso based on whether it's called by a v1
or v2 userspace (it doesn't really know itself and would have to export
different functions). glibc has a hack to create something:

# define VDSO_IFUNC_RET(value) \
  ({ \
    static Elf64_FuncDesc vdso_opd = { .fd_toc = ~0x0 }; \
    vdso_opd.fd_func = (Elf64_Addr)value; \
    &vdso_opd; \
  })

If we could make something which links more like any other dso with
ELFv1, that would be good. Otherwise I think v2 is preferable so it
doesn't have to calculate its own address.
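
As a rough model of what that hack builds, the descriptor can be sketched in portable C. The field names follow the Elf64_FuncDesc use in the glibc macro above; struct func_desc and make_vdso_desc are hypothetical names for illustration, not glibc or kernel code.

```c
#include <stdint.h>

/* Layout of a PPC64 ELFv1 function descriptor (OPD entry). */
struct func_desc {
    uint64_t fd_func;  /* address of the first instruction */
    uint64_t fd_toc;   /* TOC base the callee would expect in r2 */
    uint64_t fd_aux;   /* environment pointer, unused by C code */
};

/* Wrap a bare entry address in a descriptor, as the glibc hack does.
   The TOC is a non-zero dummy because the vdso code never uses it. */
static struct func_desc make_vdso_desc(uint64_t entry)
{
    struct func_desc d = {
        .fd_func = entry,
        .fd_toc  = ~UINT64_C(0),
        .fd_aux  = 0,
    };
    return d;
}
```

An ELFv1 caller branches to fd_func after loading r2 from fd_toc, which is why a bare vdso entry address cannot be handed to it directly.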

Is there a good reason for the system call fallback to go in the vdso
function rather than have the caller handle it?

I've seen at least one while porting powerpc to the C VDSO: arguments
toward VDSO functions are in volatile registers. If the caller has to
call the fallback by itself, it has to save them before calling the
VDSO, although in 99% of cases it won't use them again. With the
fallback called by the VDSO itself, the arguments are still hot in
volatile registers and ready for calling the fallback. That makes it very
easy to call them, see patch 5 in the series
(https://patchwork.ozlabs.org/project/linuxppc-dev/patch/59bea35725ab4cefc67a678577da8b3ab7771af5.1587401492.git.christophe.leroy@c-s.fr/)

I see. Well the kernel can probably patch in sc or scv depending on
which is supported, so we could keep the automatic fallback.

Thanks,
Nick

I see the following in glibc. So it looks like PPC32 is like PPC64 ELFv1. By the way, they are talking about something not completely finished in the kernel. Can we finish it?

#if (defined(__PPC64__) || defined(__powerpc64__)) && _CALL_ELF != 2
/* The correct solution is for _dl_vdso_vsym to return the address of the OPD
    for the kernel VDSO function. That address would then be stored in the
    __vdso_* variables and returned as the result of the IFUNC resolver function.
    Yet, the kernel does not contain any OPD entries for the VDSO functions
    (incomplete implementation). However, PLT relocations for IFUNCs still expect
    the address of an OPD to be returned from the IFUNC resolver function (since
    PLT entries on PPC64 are just copies of OPDs). The solution for now is to
    create an artificial static OPD for each VDSO function returned by a resolver
    function. The TOC value is set to a non-zero value to avoid triggering lazy
    symbol resolution via .glink0/.plt0 for a zero TOC (requires thread-safe PLT
    sequences) when the dynamic linker isn't prepared for it e.g. RTLD_NOW. None
    of the kernel VDSO routines use the TOC or AUX values so any non-zero value
    will work. Note that function pointer comparisons will not use this artificial
    static OPD since those are resolved via ADDR64 relocations and will point at
    the non-IFUNC default OPD for the symbol. Lastly, because the IFUNC relocations
    are processed immediately at startup the resolver functions and this code need
    not be thread-safe, but if the caller writes to a PLT slot it must do so in a
    thread-safe manner with all the required barriers. */
#define VDSO_IFUNC_RET(value) \
   ({ \
     static Elf64_FuncDesc vdso_opd = { .fd_toc = ~0x0 }; \
     vdso_opd.fd_func = (Elf64_Addr)value; \
     &vdso_opd; \
   })
#else
#define VDSO_IFUNC_RET(value) ((void *) (value))
#endif

Christophe

>> The ELF v2 ABI convention would suit it well, because the caller already
>> requires the function address for ctr, so having it in r12 will
>> eliminate the need for address calculation, which suits the vdso data
>> page access.
>>
>> Is there a need for ELF v1 specific calls as well, or could those just be
>> deprecated and remain on existing functions or required to use the ELF
>> v2 calls using asm wrappers?
>
> What's ELF v1 and ELF v2 ? Is ELF v1 what PPC32 uses ? If so, I'd say
> yes, it would be good to have it to avoid going through ASM in the middle.

I'm not sure about PPC32. On PPC64, ELFv2 functions must be called with
their address in r12 if called at their global entry point. ELFv1 have a
function descriptor with call address and TOC in it, caller has to load
the TOC if it's global.

The vdso doesn't have TOC, it has one global address (the vdso data
page) which it loads by calculating its own address.

A function descriptor could be put in the VDSO data page, or as it's
done now by glibc the vdso linkage code could create it. My leaning is
to at least have a version of the code that's callable (with the right
descriptor around it) by v1 binaries, but since musl does not use
ELFv1 at all we really have no stake in this and I'm fine with
whatever outcome users of v1 decide on.

The kernel doesn't change the vdso based on whether it's called by a v1
or v2 userspace (it doesn't really know itself and would have to export
different functions). glibc has a hack to create something:

I'm pretty sure it does know because signal invocation has to know
whether the function pointer points to a descriptor or code. At least
for FDPIC archs (similar to PPC64 ELFv1 function descriptors) it knows
and has to know.

>> Is there a good reason for the system call fallback to go in the vdso
>> function rather than have the caller handle it?
>
> I've seen at least one while porting powerpc to the C VDSO: arguments
> toward VDSO functions are in volatile registers. If the caller has to
> call the fallback by itself, it has to save them before calling the
> VDSO, although in 99% of cases it won't use them again. With the
> fallback called by the VDSO itself, the arguments are still hot in
> volatile registers and ready for calling the fallback. That makes it very
> easy to call them, see patch 5 in the series
> (https://patchwork.ozlabs.org/project/linuxppc-dev/patch/59bea35725ab4cefc67a678577da8b3ab7771af5.1587401492.git.christophe.leroy@c-s.fr/)

This is actually a good reason not to spuriously fail and fall back. At
present musl wouldn't take advantage of it because musl uses the
fallback path for lazy initialization of the vdso function pointer and
doesn't special-case the MIPS badness, but if it made a big difference
we probably could shuffle things around to only do the fallback on
archs that need it and avoid saving the input arg registers across the
vdso call.
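
A sketch of the lazy-initialization pattern referred to here, in generic C11 rather than musl's actual source. All names (gettime_fn, lookup_vdso, init_entry, get_time, do_syscall) are hypothetical, and the lookup is hard-wired to fail so the fallback path is visible.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

typedef int (*gettime_fn)(long *out);

static int calls_to_lookup;          /* counts lookups, for illustration */

/* Hypothetical stand-in for the vdso symbol lookup; returns NULL to
   model a kernel that exports no usable entry. */
static gettime_fn lookup_vdso(void)
{
    calls_to_lookup++;
    return NULL;
}

/* Stand-in for the real system call. */
static int do_syscall(long *out)
{
    *out = 42;
    return 0;
}

static int init_entry(long *out);

/* Starts out pointing at the initializer, which replaces itself. */
static _Atomic(gettime_fn) entry = init_entry;

static int init_entry(long *out)
{
    gettime_fn f = lookup_vdso();
    atomic_store(&entry, f);         /* NULL means "no vdso" from now on */
    return f ? f(out) : -ENOSYS;     /* -ENOSYS sends the caller to the syscall */
}

static int get_time(long *out)
{
    gettime_fn f = atomic_load(&entry);
    if (f) {
        int r = f(out);
        if (r != -ENOSYS)
            return r;
    }
    return do_syscall(out);          /* needs the original argument again */
}
```

Because get_time may still need its argument after the indirect call returns, the argument has to survive that call, which is the register-saving cost discussed above.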

Rich

Excerpts from Christophe Leroy's message of April 25, 2020 10:20 pm:

Excerpts from Christophe Leroy's message of April 25, 2020 5:47 pm:

As noted in the 'scv' thread, powerpc's vdso calling convention does not
match the C ELF ABI calling convention (or the proposed scv convention).
I think we could implement a new ABI by basically duplicating function
entry points with different names.

I think doing this is a real good idea.

I've been working at porting powerpc VDSO to the GENERIC C VDSO, and the
main pitfall has been that our vdso calling convention is not compatible
with C calling convention, so we have to go through an ASM entry/exit.

See https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=171469

We should kill this error flag return through CR[SO] and do it the
"modern" way like other architectures implementing the C VDSO: return
0 when successful, return -err when failed.

Agreed.

The ELF v2 ABI convention would suit it well, because the caller already
requires the function address for ctr, so having it in r12 will
eliminate the need for address calculation, which suits the vdso data
page access.

Is there a need for ELF v1 specific calls as well, or could those just be
deprecated and remain on existing functions or required to use the ELF
v2 calls using asm wrappers?

What's ELF v1 and ELF v2 ? Is ELF v1 what PPC32 uses ? If so, I'd say
yes, it would be good to have it to avoid going through ASM in the middle.

I'm not sure about PPC32. On PPC64, ELFv2 functions must be called with
their address in r12 if called at their global entry point. ELFv1 have a
function descriptor with call address and TOC in it, caller has to load
the TOC if it's global.

The vdso doesn't have TOC, it has one global address (the vdso data
page) which it loads by calculating its own address.

The kernel doesn't change the vdso based on whether it's called by a v1
or v2 userspace (it doesn't really know itself and would have to export
different functions). glibc has a hack to create something:

# define VDSO_IFUNC_RET(value) \
   ({ \
     static Elf64_FuncDesc vdso_opd = { .fd_toc = ~0x0 }; \
     vdso_opd.fd_func = (Elf64_Addr)value; \
     &vdso_opd; \
   })

If we could make something which links more like any other dso with
ELFv1, that would be good. Otherwise I think v2 is preferable so it
doesn't have to calculate its own address.

I see the following in glibc. So looks like PPC32 is like PPC64 elfv1.
By the way, they are talking about something not completely finished in
the kernel. Can we finish it ?

Possibly can. It seems like a good idea to fix all loose ends if we are
going to add new versions. Will have to check with the toolchain people
to make sure we're doing the right thing.

Thanks,
Nick

Excerpts from Rich Felker's message of April 26, 2020 2:22 am:

>> The ELF v2 ABI convention would suit it well, because the caller already
>> requires the function address for ctr, so having it in r12 will
>> eliminate the need for address calculation, which suits the vdso data
>> page access.
>>
>> Is there a need for ELF v1 specific calls as well, or could those just be
>> deprecated and remain on existing functions or required to use the ELF
>> v2 calls using asm wrappers?
>
> What's ELF v1 and ELF v2 ? Is ELF v1 what PPC32 uses ? If so, I'd say
> yes, it would be good to have it to avoid going through ASM in the middle.

I'm not sure about PPC32. On PPC64, ELFv2 functions must be called with
their address in r12 if called at their global entry point. ELFv1 have a
function descriptor with call address and TOC in it, caller has to load
the TOC if it's global.

The vdso doesn't have TOC, it has one global address (the vdso data
page) which it loads by calculating its own address.

A function descriptor could be put in the VDSO data page, or as it's
done now by glibc the vdso linkage code could create it. My leaning is
to at least have a version of the code that's callable (with the right
descriptor around it) by v1 binaries, but since musl does not use
ELFv1 at all we really have no stake in this and I'm fine with
whatever outcome users of v1 decide on.

I agree, I think it would be good to make it look as much like a normal
function as possible.

The kernel doesn't change the vdso based on whether it's called by a v1
or v2 userspace (it doesn't really know itself and would have to export
different functions). glibc has a hack to create something:

I'm pretty sure it does know because signal invocation has to know
whether the function pointer points to a descriptor or code. At least
for FDPIC archs (similar to PPC64 ELFv1 function descriptors) it knows
and has to know.

It knows on a per-executable basis (by looking at the ELF header). It
doesn't know per-system though so we can't patch the vdso accordingly.
But we could include both sets of entry points and map in the
appropriate one at exec time I think.
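
For reference, the per-executable check amounts to reading the ABI version from the low two bits of e_flags in the ELF header. A minimal sketch follows; the function and macro names are hypothetical, while the constants (EM_PPC64 is 21, ABI mask is 3) are from the ELF specifications.

```c
#include <stdint.h>

#define EM_PPC64_MACHINE 21u   /* EM_PPC64 in <elf.h> */
#define PPC64_ABI_MASK   3u    /* low two bits of e_flags */

/* Returns the low two bits of e_flags for a PPC64 executable:
   1 for ELFv1, 2 for ELFv2, 0 when unspecified.
   Returns -1 for a non-PPC64 machine. */
static int ppc64_elf_abi(uint16_t e_machine, uint32_t e_flags)
{
    if (e_machine != EM_PPC64_MACHINE)
        return -1;
    return (int)(e_flags & PPC64_ABI_MASK);
}
```

A loader could use a result of 2 to pick the ELFv2 set of entry points at exec time and the descriptor-based set otherwise.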

>> Is there a good reason for the system call fallback to go in the vdso
>> function rather than have the caller handle it?
>
> I've seen at least one while porting powerpc to the C VDSO: arguments
> toward VDSO functions are in volatile registers. If the caller has to
> call the fallback by itself, it has to save them before calling the
> VDSO, although in 99% of cases it won't use them again. With the
> fallback called by the VDSO itself, the arguments are still hot in
> volatile registers and ready for calling the fallback. That makes it very
> easy to call them, see patch 5 in the series
> (https://patchwork.ozlabs.org/project/linuxppc-dev/patch/59bea35725ab4cefc67a678577da8b3ab7771af5.1587401492.git.christophe.leroy@c-s.fr/)

This is actually a good reason not to spuriously fail and fall back. At
present musl wouldn't take advantage of it because musl uses the
fallback path for lazy initialization of the vdso function pointer and
doesn't special-case the MIPS badness, but if it made a big difference
we probably could shuffle things around to only do the fallback on
archs that need it and avoid saving the input arg registers across the
vdso call.

It's a point for it yes. I don't know if any libc or app would want to
instrument it or do special accounting or something for system calls.

Thanks,
Nick

"ELFv1" and "ELFv2" are PPC64-specific names for the old and new
version of the ELF psABI for PPC64. They have nothing at all to do
with PPC32 which is a completely different ABI from either.

Rich

Excerpts from Rich Felker's message of April 26, 2020 9:11 am:

Excerpts from Christophe Leroy's message of April 25, 2020 10:20 pm:
>
>
>> Excerpts from Christophe Leroy's message of April 25, 2020 5:47 pm:
>>>
>>>
>>>> As noted in the 'scv' thread, powerpc's vdso calling convention does not
>>>> match the C ELF ABI calling convention (or the proposed scv convention).
>>>> I think we could implement a new ABI by basically duplicating function
>>>> entry points with different names.
>>>
>>> I think doing this is a real good idea.
>>>
>>> I've been working at porting powerpc VDSO to the GENERIC C VDSO, and the
>>> main pitfall has been that our vdso calling convention is not compatible
>>> with C calling convention, so we have to go through an ASM entry/exit.
>>>
>>> See https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=171469
>>>
>>> We should kill this error flag return through CR[SO] and do it the
>>> "modern" way like other architectures implementing the C VDSO: return
>>> 0 when successful, return -err when failed.
>>
>> Agreed.
>>
>>>> The ELF v2 ABI convention would suit it well, because the caller already
>>>> requires the function address for ctr, so having it in r12 will
>>>> eliminate the need for address calculation, which suits the vdso data
>>>> page access.
>>>>
>>>> Is there a need for ELF v1 specific calls as well, or could those just be
>>>> deprecated and remain on existing functions or required to use the ELF
>>>> v2 calls using asm wrappers?
>>>
>>> What's ELF v1 and ELF v2 ? Is ELF v1 what PPC32 uses ? If so, I'd say
>>> yes, it would be good to have it to avoid going through ASM in the middle.
>>
>> I'm not sure about PPC32. On PPC64, ELFv2 functions must be called with
>> their address in r12 if called at their global entry point. ELFv1 have a
>> function descriptor with call address and TOC in it, caller has to load
>> the TOC if it's global.
>>
>> The vdso doesn't have TOC, it has one global address (the vdso data
>> page) which it loads by calculating its own address.
>>
>> The kernel doesn't change the vdso based on whether it's called by a v1
>> or v2 userspace (it doesn't really know itself and would have to export
>> different functions). glibc has a hack to create something:
>>
>> # define VDSO_IFUNC_RET(value) \
>>   ({ \
>>     static Elf64_FuncDesc vdso_opd = { .fd_toc = ~0x0 }; \
>>     vdso_opd.fd_func = (Elf64_Addr)value; \
>>     &vdso_opd; \
>>   })
>>
>> If we could make something which links more like any other dso with
>> ELFv1, that would be good. Otherwise I think v2 is preferable so it
>> doesn't have to calculate its own address.
>
> I see the following in glibc. So looks like PPC32 is like PPC64 elfv1.
> By the way, they are talking about something not completely finished in
> the kernel. Can we finish it ?

Possibly can. It seems like a good idea to fix all loose ends if we are
going to add new versions. Will have to check with the toolchain people
to make sure we're doing the right thing.

"ELFv1" and "ELFv2" are PPC64-specific names for the old and new
version of the ELF psABI for PPC64. They have nothing at all to do
with PPC32 which is a completely different ABI from either.

Right, I'm just talking about those comments -- it seems like the kernel
vdso should contain an .opd section with function descriptors in it for
elfv1 calls, rather than the hack it has now of creating one in the
caller's .data section.

But all that function descriptor code is gated by

#if (defined(__PPC64__) || defined(__powerpc64__)) && _CALL_ELF != 2

So it seems PPC32 does not use function descriptors but a direct pointer
to the entry point like PPC64 with ELFv2.

Thanks,
Nick

Yes, this hack is only for ELFv1. The missing OPD has not been an issue
for glibc because it has been using inline assembly to emulate the
function call since the initial vDSO support (INTERNAL_VSYSCALL_CALL_TYPE).
It only became an issue when I added an ifunc optimization to
gettimeofday so it can bypass libc.so and make the PLT branch to the vDSO
directly.

Recently, during some y2038 refactoring, it was suggested to get rid of this
and make gettimeofday call clock_gettime regardless. But some felt that
the performance degradation was not worth it for a symbol that is still used
extensively, so we stuck with the hack.

And I think having this synthetic opd entry is not an issue, since for
full relro the program's will be used and correctly set as read-only.
The issue is more for glibc itself, and I wouldn't mind just removing the
gettimeofday and time optimizations and using the default vDSO support
(which might increase function latency though).

As Rich has put it, it would be simpler for the powerpc vDSO symbols
to have default function call semantics so we could issue a function
call directly. But for powerpc64, glibc will need to continue to
support this non-standard call on older kernels, and I am not sure if
adding new symbols with different semantics will help much.

glibc already hides this powerpc semantic behind INTERNAL_VSYSCALL_CALL_TYPE,
so internally all syscalls are assumed to have the new semantic (-errno
on error, 0 on success). Adding another ELFv1 variant would require
more logic to handle multiple symbol versions for vDSO setup
(sysdeps/unix/sysv/linux/dl-vdso-setup.h), which would most likely
require an arch-specific implementation to handle it.

Excerpts from Adhemerval Zanella's message of April 27, 2020 11:09 pm:

Excerpts from Rich Felker's message of April 26, 2020 9:11 am:

Excerpts from Christophe Leroy's message of April 25, 2020 10:20 pm:

Excerpts from Christophe Leroy's message of April 25, 2020 5:47 pm:

As noted in the 'scv' thread, powerpc's vdso calling convention does not
match the C ELF ABI calling convention (or the proposed scv convention).
I think we could implement a new ABI by basically duplicating function
entry points with different names.

I think doing this is a real good idea.

I've been working at porting powerpc VDSO to the GENERIC C VDSO, and the
main pitfall has been that our vdso calling convention is not compatible
with C calling convention, so we have to go through an ASM entry/exit.

See https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=171469

We should kill this error flag return through CR[SO] and do it the
"modern" way like other architectures implementing the C VDSO: return
0 when successful, return -err when failed.

Agreed.

The ELF v2 ABI convention would suit it well, because the caller already
requires the function address for ctr, so having it in r12 will
eliminate the need for address calculation, which suits the vdso data
page access.

Is there a need for ELF v1 specific calls as well, or could those just be
deprecated and remain on existing functions or required to use the ELF
v2 calls using asm wrappers?

What's ELF v1 and ELF v2 ? Is ELF v1 what PPC32 uses ? If so, I'd say
yes, it would be good to have it to avoid going through ASM in the middle.

I'm not sure about PPC32. On PPC64, ELFv2 functions must be called with
their address in r12 if called at their global entry point. ELFv1 have a
function descriptor with call address and TOC in it, caller has to load
the TOC if it's global.

The vdso doesn't have TOC, it has one global address (the vdso data
page) which it loads by calculating its own address.

The kernel doesn't change the vdso based on whether it's called by a v1
or v2 userspace (it doesn't really know itself and would have to export
different functions). glibc has a hack to create something:

# define VDSO_IFUNC_RET(value) \
   ({ \
     static Elf64_FuncDesc vdso_opd = { .fd_toc = ~0x0 }; \
     vdso_opd.fd_func = (Elf64_Addr)value; \
     &vdso_opd; \
   })

If we could make something which links more like any other dso with
ELFv1, that would be good. Otherwise I think v2 is preferable so it
doesn't have to calculate its own address.

I see the following in glibc. So looks like PPC32 is like PPC64 elfv1.
By the way, they are talking about something not completely finished in
the kernel. Can we finish it ?

Possibly can. It seems like a good idea to fix all loose ends if we are
going to add new versions. Will have to check with the toolchain people
to make sure we're doing the right thing.

"ELFv1" and "ELFv2" are PPC64-specific names for the old and new
version of the ELF psABI for PPC64. They have nothing at all to do
with PPC32 which is a completely different ABI from either.

Right, I'm just talking about those comments -- it seems like the kernel
vdso should contain an .opd section with function descriptors in it for
elfv1 calls, rather than the hack it has now of creating one in the
caller's .data section.

But all that function descriptor code is gated by

#if (defined(__PPC64__) || defined(__powerpc64__)) && _CALL_ELF != 2

So it seems PPC32 does not use function descriptors but a direct pointer
to the entry point like PPC64 with ELFv2.

Yes, this hack is only for ELFv1. The missing OPD has not been an issue
for glibc because it has been using inline assembly to emulate the
function call since the initial vDSO support (INTERNAL_VSYSCALL_CALL_TYPE).
It only became an issue when I added an ifunc optimization to
gettimeofday so it can bypass libc.so and make the PLT branch to the vDSO
directly.

I can't understand if it's actually a problem for you or not.

Regardless if you can hack around it, it seems to me that if we're going
to add sane calling conventions to the vdso, then we should also just
have a .opd section for it as well, whether or not a particular libc
requires it.

Recently, during some y2038 refactoring, it was suggested to get rid of this
and make gettimeofday call clock_gettime regardless. But some felt that
the performance degradation was not worth it for a symbol that is still used
extensively, so we stuck with the hack.

And I think having this synthetic opd entry is not an issue, since for
full relro the program's will be used and correctly set as read-only.

I'm not quite sure what this means, I don't really know how glibc ifunc
works. How do you set r2 if you have no opd?

The issue is more for glibc itself, and I wouldn't mind just removing the
gettimeofday and time optimizations and using the default vDSO support
(which might increase function latency though).

As Rich has put it, it would be simpler for the powerpc vDSO symbols
to have default function call semantics so we could issue a function
call directly. But for powerpc64, glibc will need to continue to
support this non-standard call on older kernels, and I am not sure if
adding new symbols with different semantics will help much.

Yeah, we will add entry points with default function call semantics.
At which point we make the things look like any other dso unless there
is good reason otherwise.

glibc already hides this powerpc semantic behind INTERNAL_VSYSCALL_CALL_TYPE,
so internally all syscalls are assumed to have the new semantic (-errno
on error, 0 on success). Adding another ELFv1 variant would require
more logic to handle multiple symbol versions for vDSO setup
(sysdeps/unix/sysv/linux/dl-vdso-setup.h), which would most likely
require an arch-specific implementation to handle it.

Is it not a build-time choice? The arch can set its own vdso symbol
names AFAIKS.

Thanks,
Nick

Excerpts from Adhemerval Zanella's message of April 27, 2020 11:09 pm:

Excerpts from Rich Felker's message of April 26, 2020 9:11 am:

Excerpts from Christophe Leroy's message of April 25, 2020 10:20 pm:

Excerpts from Christophe Leroy's message of April 25, 2020 5:47 pm:

As noted in the 'scv' thread, powerpc's vdso calling convention does not
match the C ELF ABI calling convention (or the proposed scv convention).
I think we could implement a new ABI by basically duplicating function
entry points with different names.

I think doing this is a real good idea.

I've been working at porting powerpc VDSO to the GENERIC C VDSO, and the
main pitfall has been that our vdso calling convention is not compatible
with C calling convention, so we have go through an ASM entry/exit.

See https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=171469

We should kill this error flag return through CR[SO] and get it the
"modern" way like other architectectures implementing the C VDSO: return
0 when successfull, return -err when failed.

Agreed.

The ELF v2 ABI convention would suit it well, because the caller already
requires the function address for ctr, so having it in r12 will
eliminate the need for address calculation, which suits the vdso data
page access.

Is there a need for ELF v1 specific calls as well, or could those just be
deprecated and remain on existing functions or required to use the ELF
v2 calls using asm wrappers?

What's ELF v1 and ELF v2 ? Is ELF v1 what PPC32 uses ? If so, I'd say
yes, it would be good to have it to avoid going through ASM in the middle.

I'm not sure about PPC32. On PPC64, ELFv2 functions must be called with
their address in r12 if called at their global entry point. ELFv1 have a
function descriptor with call address and TOC in it, caller has to load
the TOC if it's global.

The vdso doesn't have TOC, it has one global address (the vdso data
page) which it loads by calculating its own address.

The kernel doesn't change the vdso based on whether it's called by a v1
or v2 userspace (it doesn't really know itself and would have to export
different functions). glibc has a hack to create something:

# define VDSO_IFUNC_RET(value) \
   ({ \
     static Elf64_FuncDesc vdso_opd = { .fd_toc = ~0x0 }; \
     vdso_opd.fd_func = (Elf64_Addr)value; \
     &vdso_opd; \
   })

If we could make something which links more like any other dso with
ELFv1, that would be good. Otherwise I think v2 is preferable so it
doesn't have to calculate its own address.

I see the following in glibc. So looks like PPC32 is like PPC64 elfv1.
By the way, they are talking about something not completely finished in
the kernel. Can we finish it ?

Possibly can. It seems like a good idea to fix all loose ends if we are
going to add new versions. Will have to check with the toolchain people
to make sure we're doing the right thing.

"ELFv1" and "ELFv2" are PPC64-specific names for the old and new
version of the ELF psABI for PPC64. They have nothing at all to do
with PPC32 which is a completely different ABI from either.

Right, I'm just talking about those comments -- it seems like the kernel
vdso should contain an .opd section with function descriptors in it for
elfv1 calls, rather than the hack it has now of creating one in the
caller's .data section.

But all that function descriptor code is gated by

#if (defined(__PPC64__) || defined(__powerpc64__)) && _CALL_ELF != 2

So it seems PPC32 does not use function descriptors but a direct pointer
to the entry point like PPC64 with ELFv2.
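
To make the effect of that preprocessor condition concrete, here is a small
sketch that classifies what a resolved vDSO symbol pointer represents on the
target being compiled for. The enum and function names are hypothetical;
only the #if condition is taken from the glibc code quoted above.

```c
#include <assert.h>

enum vdso_ptr_kind { VDSO_PTR_ENTRY_POINT, VDSO_PTR_DESCRIPTOR };

/* Same condition as the glibc guard: only PPC64 ELFv1 uses descriptors.
 * PPC32 and PPC64 ELFv2 (and every other target) use a plain entry point. */
#if (defined(__PPC64__) || defined(__powerpc64__)) && _CALL_ELF != 2
# define VDSO_PTR_KIND VDSO_PTR_DESCRIPTOR
#else
# define VDSO_PTR_KIND VDSO_PTR_ENTRY_POINT
#endif

static enum vdso_ptr_kind current_vdso_ptr_kind(void)
{
    return VDSO_PTR_KIND;
}

static const char *vdso_ptr_kind_name(enum vdso_ptr_kind k)
{
    assert(k == VDSO_PTR_ENTRY_POINT || k == VDSO_PTR_DESCRIPTOR);
    return k == VDSO_PTR_DESCRIPTOR ? "function descriptor" : "entry point";
}
```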

Yes, this hack is only for ELFv1. The missing OPD has not been an issue
for glibc because it has been using inline assembly to emulate the
function call since initial vDSO support (INTERNAL_VSYSCALL_CALL_TYPE).
It only became an issue when I added an ifunc optimization to
gettimeofday so it can bypass libc.so and make the PLT branch to the vDSO
directly.

I can't understand if it's actually a problem for you or not.

Regardless if you can hack around it, it seems to me that if we're going
to add sane calling conventions to the vdso, then we should also just
have a .opd section for it as well, whether or not a particular libc
requires it.

The main problem for glibc is the complication of having to handle two
different calling conventions, especially if the kernel starts to provide
new vDSO symbols with only the new semantics.

But I think it is doable, it will require some internal tinkering on
how to handle vDSO (to indicate which mechanism to use) which will
most likely be powerpc specific.
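
One way such internal tinkering could look, purely as a sketch: resolve the
new-convention symbol first, fall back to the legacy name, and record which
convention the resulting pointer needs. All names below are hypothetical,
including the lookup callback; the only real identifier is the existing
"__kernel_clock_gettime" symbol name. No name for a new symbol has been
agreed in this thread, so the caller passes one in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum vdso_conv { VDSO_NONE, VDSO_LEGACY_CR_SO, VDSO_C_ABI };

struct vdso_entry {
    void *addr;          /* resolved symbol address, or NULL */
    enum vdso_conv conv; /* how the address must be called */
};

/* Hypothetical lookup callback: returns the symbol address or NULL. */
typedef void *(*vdso_lookup_fn)(const char *name);

static struct vdso_entry resolve_vdso(vdso_lookup_fn lookup,
                                      const char *new_name,
                                      const char *legacy_name)
{
    struct vdso_entry e = { NULL, VDSO_NONE };

    assert(lookup != NULL);
    if ((e.addr = lookup(new_name)) != NULL)
        e.conv = VDSO_C_ABI;           /* plain C call, -errno return */
    else if ((e.addr = lookup(legacy_name)) != NULL)
        e.conv = VDSO_LEGACY_CR_SO;    /* needs the asm wrapper */
    return e;
}

/* Models an older kernel that exports only the legacy symbol. */
static int legacy_symbol_storage;
static void *old_kernel_lookup(const char *name)
{
    if (strcmp(name, "__kernel_clock_gettime") == 0)
        return &legacy_symbol_storage;
    return NULL;
}
```

The call site then dispatches on conv, so the legacy wrapper is only needed
when running on a kernel without the new entry points.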

Recently, during some y2038 refactoring, it was suggested to get rid of this
and make gettimeofday call clock_gettime regardless. But some felt that
the performance degradation was not worth it for a symbol that is still used
extensively, so we stuck with the hack.

And I think having this synthetic opd entry is not an issue, since for
full relro the program's will be used and correctly set as read-only.

I'm not quite sure what this means, I don't really know how glibc ifunc
works. How do you set r2 if you have no opd?

IFUNC itself is not an issue here, since it is just a dynamic relocation that
instructs the dynamic linker to call a defined function that provides the
actual symbol. The problem is symbol resolution for a kernel vDSO symbol,
which returns a pointer to the text segment instead of the expected OPD
entry.

And currently glibc assumes that the kernel vDSO does not use TOC or AUX,
so it sets a bogus value (~0x0) just to avoid triggering lazy resolution
in some cases. That makes sense with the current contract that vDSO calls
should behave like syscalls, but it lessens the flexibility of the kernel
implementation.

The issue is more for glibc itself, and I wouldn't mind just removing the
gettimeofday and time optimizations and using the default vDSO support
(which might increase function latency, though).

As Rich has put it, it would be simpler for the powerpc vDSO symbols
to have default function call semantics so we could issue a function
call directly. But for powerpc64, glibc will need to continue to
support this non-standard call on older kernels, and I am not sure if
adding new symbols with different semantics will help much.

Yeah, we will add entry points with default function call semantics.
At which point we make the things look like any other dso unless there
is good reason otherwise.

I think the move to make the vDSO have the same semantics as a usual DSO is
the correct one. I am just pointing out that, unlike musl, glibc
already supports the vDSO for powerpc, and changing its interface will most
likely require more handling in the powerpc-specific bits.

glibc already hides this powerpc semantic behind INTERNAL_VSYSCALL_CALL_TYPE,
so internally all syscalls are assumed to have the new semantic (-errno
on error, 0 on success). Adding another ELFv1 variant would require
more logic to handle multiple symbol versions in the vDSO setup
(sysdeps/unix/sysv/linux/dl-vdso-setup.h), which would most likely
require an arch-specific implementation.

Is it not a build-time choice? The arch can set its own vdso symbol
names AFAIKS.

To enable vDSO support, the architecture just needs to define the
corresponding macros with the expected names. For instance, for powerpc:

sysdeps/unix/sysv/linux/powerpc/sysdep.h
[...]
#if defined(__PPC64__) || defined(__powerpc64__)
#define HAVE_CLOCK_GETRES64_VSYSCALL "__kernel_clock_getres"
#define HAVE_CLOCK_GETTIME64_VSYSCALL "__kernel_clock_gettime"
#else
#define HAVE_CLOCK_GETRES_VSYSCALL "__kernel_clock_getres"
#define HAVE_CLOCK_GETTIME_VSYSCALL "__kernel_clock_gettime"
#endif
#define HAVE_GETCPU_VSYSCALL "__kernel_getcpu"
#define HAVE_TIME_VSYSCALL "__kernel_time"
#define HAVE_GETTIMEOFDAY_VSYSCALL "__kernel_gettimeofday"
#define HAVE_GET_TBFREQ "__kernel_get_tbfreq"
[...]

glibc will create and initialize the vDSO pointers in an arch-neutral
way; however, the vDSO call itself is parametrized to handle the
powerpc-specific bits (the INTERNAL_VSYSCALL_CALL_TYPE which is called
by INLINE_SYSCALL_CALL).

Rich Felker <dalias@libc.org> writes:

Hi!

Excerpts from Adhemerval Zanella's message of April 27, 2020 11:09 pm:
>> Right, I'm just talking about those comments -- it seems like the kernel
>> vdso should contain an .opd section with function descriptors in it for
>> elfv1 calls, rather than the hack it has now of creating one in the
>> caller's .data section.
>>
>> But all that function descriptor code is gated by
>>
>> #if (defined(__PPC64__) || defined(__powerpc64__)) && _CALL_ELF != 2
>>
>> So it seems PPC32 does not use function descriptors but a direct pointer
>> to the entry point like PPC64 with ELFv2.
>
> Yes, this hack is only for ELFv1. The missing OPD has not been an issue
> for glibc because it has been using inline assembly to emulate the
> function call since initial vDSO support (INTERNAL_VSYSCALL_CALL_TYPE).
> It only became an issue when I added an ifunc optimization to
> gettimeofday so it can bypass libc.so and make the PLT branch to the vDSO
> directly.

I can't understand if it's actually a problem for you or not.

Regardless if you can hack around it, it seems to me that if we're going
to add sane calling conventions to the vdso, then we should also just
have a .opd section for it as well, whether or not a particular libc
requires it.

An OPD ("official procedure descriptor") is required for every function,
to have proper C semantics, so that pointers to functions (which are
pointers to descriptors, in fact) are unique. You can "manually" make
descriptors just fine, and use those to call functions -- but you cannot
(in general) use a pointer to such a "fake" descriptor as the "id" of
the function.

The way the ABIs define the OPDs makes them guaranteed unique.
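
The uniqueness point can be illustrated with a small portable model. This is
a sketch under stated assumptions: struct func_desc, official_opd and the
helper functions are hypothetical names, and the "official" descriptor simply
stands for the single one the ABI would place in .opd. A caller-made
descriptor calls the same code, but its address differs, so it cannot serve
as the identity of the function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct func_desc {
    void (*entry)(void);
    long toc;
    long env;
};

static void target(void) { }

/* Stands for the one descriptor the ABI guarantees per function. */
static const struct func_desc official_opd = { target, 0, 0 };

/* A "manual" descriptor, in the spirit of the synthetic one with a bogus
 * TOC value discussed in this thread. */
static struct func_desc make_fake_desc(void)
{
    struct func_desc d = { target, ~0L, 0 };
    return d;
}

/* Under ELFv1 a function pointer is a pointer to a descriptor, so function
 * pointer equality is descriptor address equality. */
static bool same_function_pointer(const struct func_desc *a,
                                  const struct func_desc *b)
{
    assert(a != NULL && b != NULL);
    return a == b;
}

/* Both descriptors still reach the same code. */
static bool same_entry(const struct func_desc *a, const struct func_desc *b)
{
    assert(a != NULL && b != NULL);
    return a->entry == b->entry;
}
```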

Segher