Are __builtin_setjmp / __builtin_longjmp part of the ABI?

From: "John McCall" <rjmccall@gmail.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Joerg Sonnenberger" <joerg@britannica.bec.de>, "<hans@hanshq.net>" <hans@hanshq.net>, "cfe-commits"
<cfe-commits@cs.uiuc.edu>
Sent: Tuesday, March 3, 2015 2:05:53 AM
Subject: Re: r230255 - Only lower __builtin_setjmp / __builtin_longjmp to

>
>> From: "Joerg Sonnenberger" <joerg@britannica.bec.de>
>> To: "John McCall" <rjmccall@gmail.com>
>> Cc: hans@hanshq.net, "cfe-commits" <cfe-commits@cs.uiuc.edu>
>> Sent: Tuesday, March 3, 2015 1:07:20 AM
>> Subject: Re: r230255 - Only lower __builtin_setjmp /
>> __builtin_longjmp to
>>
>>> This patch is pretty scary. __builtin_setjmp/longjmp are definitely not
>>> just libc functions with a __builtin_ prefix attached. They do not
>>> interoperate with setjmp/longjmp and expect a significantly smaller
>>> buffer, so silently rewriting them to setjmp/longjmp is ABI-breaking.
>>> This might fix Ruby, but only if Ruby is actually passing a full
>>> jmp_buf, and only if everything that does a
>>> __builtin_setjmp/__builtin_longjmp is recompiled in a way that does the
>>> rewrite. I'm very concerned about this introducing ABI problems for a
>>> Clang-compiled Ruby with GCC-compiled extensions or vice-versa. FWIW,
>>> Ruby seems to already have target-specific configuration logic for
>>> when to use them.
>>
>> Ruby has no target-specific configuration.

> Literally the first google result for "Ruby __builtin_setjmp" is a
> core commit turning it off for a target. Am I misunderstanding
> something?
>
> GCC implements this builtin with much more generality than we do,
> probably because GCC predates widespread adoption of libUnwind and
> we don't. Hal's comment about not trying to match GCC on PPC aside
> (and I find that comment pretty troubling!),

Hi John,

Fair enough; let's move this question of whether __builtin_setjmp / __builtin_longjmp are part of the ABI to a separate thread. When implementing these for the PowerPC backend, I followed the same scheme as is used by the X86 backend, and so while this is likely compatible with GCC, I did not consider GCC compatibility a specific goal. Speed and correctness were all I cared about.

My rationale for not considering them to be part of the ABI is that:

1. They don't appear in any system header
2. They don't appear in the system ABI documentation
3. They're not universally supported by all compilers targeting the architecture

Generically, our implementation of __builtin_* (__builtin_<math func> for example) differs in observable ways from GCC's implementation. This does mean that passing a jmp_buf filled in by __builtin_setjmp across an interface boundary is a bad idea, but I never would have considered that to be a good idea anyway (because you already need to know it can only be used with __builtin_longjmp and not longjmp).

To be clear, I do consider the format of jmp_buf used by setjmp/longjmp to be part of the ABI, but the GCC implementation of __builtin_setjmp / __builtin_longjmp, AFAIK, and our X86 implementation, differ from that (by design) anyway.
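
For illustration, a minimal C sketch of that difference. It assumes the five-word buffer that the GCC and LLVM documentation describe for these builtins; the type name builtin_jmp_buf and the helper names are made up for this example, no system header defines them, and the builtins are not available on every target.

```c
#include <setjmp.h>
#include <stddef.h>

/* Assumption: five pointer-sized words, per the GCC and LLVM docs.
   The name builtin_jmp_buf is hypothetical. */
typedef void *builtin_jmp_buf[5];

static builtin_jmp_buf env;

/* Buffer sizes, for comparison with the libc jmp_buf. */
size_t libc_buf_size(void)    { return sizeof(jmp_buf); }
size_t builtin_buf_size(void) { return sizeof(builtin_jmp_buf); }

/* The jump is kept in a separate function from the __builtin_setjmp
   call, which is the usage GCC documents. The second argument of
   __builtin_longjmp must be the constant 1. */
__attribute__((noinline)) static void jump_back(void)
{
    __builtin_longjmp(env, 1);
}

/* Returns 1 if control came back through __builtin_longjmp. */
int roundtrip(void)
{
    if (__builtin_setjmp(env) == 0) {
        jump_back();
        return 0; /* not reached */
    }
    return 1;
}
```

Comparing libc_buf_size() with builtin_buf_size() on a given platform shows how much smaller the builtin's buffer is than the libc jmp_buf there; a buffer sized for one API is not safe to hand to the other.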

Thanks again,
Hal

> From: "John McCall" <rjmccall@gmail.com>
> To: "Hal Finkel" <hfinkel@anl.gov>
> Cc: "Joerg Sonnenberger" <joerg@britannica.bec.de>, "<hans@hanshq.net>"
> <hans@hanshq.net>, "cfe-commits" <cfe-commits@cs.uiuc.edu>
> Sent: Tuesday, March 3, 2015 2:05:53 AM
> Subject: Re: r230255 - Only lower __builtin_setjmp / __builtin_longjmp to
>
> >
> >> From: "Joerg Sonnenberger" <joerg@britannica.bec.de>
> >> To: "John McCall" <rjmccall@gmail.com>
> >> Cc: hans@hanshq.net, "cfe-commits" <cfe-commits@cs.uiuc.edu>
> >> Sent: Tuesday, March 3, 2015 1:07:20 AM
> >> Subject: Re: r230255 - Only lower __builtin_setjmp /
> >> __builtin_longjmp to
> >>
> >>> This patch is pretty scary. __builtin_setjmp/longjmp are definitely not
> >>> just libc functions with a __builtin_ prefix attached. They do not
> >>> interoperate with setjmp/longjmp and expect a significantly smaller
> >>> buffer, so silently rewriting them to setjmp/longjmp is ABI-breaking.
> >>> This might fix Ruby, but only if Ruby is actually passing a full
> >>> jmp_buf, and only if everything that does a
> >>> __builtin_setjmp/__builtin_longjmp is recompiled in a way that does the
> >>> rewrite. I'm very concerned about this introducing ABI problems for a
> >>> Clang-compiled Ruby with GCC-compiled extensions or vice-versa. FWIW,
> >>> Ruby seems to already have target-specific configuration logic for
> >>> when to use them.
> >>
> >> Ruby has no target-specific configuration.
>
> Literally the first google result for "Ruby __builtin_setjmp" is a
> core commit turning it off for a target. Am I misunderstanding
> something?
>
> GCC implements this builtin with much more generality than we do,
> probably because GCC predates widespread adoption of libUnwind and
> we don't. Hal's comment about not trying to match GCC on PPC aside
> (and I find that comment pretty troubling!),

> Hi John,
>
> Fair enough; let's move this question of whether __builtin_setjmp /
> __builtin_longjmp are part of the ABI to a separate thread.

Thanks.

> When implementing these for the PowerPC backend, I followed the same
> scheme as is used by the X86 backend, and so while this is likely
> compatible with GCC, I did not consider GCC compatibility a specific goal.
> Speed and correctness were all I cared about.
>
> My rationale for not considering them to be part of the ABI is that:
>
> 1. They don't appear in any system header
> 2. They don't appear in the system ABI documentation
> 3. They're not universally supported by all compilers targeting the
> architecture

I think the crucial question is whether there's a reasonable assumption of
interoperation with the compilers that do implement them. I think there's
an expectation that these functions have a particular straightforward and
efficient implementation; lowering them to less-efficient and incompatible
functions that the user could simply have called themselves seems pointless.

> Generically, our implementation of __builtin_* (__builtin_<math func> for
> example) differs in observable ways from GCC's implementation.

That's true, but I think this is distinguishable.

> This does mean that passing a jmp_buf filled in by __builtin_setjmp across
> an interface boundary is a bad idea, but I never would have considered that
> to be a good idea anyway

It's necessary if you want to interoperate with e.g. an exception or
continuation mechanism that's specified to use __builtin_setjmp or
__builtin_longjmp. Most UNIX-like platforms these days use DWARF unwinding
for things like C++ exceptions, but (1) not all of them and (2) language
implementations besides C do exist, as the other thread attests. And it's
actually pretty tricky to hide __builtin_setjmp behind an interface
boundary, since the calling function can't exit without invalidating the
buffer; you have to do it with callbacks.
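
A rough sketch of that callback shape in C, for illustration only. All names here (run_protected, protect_escape, the demo bodies) are hypothetical, the five-word buffer size is an assumption taken from the GCC and LLVM documentation, and the single static buffer means this version does not nest.

```c
#include <stddef.h>

/* Assumption: a five-word buffer, as documented for the builtins. */
static void *protect_buf[5];

/* Counter used only by the demo bodies below. */
int demo_counter = 0;

/* Jumps back into run_protected. Only valid while run_protected is
   still active, since its frame owns the saved context. */
__attribute__((noinline, noreturn)) void protect_escape(void)
{
    __builtin_longjmp(protect_buf, 1);
}

/* The function that calls __builtin_setjmp stays on the stack for the
   whole protected region, so the work is passed in as a callback.
   Returns 0 if body returned normally, 1 if it escaped. */
int run_protected(void (*body)(void *), void *arg)
{
    if (__builtin_setjmp(protect_buf) != 0)
        return 1;
    body(arg);
    return 0;
}

/* Demo body that returns normally. */
void body_returns(void *arg)
{
    *(int *)arg += 1;
}

/* Demo body that bails out through protect_escape. */
void body_escapes(void *arg)
{
    *(int *)arg += 1;
    protect_escape();
}
```

A caller would write run_protected(body_escapes, &demo_counter) and check the return value. The buffer never leaves this file, but every compiler involved still has to agree on what __builtin_setjmp stores in it if body lives in separately compiled code that does its own __builtin_longjmp.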

It does seem that Ruby, at least, keeps this internal to its core
implementation.

> (because you already need to know it can only be used with
> __builtin_longjmp and not longjmp).

That's a general thing with setjmp APIs. There are actually three
different setjmp APIs in UNIX with buffer types that are typically
interchangeable as arguments if you ask the C type system.

In general, I think it's reasonable for users to have strong expectations
about __builtin_setjmp / __builtin_longjmp. Targets should either
implement this in a stable and ideally GCC-compatible way (and then
consider the layout of a __builtin_jmp_buf to be ABI, and maybe we should
put that in one of our compiler headers), or the frontend shouldn't claim
to support it. It's not like this is a core language feature; it's
completely acceptable to just not provide the builtins on certain targets
that don't support it.

Even if we didn't care about GCC compatibility, having a default lowering
to setjmp/longjmp makes it harder to adopt a better implementation in the
future just within Clang, because it would basically be an ABI break
between Clang v.X and Clang v.Y.

John.

From: "John McCall" <rjmccall@gmail.com>
To: "Hal Finkel" <hfinkel@anl.gov>
Cc: "Joerg Sonnenberger" <joerg@britannica.bec.de>, "hans"
<hans@hanshq.net>, "cfe-dev@cs.uiuc.edu Developers"
<cfe-dev@cs.uiuc.edu>
Sent: Tuesday, March 3, 2015 6:45:56 PM
Subject: Re: Are __builtin_setjmp / __builtin_longjmp part of the
ABI?

> > Hi John,
> >
> > Fair enough; let's move this question of whether __builtin_setjmp /
> > __builtin_longjmp are part of the ABI to a separate thread.

> Thanks.

> > When implementing these for the PowerPC backend, I followed the same
> > scheme as is used by the X86 backend, and so while this is likely
> > compatible with GCC, I did not consider GCC compatibility a specific
> > goal. Speed and correctness were all I cared about.
> >
> > My rationale for not considering them to be part of the ABI is that:
> >
> > 1. They don't appear in any system header
> > 2. They don't appear in the system ABI documentation
> > 3. They're not universally supported by all compilers targeting the
> > architecture

> I think the crucial question is whether there's a reasonable
> assumption of interoperation with the compilers that do implement
> them.

I don't think that there is, but in every use case I've come across, as you mention below with the Ruby case, the use of these builtins is only an implementation detail internal to some library. It is very possible that some might disagree, and if so, I want to know that.

> I think there's an expectation that these functions have a particular
> straightforward and efficient implementation; lowering them to
> less-efficient and incompatible functions that the user could simply
> have called themselves seems pointless.

I agree. And as you pointed out below, we should try to be ABI compatible with ourselves, and we might wish to provide an optimized implementation at some point in the future.

> > Generically, our implementation of __builtin_* (__builtin_<math func>
> > for example) differs in observable ways from GCC's implementation.

> That's true, but I think this is distinguishable.

Fair enough.

> > This does mean that passing a jmp_buf filled in by __builtin_setjmp
> > across an interface boundary is a bad idea, but I never would have
> > considered that to be a good idea anyway

> It's necessary if you want to interoperate with e.g. an exception or
> continuation mechanism that's specified to use __builtin_setjmp or
> __builtin_longjmp. Most UNIX-like platforms these days use DWARF
> unwinding for things like C++ exceptions, but (1) not all of them
> and

But when used for C++ exceptions, they definitely are part of the ABI. The real question is whether they are part of the ABI otherwise.

> (2) language implementations besides C do exist, as the other thread
> attests. And it's actually pretty tricky to hide __builtin_setjmp
> behind an interface boundary, since the calling function can't exit
> without invalidating the buffer; you have to do it with callbacks.
>
> It does seem that Ruby, at least, keeps this internal to its core
> implementation.

> > (because you already need to know it can only be used with
> > __builtin_longjmp and not longjmp).

> That's a general thing with setjmp APIs. There are actually three
> different setjmp APIs in UNIX with buffer types that are typically
> interchangeable as arguments if you ask the C type system.
>
> In general, I think it's reasonable for users to have strong
> expectations about __builtin_setjmp / __builtin_longjmp. Targets
> should either implement this in a stable and ideally GCC-compatible
> way (and then consider the layout of a __builtin_jmp_buf to be ABI,
> and maybe we should put that in one of our compiler headers)

Okay. I agree, at least, that we likely should not differ from GCC's implementation without some rationale.

> , or the frontend shouldn't claim to support it. It's not like this
> is a core language feature; it's completely acceptable to just not
> provide the builtins on certain targets that don't support it.

Agreed.

> Even if we didn't care about GCC compatibility, having a default
> lowering to setjmp/longjmp makes it harder to adopt a better
> implementation in the future just within Clang, because it would
> basically be an ABI break between Clang v.X and Clang v.Y.

Agreed.

Thanks again,
Hal