Function pointer is compile-time constant when cast to long but not int?

Hi,

While trying to build some fairly low-level code from an embedded
platform, I came across a case where a function pointer was cast to
int and used in a structure initializer. It turns out clang and gcc both
reject this, but not if it's cast to long (see shell transcript
below). I realize this is undefined behaviour, but is there a reason
to handle the two casts differently?

Best Regards
Magnus Reftel

$ cat fp_int.c
struct s {
        long i;
};
void f(void);
static struct s my_s = {(int)f};
$ clang -Wall -Werror -c fp_int.c -Wno-pointer-to-int-cast -Wno-unused-variable
fp_int.c:5:24: error: initializer element is not a compile-time constant
static struct s my_s = {(int)f};
                       ^~~~~~~~
1 error generated.
$ cat fp_long.c
struct s {
        long i;
};
void f(void);
static struct s my_s = {(long)f};
$ clang -Wall -Werror -c fp_long.c -Wno-pointer-to-int-cast -Wno-unused-variable
$

I assume that sizeof(int) < sizeof(void(*)()) == sizeof(long) on
your target. The problem is that the tool chain almost certainly
can't express a truncated address as a relocation.
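
If you want to confirm that assumption on your own target, a small
check along these lines may help. It is only a sketch: the function
name is made up for illustration, and static_assert needs a C11
compiler.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>   /* intptr_t */

/* Compile-time check: intptr_t is at least as wide as an object pointer.
   This holds on common targets; the build fails here if it does not. */
static_assert(sizeof(intptr_t) >= sizeof(void *),
              "intptr_t is narrower than an object pointer");

/* Hypothetical helper: returns 1 when int is narrower than a function
   pointer, i.e. when a cast to int would have to drop address bits. */
int int_is_narrower_than_fn_ptr(void)
{
    return sizeof(int) < sizeof(void (*)(void));
}
```

When the helper returns 1, the (int) cast in fp_int.c has to drop
address bits, which is the case discussed here.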

C only requires the implementation to support initializer values that
are either (1) constant binary data, (2) the address of some object, or
(3) an offset added to the address of some object. We're allowed,
but not required, to support more esoteric things like subtracting two
addresses or multiplying an address by a constant or, as in your
case, truncating the top bits of an address away. That kind of
calculation would require support from the entire tool chain from
assembler to loader, including various file formats along the way.
That support generally doesn't exist.
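
To make the three required forms concrete, here is a rough sketch. All
names are invented for illustration. The last initializer is not one of
the required forms; it is the full-width cast that clang and gcc accept
on common targets, presumably because it maps onto an ordinary
pointer-sized relocation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
void f(void) {}
static int table[8];

static long     k    = 42;           /* (1) constant binary data            */
static int     *base = table;        /* (2) the address of some object      */
static int     *elem = &table[3];    /* (3) an address plus a fixed offset  */
static intptr_t addr = (intptr_t)f;  /* full-width cast: not required by C,
                                        but accepted by clang and gcc when
                                        intptr_t is as wide as a pointer    */

/* Returns 1 if every static initializer above produced the expected value. */
int check_initializers(void)
{
    assert(k == 42);
    return base == table && elem == table + 3 && addr == (intptr_t)f;
}
```

Changing the last cast to (int) on a target where int is narrower than
a pointer is what triggers the error in the transcript above.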

For example, I don't believe that any of the common object file
formats support a target-independent "take the value of this symbol,
truncate it down to 16 or 32 bits, and store it here" relocation. So
the only way we could implement that kind of thing would be to
emit code to do it at load-time. C++ lets us do that, but C really
discourages it.
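
If you really do need the truncated value in C, the usual workaround is
to do the computation yourself at run time. The sketch below assumes
you have some early startup hook to call it from; init_my_s and
my_s_value are invented names, and the result of the narrowing cast is
implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

struct s {
        long i;
};

void f(void) {}

/* No address-dependent initializer: the object starts out zeroed. */
static struct s my_s;

/* Hypothetical startup hook: performs the truncating cast at run time,
   where it is an ordinary conversion and needs no relocation. Going
   through intptr_t first avoids a direct pointer-to-int cast. */
void init_my_s(void)
{
        my_s.i = (int)(intptr_t)f;
}

/* Accessor that refuses to run before init_my_s() has been called,
   assuming the truncated address is never zero. */
long my_s_value(void)
{
        assert(my_s.i != 0 && "call init_my_s() first");
        return my_s.i;
}
```

The cost is that my_s can no longer live in read-only data and that
something has to call init_my_s() before first use.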

Some targets do provide very specific masking relocations. SPARC,
for example, has relocations which work with only the bottom 10 bits
of an address; they're useful for modifying instructions to load
address constants. But compilers don't usually expose this to users
because it would be really weird: (((int) &x) & 0x3ff) would be a legal
constant initializer but (((int) &x) & 0x1ff) would not? Who would
benefit from that who wouldn't be better off just writing assembly?

John.