#define behavior

Here's an interesting question that occurred in one of the OCCC entries that I was compiling with clang. This program:

$ cat d.c
#define x =

int foo(int *k) {
   return *k *x 37;
}

Generates this:

$ clang -ast-print d.c -triple=i686-apple-darwin9
typedef char *__builtin_va_list;
d.c:4:14: error: expected expression
   return *k *x 37;
              ^

int foo(int *k) {
}

1 diagnostic generated.

GCC also balks at this. It appears that there's a space being inserted after the "*" and before the "=" that "x" becomes. Is this expected?

-bw

Bill Wendling wrote:-

> Here's an interesting question that occurred in one of the OCCC
> entries that I was compiling with clang. This program:

I would hope OCCC entries would be valid C.

> $ cat d.c
> #define x =
>
> int foo(int *k) {
>    return *k *x 37;
> }
>
> Generates this:
>
> $ clang -ast-print d.c -triple=i686-apple-darwin9
> typedef char *__builtin_va_list;
> d.c:4:14: error: expected expression
>    return *k *x 37;
>               ^
>
> int foo(int *k) {
> }
>
> 1 diagnostic generated.
>
> GCC also balks at this. It appears that there's a space being
> inserted after the "*" and before the "=" that "x" becomes. Is this
> expected?

Standard C works on tokens, not text, so the code is invalid as
* and = are two separate tokens.

Neil.

> > Here's an interesting question that occurred in one of the OCCC
> > entries that I was compiling with clang. This program:
>
> I would hope OCCC entries would be valid C.
>
> > $ cat d.c
> > #define x =
> >
> > int foo(int *k) {
> >    return *k *x 37;
> > }
> >
> > Generates this:
> >
> > $ clang -ast-print d.c -triple=i686-apple-darwin9
> > typedef char *__builtin_va_list;
> > d.c:4:14: error: expected expression
> >    return *k *x 37;
> >               ^
> >
> > int foo(int *k) {
> > }
> >
> > 1 diagnostic generated.
> >
> > GCC also balks at this. It appears that there's a space being
> > inserted after the "*" and before the "=" that "x" becomes. Is this
> > expected?
>
> Standard C works on tokens, not text, so the code is invalid as
> * and = are two separate tokens.

The preprocessor works on preprocessor tokens (translation phase 4), which
are different from the tokens the compiler is working on. The preprocessor
tokens are converted into tokens at translation phase 7 only. Any
preprocessor token not matching the syntax of a token is an error at this
point.

Further, IIUC concatenating two tokens (using ##) forming another (invalid)
pp token results in undefined behaviour in C90/C++ but is allowed in
C99/C++09. Gcc issues a warning here (I'm not sure about clang) for C90/C++.

Juxtaposing two preprocessor tokens generally doesn't create a new
preprocessor token. So the two preprocessor tokens '*' and '=' are converted
to tokens separately (in translation phase 7), resulting in the syntax error
above.

When the textual representation of the preprocessed input is generated
(e.g. by gcc -E), all good preprocessors _insert_ additional whitespace
between two preprocessor tokens that would otherwise form a different token
when juxtaposed, so that the printed text still lexes as the original
tokens.

IMHO, gcc and clang (and FWIW, wave) behave correctly.

Regards Hartmut

Hartmut Kaiser wrote:-

> > Standard C works on tokens, not text, so the code is invalid as
> > * and = are two separate tokens.
>
> The preprocessor works on preprocessor tokens (translation phase 4), which
> are different from the tokens the compiler is working on. The preprocessor
> tokens are converted into tokens at translation phase 7 only. Any
> preprocessor token not matching the syntax of a token is an error at this
> point.

Well, it is undefined behaviour. The standard has no concept of error per
se, only "must be diagnosed" and "failure to successfully translate",
the latter applying to #error exclusively.

> Further, IIUC concatenating two tokens (using ##) forming another (invalid)
> pp token results in undefined behaviour in C90/C++ but is allowed in
> C99/C++09. Gcc issues a warning here (I'm not sure about clang) for C90/C++.

There is no difference between C90 and C99 here. I'm the person that
made it a hard error in GCC; it is not a warning. That is and was my
preference. :-)

> Juxtaposing two preprocessor tokens generally doesn't create a new
> preprocessor token. So the two preprocessor tokens '*' and '=' are converted
> to tokens separately (in translation phase 7), resulting in the syntax error
> above.
>
> When the textual representation of the preprocessed input is generated
> (e.g. by gcc -E), all good preprocessors _insert_ additional whitespace
> between two preprocessor tokens that would otherwise form a different token
> when juxtaposed, so that the printed text still lexes as the original
> tokens.
>
> IMHO, gcc and clang (and FWIW, wave) behave correctly.

Absolutely. I was just expressing it in plain English.

Neil.

Neil,

> > > Standard C works on tokens, not text, so the code is invalid as
> > > * and = are two separate tokens.
> >
> > The preprocessor works on preprocessor tokens (translation phase 4),
> > which are different from the tokens the compiler is working on. The
> > preprocessor tokens are converted into tokens at translation phase 7
> > only. Any preprocessor token not matching the syntax of a token is an
> > error at this point.
>
> Well, it is undefined behaviour. The standard has no concept of
> error per se, only "must be diagnosed" and "failure to
> successfully translate", the latter applying to #error exclusively.

Thanks for clarifying this.

> > Further, IIUC concatenating two tokens (using ##) forming another
> > (invalid) pp token results in undefined behaviour in C90/C++ but is
> > allowed in C99/C++09. Gcc issues a warning here (I'm not sure about
> > clang) for C90/C++.
>
> There is no difference between C90 and C99 here. I'm the
> person that made it a hard error in GCC; it is not a warning.
> That is and was my preference. :-)

Ok, but it is allowed in C99, and undefined behaviour in C++98. Admittedly,
I was not sure about C90.

Regards Hartmut

Hartmut Kaiser wrote:-

> > > Further, IIUC concatenating two tokens (using ##) forming another
> > > (invalid) pp token results in undefined behaviour in C90/C++ but is
> > > allowed in C99/C++09. Gcc issues a warning here (I'm not sure about
> > > clang) for C90/C++.
> >
> > There is no difference between C90 and C99 here. I'm the
> > person that made it a hard error in GCC; it is not a warning.
> > That is and was my preference. :-)
>
> Ok, but it is allowed in C99, and undefined behaviour in C++98. Admittedly,
> I was not sure about C90.

I don't understand what you mean by "is allowed". An implementation
can choose to make undefined behaviour allowed in any particular
case, but invalid pasting is not required to be accepted by any of
the standards; it is undefined in all (or can you quote
something to the contrary?)

C++'s preprocessor is essentially C90's with a couple of tweaks
for booleans and UCNs; nothing else changes.

Neil.

Neil,

> > > > Further, IIUC concatenating two tokens (using ##) forming another
> > > > (invalid) pp token results in undefined behaviour in C90/C++ but
> > > > is allowed in C99/C++09. Gcc issues a warning here (I'm not sure
> > > > about clang) for C90/C++.
> > >
> > > There is no difference between C90 and C99 here. I'm the person
> > > that made it a hard error in GCC; it is not a warning.
> > > That is and was my preference. :-)
> >
> > Ok, but it is allowed in C99, and undefined behaviour in C++98.
> > Admittedly, I was not sure about C90.
>
> I don't understand what you mean by "is allowed". An
> implementation can choose to make undefined behaviour allowed
> in any particular case, but invalid pasting is not required
> to be accepted by any of the standards; it is undefined in
> all (or can you quote something to the contrary?)

I had to re-read both standards, and you're right. Sorry for the noise;
somehow I was thinking it was not undefined behaviour in C99.

> C++'s preprocessor is essentially C90's with a couple of tweaks
> for booleans and UCNs; nothing else changes.

Ok, good to know. I don't have a C90 standards document available.

Regards Hartmut