K&R style argument lists and the type system

Consider the following testcase:
int a() {return 0;}
int b() {return a(1);}

Calling b has undefined behavior per C99, but no diagnostics are required.

Consider the following testcase:
int a(int);
int a() {return 0;}
This is a constraint violation per C99.

We currently map a() to FunctionTypeNoProto, and this has roughly the
right behavior when we don't merge types. However, by adding merge
types, we conclude that a() has type a(int), which is clearly wrong,
and leads to crashes trying to access non-existent parameter
declarations. But if we instead map a() to a FunctionTypeProto with
no parameters, we incorrectly error out on the first example.

And actually, we already have this problem for cases like the
following, which are less common:
int a(x,y) int x,y; {return x+y;}
int b() {return a(1,2,3);}

Any ideas for how to solve this? We could always map definitions with
identifier lists to FunctionTypeNoProto, but that means a good chunk
of additional code to check type-merging doesn't do bad stuff to a
definition and allow CodeGen to synthesize the type of such a function
definition properly. Another possibility would be to add a flag to
FunctionTypeProto which signifies whether the type comes from an
identifier list; however, I have no idea where to add such a flag.

-Eli

> Consider the following testcase:
> int a() {return 0;}
> int b() {return a(1);}
>
> Calling b has undefined behavior per C99, but no diagnostics are required.
>
> Consider the following testcase:
> int a(int);
> int a() {return 0;}
> This is a constraint violation per C99.
>
> We currently map a() to FunctionTypeNoProto, and this has roughly the
> right behavior when we don't merge types. However, by adding merge
> types, we conclude that a() has type a(int), which is clearly wrong,
> and leads to crashes trying to access non-existent parameter
> declarations. But if we instead map a() to a FunctionTypeProto with
> no parameters, we incorrectly error out on the first example.
>
> And actually, we already have this problem for cases like the
> following, which are less common:
> int a(x,y) int x,y; {return x+y;}
> int b() {return a(1,2,3);}
>
> Any ideas for how to solve this? We could always map definitions with
> identifier lists to FunctionTypeNoProto, but that means a good chunk
> of additional code to check type-merging doesn't do bad stuff to a
> definition and allow CodeGen to synthesize the type of such a function
> definition properly.

This makes more sense to me, since there is no prototype for a(x,y).

I think it's better to keep the old style function definition separate from the modern syntax/semantics...

snaroff

Eli Friedman wrote:-

> Consider the following testcase:
> int a() {return 0;}
> int b() {return a(1);}
>
> Calling b has undefined behavior per C99, but no diagnostics are required.
>
> Consider the following testcase:
> int a(int);
> int a() {return 0;}
> This is a constraint violation per C99.
>
> We currently map a() to FunctionTypeNoProto, and this has roughly the
> right behavior when we don't merge types. However, by adding merge
> types, we conclude that a() has type a(int), which is clearly wrong,
> and leads to crashes trying to access non-existent parameter
> declarations. But if we instead map a() to a FunctionTypeProto with
> no parameters, we incorrectly error out on the first example.

What type are you giving a()? Since it doesn't have a well-defined
type according to the standard, I would have thought "erroneous type"
or somesuch would be appropriate. Assuming you don't wish to give
meaning to all erroneous constructs (some you might as an extension
or relaxation of course) those without meaning must be flagged as
not having concrete meaning in some way to prevent later bogus
diagnostics and analyses, if nothing else.

Neil.

The issue is that we aren't detecting it as an error at the moment,
and it's not completely clear where the best place is to check for it.

-Eli

Bleh, I think we have to encode the exact signature into the type
system somehow. Consider the following testcase:

int a(x) float x; {return x;}
int b(int x) {return x;}
int c(int x) {return (x ? a : b)(1);}

This is a constraint violation per C99, which would be completely
unintuitive to diagnose without encoding the fact that "a" takes float
into the type.

gcc apparently screws up this case; it doesn't print a warning even
with -std=c99 -pedantic. Although, the standard is a bit screwy here.

-Eli

Are you sure this is a constraint violation? Of what rule?

-Chris

> Consider the following testcase:
> int a() {return 0;}
> int b() {return a(1);}
>
> Calling b has undefined behavior per C99, but no diagnostics are required.

Right. I think that we should type functions with no arguments or with an identifier list as FunctionTypeNoProto *always*. To issue the (not required but preferred) diagnostic in this case, we should do two checks:

1) is the callee compatible with the argument list (yes)
2) if the callee has a FunctionTypeNoProto type, and if its body is available, look at the body to see how many arguments it really takes and whether that matches the call site (no)
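
A minimal sketch of these two checks, written as a toy model in C. The names FnType, FnDecl and check_call are hypothetical and are not the front end's actual classes; the model only counts arguments and ignores parameter types and variadic prototypes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the front end's data structures (hypothetical names). */
struct FnType {
    bool has_proto;   /* false models FunctionTypeNoProto */
    int  num_params;  /* meaningful only when has_proto is true */
};

struct FnDecl {
    struct FnType type;
    bool has_body;         /* is a definition visible? */
    int  body_num_params;  /* parameters named by the definition */
};

enum CallCheck { CALL_OK, CALL_ERROR, CALL_WARN_ARG_COUNT };

/* Check 1: the type alone. A no-prototype type accepts any argument list. */
static bool type_accepts(const struct FnType *t, int num_args) {
    return !t->has_proto || t->num_params == num_args;
}

/* Check 2: the optional diagnostic, using the body when one is available. */
static enum CallCheck check_call(const struct FnDecl *callee, int num_args) {
    assert(callee != NULL && num_args >= 0);
    if (!type_accepts(&callee->type, num_args))
        return CALL_ERROR;            /* prototype and call disagree */
    if (!callee->type.has_proto && callee->has_body &&
        callee->body_num_params != num_args)
        return CALL_WARN_ARG_COUNT;   /* legal by type, wrong per the body */
    return CALL_OK;
}
```

For the testcase above, a() would be modeled as a no-prototype declaration whose body names zero parameters, so a call with one argument passes check 1 and trips check 2.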

> Consider the following testcase:
> int a(int);
> int a() {return 0;}
> This is a constraint violation per C99.

Right, the sequence of events should be:
1) parse the declaration, "a" gets type "int(int)"
2) parse the definition which has type "int()"
3) merge the two types, giving "int(int)"
4) since it is a definition, verify that the actual argument list matches the merged type.
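
The sequence above can be sketched with a toy model in C. The names FnType, merge_types and definition_matches are hypothetical, and only the parameter count is modeled (parameter types and variadic prototypes are left out), so this is an illustration of the ordering of the steps rather than of any real implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy type: has_proto == false models FunctionTypeNoProto. */
struct FnType {
    bool has_proto;
    int  num_params;  /* meaningful only when has_proto is true */
};

/* Two prototypes must agree; a no-prototype type is compatible with either. */
static bool types_compatible(struct FnType a, struct FnType b) {
    if (a.has_proto && b.has_proto)
        return a.num_params == b.num_params;
    return true;
}

/* Step 3: merge two compatible types; the prototype, if any, wins. */
static struct FnType merge_types(struct FnType a, struct FnType b) {
    assert(types_compatible(a, b));
    return a.has_proto ? a : b;
}

/* Step 4: for a definition, compare the parameters it actually names
 * against the merged type. */
static bool definition_matches(struct FnType merged, int def_num_params) {
    return !merged.has_proto || merged.num_params == def_num_params;
}
```

With the declaration modeled as a one-parameter prototype and the definition as a no-prototype type naming zero parameters, the merge yields the one-parameter prototype and step 4 reports the mismatch.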

> We currently map a() to FunctionTypeNoProto, and this has roughly the
> right behavior when we don't merge types. However, by adding merge
> types, we conclude that a() has type a(int), which is clearly wrong,
> and leads to crashes trying to access non-existent parameter
> declarations. But if we instead map a() to a FunctionTypeProto with
> no parameters, we incorrectly error out on the first example.

I think that the problem basically boils down to the fact that we handle these:

int a() {}
int b(x) int x; {}

as FunctionTypeProto instead of FunctionTypeNoProto. Even though we can "see" that 'a' takes no arguments, C doesn't let us take this into consideration for its type. These should both really be FunctionTypeNoProto.

> And actually, we already have this problem for cases like the
> following, which are less common:
> int a(x,y) int x,y; {return x+y;}
> int b() {return a(1,2,3);}
>
> Any ideas for how to solve this? We could always map definitions with
> identifier lists to FunctionTypeNoProto,

Yep, I agree with Steve that this is the best approach.

> but that means a good chunk
> of additional code to check type-merging doesn't do bad stuff to a
> definition and allow CodeGen to synthesize the type of such a function
> definition properly.

I'm not sure what you mean by this?

> Another possibility would be to add a flag to
> FunctionTypeProto which signifies whether the type comes from an
> identifier list; however, I have no idea where to add such a flag.

I think it would be better to make the compatibility check handle FunctionTypeNoProto to see if it has a definition. As "value add", it could look at the body to see if the body has arguments that are incompatible with a call.

-Chris

Chris Lattner wrote:-

> Bleh, I think we have to encode the exact signature into the type
> system somehow. Consider the following testcase:
>
> int a(x) float x; {return x;}
> int b(int x) {return x;}
> int c(int x) {return (x ? a : b)(1);}
>
> This is a constraint violation per C99, which would be completely
> unintuitive to diagnose without encoding the fact that "a" takes float
> into the type.
>
> gcc apparently screws up this case; it doesn't print a warning even
> with -std=c99 -pedantic. Although, the standard is a bit screwy here.

> Are you sure this is a constraint violation? Of what rule?

My front end gives

"/tmp/bug.c", line 3: error: expressions of types "int (*)()" and
  "int (*)(int)" cannot be used together in a conditional expression
int c(int x) {return (x ? a : b)(1);}
                            ^

Which maybe gives a clue :-)

Neil.

Ok, so this is simple and consistent. The type of 'a' is int().

Incidentally Neil, it is somewhat strange that you warn about types after promotions. The "true" part of the expression has type "int()" not "int(*)()". The user isn't using an expression of type "int(*)()" with the conditional expression, they are using a raw function.

-Chris

C99 6.7.5.3p15:
"For two function types to be compatible [...] If one type has a
parameter type list and the other type is specified by a function
definition that contains a (possibly empty) identifier list, both
shall agree in the number of parameters, and the type of each
prototype parameter shall be compatible with the type that results
from the application of the default argument promotions to the type of
the corresponding identifier."
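
As a rough illustration of the quoted rule, here is a toy model in C. The enum Ty, promote and proto_matches_kr_def names are hypothetical; "compatible" is simplified to exact equality, and only a few signed arithmetic types are modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, deliberately tiny set of parameter types. */
enum Ty { TY_CHAR, TY_SHORT, TY_INT, TY_FLOAT, TY_DOUBLE };

/* Default argument promotions for the modeled types:
 * small integers become int, float becomes double. */
static enum Ty promote(enum Ty t) {
    switch (t) {
    case TY_CHAR:
    case TY_SHORT:
        return TY_INT;
    case TY_FLOAT:
        return TY_DOUBLE;
    default:
        return t;
    }
}

/* Prototype vs. identifier-list definition: same parameter count, and each
 * prototype parameter equals the promoted type of the matching identifier. */
static bool proto_matches_kr_def(const enum Ty *proto, int n_proto,
                                 const enum Ty *ids, int n_ids) {
    assert(n_proto >= 0 && n_ids >= 0);
    if (n_proto != n_ids)
        return false;
    for (int i = 0; i < n_proto; i++)
        if (proto[i] != promote(ids[i]))
            return false;
    return true;
}
```

Under this model, a prototype taking int does not match a definition whose identifier is declared float, because float promotes to double; a prototype taking double would match.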

-Eli

Chris Lattner wrote:-

> > > Are you sure this is a constraint violation? Of what rule?
> >
> > My front end gives
> >
> > "/tmp/bug.c", line 3: error: expressions of types "int (*)()" and
> >   "int (*)(int)" cannot be used together in a conditional expression
> > int c(int x) {return (x ? a : b)(1);}
> >                             ^
> >
> > Which maybe gives a clue :-)
>
> Ok, so this is simple and consistent. The type of 'a' is int().
>
> Incidentally Neil, it is somewhat strange that you warn about types
> after promotions. The "true" part of the expression has type "int()"
> not "int(*)()". The user isn't using an expression of type "int(*)()"
> with the conditional expression, they are using a raw function.

I don't see any value in referring to original types and it may
even engender confusion with the user thinking the compiler is
giving an error because it is missing some of the necessary semantic
analysis rather than because there is a type mismatch.

Comeau gives a very similar diagnostic with the same types stated. The
decayed types are the ones that the semantics apply to.

Neil.